Case-Explainer Documentation
============================

**Case-Explainer** provides model-agnostic explanations through training set precedent and nearest neighbor correspondence.

.. image:: https://img.shields.io/badge/python-3.8+-blue.svg
   :target: https://www.python.org/downloads/
   :alt: Python 3.8+

.. image:: https://img.shields.io/badge/License-MIT-yellow.svg
   :target: https://opensource.org/licenses/MIT
   :alt: License: MIT

Overview
--------

While some explainability methods provide feature importance scores, case-based explainability answers: 
**"Why was this prediction made?"** by showing similar training examples.

Instead of: *"Feature X has importance 0.45"*  

You get: *"This sample is classified as X because it resembles these 5 training examples"*

Key Features
------------

* **Model-agnostic**: Works with any classifier (sklearn, XGBoost, neural networks, etc.)
* **Correspondence metric**: Quantifies agreement between prediction and neighbors
* **Multiple indexing strategies**: K-D Tree, Ball Tree, or brute force
* **Automatic scaling**: Optional feature standardization
* **Metadata tracking**: Attach provenance data to training samples
* **Sklearn-compatible API**: Familiar interface for ML practitioners
* **Batch explanations**: Explain multiple predictions efficiently

Quick Start
-----------

.. code-block:: python

   from case_explainer import CaseExplainer
   from sklearn.datasets import load_iris
   from sklearn.model_selection import train_test_split
   from sklearn.ensemble import RandomForestClassifier

   # Load data
   X, y = load_iris(return_X_y=True)
   X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

   # Train classifier
   clf = RandomForestClassifier()
   clf.fit(X_train, y_train)

   # Create explainer
   explainer = CaseExplainer(
       X_train=X_train,
       y_train=y_train,
       feature_names=['sepal_len', 'sepal_width', 'petal_len', 'petal_width'],
       algorithm='kd_tree'
   )

   # Explain a prediction
   explanation = explainer.explain_instance(X_test[0], k=5, model=clf)
   print(f"Correspondence: {explanation.correspondence:.2%}")
   print(explanation.summary())

Installation
------------

.. code-block:: bash

   # From source (current development version)
   cd case-explainer
   pip install -e .

   # Dependencies
   pip install numpy scipy scikit-learn matplotlib pandas

Performance
-----------

Validated across multiple domains (single runs on reference hardware):

* **Hardware Trojan Detection**: 99.9% average correspondence, 25.7 ms/sample
* **Credit Card Fraud Detection**: 100% average correspondence, 36.4 ms/sample
* **Medical Diagnosis (Breast Cancer)**: 93.3% average correspondence, 25.9 ms/sample
* **Scalability**: Tested up to 200k training samples

**Note on Correspondence**: This metric measures agreement between predictions and retrieved 
neighbors, not prediction accuracy or quality. High correspondence indicates consistency with 
training data patterns.

Contents
--------

.. toctree::
   :maxdepth: 2
   :caption: API Reference

   api/explainer
   api/explanation
   api/metrics

.. toctree::
   :maxdepth: 1
   :caption: Additional Information

   citation
   license

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`