Explainable AI Architectures:

Methods, Applications, Examples, and Results

Case Western Reserve University

Case School of Engineering

Electrical, Computer, and Systems Engineering

Outline

  • Introduction
  • Problem
  • Contributions
  • Background and Related Work
  • Property-Based Explainable (PBE) Method
    • PBE Handwritten Character Results
    • PBE Hardware Trojan Results
  • Case-Based Explainable (CBE) Method
    • CBE Handwritten Character Results
    • CBE Hardware Trojan Results
  • Conclusion
  • Future Work

Introduction

  • Artificial Intelligence (AI) and Machine Learning (ML) used widely
    Applications: Business, Medicine, Transportation
  • Lack of trust in AI - models are often an opaque box
  • Many AI systems cannot effectively explain or justify decisions
  • Explainable AI (XAI)

Problem

  • Neural Networks cannot explain their inferences
  • Therefore, there is a lack of trust in these systems
  • XAI systems attempt to explain the inferences of a trained NN and thereby increase trust in ML systems
  • An explanation should be in plain terms

Comparison

Contributions


  • PBE Method and Architecture
  • CBE Method and Architecture
  • Explainability metric - $Ex(c)$
  • Combining explainability with unexplainability
  • Metric for Effectiveness of a model - $E(j, c)$
  • A new metric for model performance - $E_{PARS}$
  • Communicating when a model can fail - FDR
  • Correspondence metric - $Corr$

Publications

Background and Related Work


AI Taxonomy - Capability and Functionality

AI Taxonomy - Algorithms and Architectures

Multi-layer Feed Forward Neural Network

Inference but no explanation

1995 - LeNet-5 CNN Neural Network

Major improvement in inference performance but still no explanation

2015 - ResNet

Outstanding performance but cannot explain results

XAI Research

  • 1999 - Case-Based Explanation of Non-Case-Based Learning Methods - Caruana et al.
  • 2018 - Explainable neural networks based on additive index models - Vaughan et al.

XAI Research - LIME

2016 - "Why Should I Trust You?" - Marco Tulio Ribeiro et al. - LIME


LIME superpixel mask for classification as a Bernese mountain dog


LIME on Handwritten Digits


XAI Research - Continued

  • 2017 - A Unified Approach to Interpreting Model Predictions - Lundberg et al. - SHAP

Confusion Matrix - One Versus Others


                  Predicted class
                   a    b    c    d
Actual class a    TN   FP   TN   TN
Actual class b    FN   TP   FN   FN
Actual class c    TN   FP   TN   TN
Actual class d    TN   FP   TN   TN

(Class b treated as the positive class.)

Legend
True Positives (TP)
True Negatives (TN)
False Positives (FP)
False Negatives (FN)
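The one-versus-others collapse above can be sketched in a few lines of Python (a minimal illustration; the function name and the sample matrix are my own, not from the slides):

```python
def one_vs_others(cm, pos):
    """Collapse a multiclass confusion matrix (rows = actual class,
    columns = predicted class) into TP/TN/FP/FN for one positive class."""
    n = len(cm)
    tp = cm[pos][pos]
    fn = sum(cm[pos][j] for j in range(n) if j != pos)   # missed positives
    fp = sum(cm[i][pos] for i in range(n) if i != pos)   # wrongly claimed
    tn = sum(cm[i][j] for i in range(n) for j in range(n)
             if i != pos and j != pos)
    return tp, tn, fp, fn

# Hypothetical 4x4 matrix for classes a, b, c, d; b (index 1) is positive
cm = [[10, 1, 0, 0],
      [2, 20, 1, 1],
      [0, 1, 12, 0],
      [0, 2, 0, 9]]
print(one_vs_others(cm, 1))  # (20, 31, 4, 4)
```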

Performance Metrics



  • Accuracy = $\frac{TP+TN}{TP+TN+FP+FN}$

  • Precision = $\frac{TP}{TP+FP}$

  • Recall = $\frac{TP}{TP+FN}$

  • Specificity = $\frac{TN}{TN+FP}$

  • False Discovery Rate (FDR) = $\frac{FP}{FP+TP}$

Imbalance Ratio (IR)

\[ IR = \frac{N_{maj}}{N_{min}} \]

If IR > 1 the data set is imbalanced
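The imbalance ratio can be computed directly from the class label counts (a sketch; the helper name and toy labels are assumptions):

```python
from collections import Counter

def imbalance_ratio(labels):
    """IR = N_majority / N_minority over the class label counts."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Toy labels: 6 non-trojan ('n') vs 2 trojan ('t') samples -> IR = 3.0
print(imbalance_ratio(list("nnnnnntt")))  # 3.0
```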


  • AUC
  • F1-Score
  • Cohen's Kappa
  • Matthew's Correlation Coefficient

Property-Based Explainable (PBE) Method

PBE Method


Intent: Produce a system that can explain decisions to a user in plain terms by reasoning about the system's decisions in relation to explainable properties.

Explainable Property: An attribute of an input sample that may differentiate between classes and provide rationale for a classification decision to a user.

Property Transform: a function that modifies an input sample to highlight or exemplify an explainable property in the resulting output.
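As a concrete sketch of one such transform, the Endpoints property can be extracted from an already-thinned binary stroke image by marking pixels with exactly one foreground 8-neighbor (pure-Python illustration; the function and toy image are my own, not the dissertation's implementation):

```python
def endpoints(img):
    """Endpoint property transform: given a binary stroke image that has
    already been thinned to one pixel wide, mark pixels with exactly one
    foreground 8-neighbour (i.e., stroke endpoints)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            nbrs = sum(img[y + dy][x + dx]
                       for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                       if (dy or dx)
                       and 0 <= y + dy < h and 0 <= x + dx < w)
            if nbrs == 1:
                out[y][x] = 1
    return out

# A thinned vertical stroke (a crude "1") has two endpoints
stroke = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
print(sum(map(sum, endpoints(stroke))))  # 2
```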

PBE Method - Steps

PBE Architecture - Goal of Method

PBE Properties and Transforms - MNIST


Property          Image Transform
Stroke            Skeleton
Circle            Hough Circle
Circle            Hough Ellipse
Circle            Multiple Circle and Ellipse
Crossings         Intersection
Endpoints         Endpoints
Enclosed Region   Flood Fill
Enclosed Region   Convex Hull
Line              Hough Line
Corner            Harris Corner

Transform Training Data


Train ML Models


Build Knowledgebase


Voting Scheme

  • Voting: Selecting among potentially conflicting opinions from inference engines.
  • Effectiveness: Characterizes how well an inference engine performs. The effectiveness of an inference engine, $j$, to correctly recognize an item of class $c$ is expressed as $E(j,c)$.

Voting Scheme - Continued



Weighted Effectiveness, $WE(c)$ for a class $c$ is the sum of effectiveness for all IEs, $j$, that voted for $c$

\[ WE(c)=\sum_j E(j, c) \]

Confidence, $Conf(c)$, for a class $c$ is the Weighted Effectiveness of $c$ over the sum of Weighted Effectiveness of all classes that were voted upon

\[ Conf(c)=\frac{WE(c)}{\sum\limits_kWE(k)} \]

Class $c$ with highest confidence wins
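The voting scheme above can be sketched as follows (the engine names and effectiveness values are hypothetical, chosen only to mirror the digit example later in the slides):

```python
def vote(votes, E):
    """Weighted-effectiveness voting.
    votes: {engine j: class c it voted for}
    E:     {(j, c): effectiveness E(j, c) of engine j on class c}
    Returns (winning class, {class: confidence})."""
    WE = {}
    for j, c in votes.items():                 # WE(c) = sum of E(j, c)
        WE[c] = WE.get(c, 0.0) + E[(j, c)]
    total = sum(WE.values())
    conf = {c: w / total for c, w in WE.items()}  # Conf(c) = WE(c)/sum WE(k)
    return max(conf, key=conf.get), conf

votes = {"stroke": "4", "line": "9", "circle": "0"}
E = {("stroke", "4"): 1.0, ("line", "9"): 0.496, ("circle", "0"): 0.039}
winner, conf = vote(votes, E)
print(winner, round(conf["4"], 3))
```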

Explainability

  • Some properties and transforms perform poorly on their own
  • Adding an unexplainable inference engine improves performance
  • Need a means of quantifying explainability, $Ex(c)$, for a class $c$
  • Each property transform, $j$, has an explainability metric $0 \le X_j \le 1$
\[ Ex(c)=\frac{\sum_j E(j,c)\,X_j}{\sum_j E(j,c)} \]
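The $Ex(c)$ computation is a small weighted average (sketch only; the voter list below uses the effectiveness values from the digit-four example later in the slides, with $X_j = 1$ for explainable transforms and $X_j = 0$ for the unexplainable model):

```python
def explainability(voters):
    """Ex(c): effectiveness-weighted average of the explainability
    scores X_j over the engines that voted for class c.
    voters: list of (E_jc, X_j) pairs."""
    num = sum(e * x for e, x in voters)   # sum E(j,c) * X_j
    den = sum(e for e, _ in voters)       # sum E(j,c)
    return num / den

# Four explainable transforms (X=1) plus one unexplainable model (X=0)
voters = [(1.0, 1.0), (0.974, 1.0), (0.826, 1.0), (0.538, 1.0), (1.0, 0.0)]
print(round(explainability(voters), 3))  # 0.769
```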

Explanation Routine - XAI Block

  • Assemble the textual rationale composed of:
    • The winning vote, with confidence and explainability
    • Alternatives voted for, with confidence and explainability
    • Common failures based on historical FDR
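The assembly of the textual rationale can be sketched as a formatting routine (illustrative only; the function signature and sentence templates are my own approximation of the responses shown later in the slides):

```python
def rationale(cls, conf, expl, fdr, mistakes):
    """Assemble a plain-language rationale for one candidate class.
    conf, expl, fdr are fractions in [0, 1]; mistakes maps a commonly
    confused class to its historical share of false discoveries."""
    level = "high" if conf >= 0.5 else "low"
    text = (f"Confidence is {level}, {conf:.1%}, for class {cls!r}. "
            f"Explainability was {expl:.1%}. "
            f"The FDR shows that when selecting {cls!r}, "
            f"{fdr:.1%} of the time we are incorrect.")
    for other, rate in mistakes.items():
        text += f" The input may instead be {other!r} ({rate:.1%} of cases)."
    return text

print(rationale("4", 0.87, 0.769, 0.019, {"9": 0.009, "7": 0.003}))
```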

Property-Based Explainability Results

Handwritten Character Datasets


Widely used for benchmarking ML architectures

  • MNIST - 70,000 decimal digit images
  • EMNIST - Over 800,000 digits, uppercase and lowercase characters

  • Dataset balanced among classes
  • Images, therefore high dimensionality (many features)

Property Based Architecture
MNIST Aggregate Results


Accuracy (%)

ML Model   1 Unexpl.   10 Expl.   10 Expl. + 1 Unexpl.
MLP           98.3       96.2           97.9
SVM           97.9       95.4           97.3
CNN           99.4       97.3           98.7
ResNet50      98.9       97.6           98.8

Average Explainability (%)

ML Model   1 Unexpl.   10 Expl.   10 Expl. + 1 Unexpl.
MLP            0.0       100            67.2
SVM            0.0       100            76.8
CNN            0.0       100            75.5
ResNet50       0.0       100            69.9

Unexpl. = unexplainable; Expl. = explainable

MNIST Explainable Results: Digit

Fj    Property    Vote   E(j, vote)   X_j
F1    Stroke        4      1.0        1.0
F2    Circle        0      0.039      1.0
F3    Crossing      0      0.018      1.0
F4    Ellipse       0      0.004      1.0
F5    Ell-Cir       0      0.069      1.0
F6    Endpoint      4      0.974      1.0
F7    Enc. Reg.     0      0.021      1.0
F8    Line          9      0.496      1.0
F9    Con. Hull     4      0.826      1.0
F10   Corner        4      0.538      1.0
F11   Unexp.        4      1.0        0.0

                       c = 0    c = 4    c = 9
$WE(c)$                0.151    4.337    0.496
$\sum{E(j,c)X_j}$      0.151    3.337    0.496
Confidence             3.03%    87.0%    9.96%
Explainability         100.0%   76.9%    100%

PBE - Response for Digit

  • Confidence is high, 87%, for interpreting this character as a four due to the stroke, endpoint, convex hull, and corner properties. Explainability was 76.9%. The FDR shows that when selecting a four, we are incorrect 1.9% of the time. The most frequent mistakes are that the digit is actually a nine (0.9% of cases) or a seven (0.3% of cases).

PBE - Alternatives for Digit

  • Confidence is low, 9.96%, for interpreting this character as a nine due to the line property. Explainability was 100%. The FDR shows that when selecting a nine, we are incorrect 2.6% of the time. The most frequent mistakes are that the digit is actually a four (1.4% of cases) or an eight (0.5% of cases).
  • Confidence is low, 3.03%, for interpreting this character as a zero due to the ellipse-circle, circle, fill, crossing, and ellipse properties. Explainability was 100%. The FDR shows that when selecting a zero, we are incorrect 1.4% of the time. The most frequent mistake is that the digit is actually an eight (0.6% of cases).

EMNIST Aggregate Results


Unexplainable Benchmark
Explainable
Explainable + Unexplainable

EMNIST Explainable Results: Character

Fj    Property    Vote   E(j, vote)   X_j
F1    Stroke       C       0.964      1.0
F2    Circle       C       0.114      1.0
F3    Crossing     C       0.056      1.0
F4    Ellipse      T       0.009      1.0
F5    Ell-Cir      C       0.131      1.0
F6    Endpoint     C       0.574      1.0
F7    Enc. Reg.    X       0.005      1.0
F8    Line         U       0.244      1.0
F9    Con. Hull    C       0.603      1.0
F10   Corner       C       0.369      1.0
F11   Unexp.       C       0.989      0.0

                       c = C    c = T    c = U    c = X
$WE(c)$                3.801    0.009    0.244    0.005
$\sum{E(j,c)X_j}$      2.812    0.009    0.244    0.005
Confidence             73.6%    0.02%    6.02%    0.01%
Explainability         74.0%    100%     100%     100%

Metrics and PBE


$E_{PARS}$ as Effectiveness



\[ E_{PARS} = P \cdot ACC \cdot R \cdot S = \frac{TN \cdot TP^3 + TN^2 \cdot TP^2}{(TN{+}FP)(TP{+}FP)(TP{+}FN)(TP{+}TN{+}FP{+}FN)} \]
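The product and single-fraction forms of $E_{PARS}$ are algebraically identical, which is easy to check numerically (a sketch; the counts below are arbitrary):

```python
def e_pars(tp, tn, fp, fn):
    """E_PARS = Precision * Accuracy * Recall * Specificity."""
    p = tp / (tp + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    r = tp / (tp + fn)
    s = tn / (tn + fp)
    return p * acc * r * s

def e_pars_closed(tp, tn, fp, fn):
    """Equivalent single-fraction form of E_PARS."""
    num = tn * tp**3 + tn**2 * tp**2
    den = (tn + fp) * (tp + fp) * (tp + fn) * (tp + tn + fp + fn)
    return num / den

tp, tn, fp, fn = 90, 950, 10, 5
print(abs(e_pars(tp, tn, fp, fn) - e_pars_closed(tp, tn, fp, fn)) < 1e-12)
```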

Performance of Metrics as Effectiveness on Handwriting

Hardware Trojans

Rare Event Hardware Trojan


Problem

  • Static trojan detection using netlist features
    1. LGFi - Logic gate fanin
    2. FFi - Flip-flop input
    3. FFo - Flip-flop output
    4. PI - Primary input
    5. PO - Primary output
  • Highly imbalanced dataset
  • ML trained to make decisions
  • Trust in the decisions is lacking - Need Explanations

Hardware Trojan Results

Hardware Trojan - Data Processing


Dataset Characterization

  • 15 Trust-hub netlists - 52k entries
  • Five Features
  • Two Classes: Trojan and Non-Trojan
  • Highly Imbalanced data
  • Trojan: Non-Trojan - 1:250
  • Many Duplicates

Training and Test

  • 80% used for training
  • 20% used for test

PBE Architecture - Trojans

Property = grouping of features

PBE Example

Sample


Output

Properties

Case-Based Explainable (CBE) Method

CBE Method


Intent: Explain decisions by providing evidence about similar training cases.

Inspiration: Work by Caruana et al. Case-based explanation of non-case-based learning methods.

Consider training samples as case precedents. Similar training cases should support a decision.

CBE does not explain the model's internal behavior; rather, it shows which of the cases used to train the model are similar to the input.

CBE Method - Steps



CBE - Steps Detail

Train ML Model
Training Index

Query Scheme

Explanation Routine



Weight of Neighbors and Correspondence

Weight of Neighbors, $WN(c)$, sums over the $k$ nearest neighbors whose class $c_i$ equals $c$, where $d_i$ is the distance to neighbor $i$ and $bf(c)$ is a class balance factor:

\[ WN(c) = \sum_{\substack{i \in k \\ c_i = c}} \frac{bf(c)}{(d_i+1.0)^2} \]

\[ Corr(c) = \frac{WN(c)}{\sum\limits_{c_j \in k} WN(c_j)} \]
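The weighting and correspondence computation can be sketched as follows (the neighbor list and balance factors are hypothetical; nearer neighbors weigh more via $1/(d+1)^2$):

```python
def correspondence(neighbors, bf):
    """Correspondence among the k nearest training cases.
    neighbors: list of (class_label, distance) pairs for the k
    retrieved cases; bf maps each class to its balance factor bf(c)."""
    WN = {}
    for c, d in neighbors:                      # WN(c): inverse-square
        WN[c] = WN.get(c, 0.0) + bf[c] / (d + 1.0) ** 2
    total = sum(WN.values())
    return {c: w / total for c, w in WN.items()}  # Corr(c)

# Five neighbours, equal balance factors: four vote "4", one votes "9"
nbrs = [("4", 0.0), ("4", 0.5), ("4", 1.0), ("4", 1.0), ("9", 0.2)]
corr = correspondence(nbrs, {"4": 1.0, "9": 1.0})
print(max(corr, key=corr.get))  # 4
```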

CBE Architecture - Handwriting Results


Aggregate - Correspondence - 97.7%

CBE Architecture - Results for Digit



SVM Prediction: four

Correspondence = 92.3%

Alternatives: nine with 7.7% correspondence

CBE Architecture - Hardware Trojan Results


Aggregate - Correspondence - 97.4%

CBE Example

Sample


Output

SVM - Trojan

Conclusions

Conclusions - Method Strengths

  • PBE worked well on explaining handwritten digits
  • PBE is well suited for high-dimensional datasets
  • CBE worked well on explaining both handwriting and HW trojans
  • CBE accuracy can be among the best

Conclusions - Weaknesses

  • Explainable properties are difficult to elicit from a low-dimensional feature space
  • Marginal explainability in the PBE architecture with trojans
  • The PBE method is much more involved to implement
  • CBE takes longer to execute due to searching for neighbor cases

Conclusion - Continued

Important User Questions Driving XAI:


We successfully addressed these questions with evidence in the links above.

Final Conclusions

  • CBE outperformed PBE - better accuracy and explanations
  • Examples of how to answer all of the important user questions
  • Research in four published papers with contributions
    • Two explainable methods
    • Effectiveness and new EPARS metric
    • Confidence metric from PBE decisions
    • Quantifying explainability with a mix of explainability and unexplainability
    • Correspondence between neighbors in CBE
    • When the system can fail with FDR

Future Work

  • More applications for the methods
  • Generalizing the property based method
  • Scaling the case-based method to larger datasets
  • Expanding the explainable interface for user questions/interrogation.
    Large Language Models on the knowledgebase or training index

Live Examples of Explainable MNIST Recognition