SHAP (SHapley Additive exPlanations)
Overview
SHAP is a unified approach to explain machine learning model outputs using Shapley values from cooperative game theory. This skill provides comprehensive guidance for:
- Computing SHAP values for any model type
- Creating visualizations to understand feature importance
- Debugging and validating model behavior
- Analyzing fairness and bias
- Implementing explainable AI in production
SHAP works with all model types: tree-based models (XGBoost, LightGBM, CatBoost, Random Forest), deep learning models (TensorFlow, PyTorch, Keras), linear models, and black-box models.
When to Use This Skill
Trigger this skill when users ask about:
- "Explain which features are most important in my model"
- "Generate SHAP plots" (waterfall, beeswarm, bar, scatter, force, heatmap, etc.)
- "Why did my model make this prediction?"
- "Calculate SHAP values for my model"
- "Visualize feature importance using SHAP"
- "Debug my model's behavior" or "validate my model"
- "Check my model for bias" or "analyze fairness"
- "Compare feature importance across models"
- "Implement explainable AI" or "add explanations to my model"
- "Understand feature interactions"
- "Create model interpretation dashboard"
Quick Start Guide
Step 1: Select the Right Explainer
Decision Tree:
- Tree-based model? (XGBoost, LightGBM, CatBoost, Random Forest, Gradient Boosting)
  - Use shap.TreeExplainer (fast, exact)
- Deep neural network? (TensorFlow, PyTorch, Keras, CNNs, RNNs, Transformers)
  - Use shap.DeepExplainer or shap.GradientExplainer
- Linear model? (Linear/Logistic Regression, GLMs)
  - Use shap.LinearExplainer (extremely fast)
- Any other model? (SVMs, custom functions, black-box models)
  - Use shap.KernelExplainer (model-agnostic but slower)
- Unsure?
  - Use shap.Explainer (automatically selects a suitable algorithm)
See references/explainers.md for detailed information on all explainer types.
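The decision tree above can be sketched as a small lookup. This is a hypothetical helper: the function name and the category labels ("tree", "deep", and so on) are assumptions made for illustration and are not part of the shap API. It only returns the name of the explainer listed above.

```python
# Hypothetical helper: maps a coarse model category to the explainer
# named in the decision tree. The category labels are illustrative.
EXPLAINER_BY_KIND = {
    "tree": "shap.TreeExplainer",
    "deep": "shap.DeepExplainer or shap.GradientExplainer",
    "linear": "shap.LinearExplainer",
    "other": "shap.KernelExplainer",
}


def suggest_explainer(model_kind: str) -> str:
    """Return the suggested explainer name, falling back to shap.Explainer."""
    return EXPLAINER_BY_KIND.get(model_kind, "shap.Explainer")


print(suggest_explainer("tree"))
print(suggest_explainer("not sure"))
```

The fallback mirrors the "Unsure?" branch: anything that is not recognized is routed to shap.Explainer.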
Step 2: Compute SHAP Values
import shap
import xgboost as xgb
model = xgb.XGBClassifier().fit(X_train, y_train)
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)
Step 3: Visualize Results
For Global Understanding (entire dataset):
shap.plots.beeswarm(shap_values, max_display=15)
shap.plots.bar(shap_values)
For Individual Predictions:
shap.plots.waterfall(shap_values[0])
shap.plots.force(shap_values[0])
For Feature Relationships:
shap.plots.scatter(shap_values[:, "Feature_Name"])
shap.plots.scatter(shap_values[:, "Age"], color=shap_values[:, "Education"])
See references/plots.md for a comprehensive guide to all plot types.
Core Workflows
This skill supports several common workflows. Choose the workflow that matches the current task.
Workflow 1: Basic Model Explanation
Goal: Understand what drives model predictions
Steps:
- Train model and create appropriate explainer
- Compute SHAP values for test set
- Generate global importance plots (beeswarm or bar)
- Examine top feature relationships (scatter plots)
- Explain specific predictions (waterfall plots)
Example:
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)
shap.plots.beeswarm(shap_values)
shap.plots.scatter(shap_values[:, "Most_Important_Feature"])
shap.plots.waterfall(shap_values[0])
Workflow 2: Model Debugging
Goal: Identify and fix model issues
Steps:
- Compute SHAP values
- Identify prediction errors
- Explain misclassified samples
- Check for unexpectedly dominant features (a possible sign of data leakage)
- Validate that feature relationships make sense
- Check feature interactions
See references/workflows.md for detailed debugging workflow.
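As a rough illustration of the leakage check, the sketch below flags any feature that accounts for more than a chosen share of total mean |SHAP|. The helper name, the 0.5 threshold, and the numbers are all assumptions made for this example; a dominant feature is only a prompt for closer inspection, not proof of leakage.

```python
def flag_dominant_features(mean_abs_shap, share_threshold=0.5):
    """Return features whose share of total mean |SHAP| exceeds the threshold."""
    total = sum(mean_abs_shap.values())
    if total == 0:
        return []
    return [
        name
        for name, value in mean_abs_shap.items()
        if value / total > share_threshold
    ]


# Made-up mean |SHAP| per feature, e.g. taken from a bar plot.
mean_abs_shap = {"account_closed_flag": 1.80, "Age": 0.25, "Income": 0.15}
suspects = flag_dominant_features(mean_abs_shap)
print(suspects)
```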
Workflow 3: Feature Engineering
Goal: Use SHAP insights to improve features
Steps:
- Compute SHAP values for baseline model
- Identify nonlinear relationships (candidates for transformation)
- Identify feature interactions (candidates for interaction terms)
- Engineer new features
- Retrain and compare SHAP values
- Validate improvements
See references/workflows.md for detailed feature engineering workflow.
Workflow 4: Model Comparison
Goal: Compare multiple models to select the best interpretable option
Steps:
- Train multiple models
- Compute SHAP values for each
- Compare global feature importance
- Check consistency of feature rankings
- Analyze specific predictions across models
- Select based on accuracy, interpretability, and consistency
See references/workflows.md for detailed model comparison workflow.
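One way to check the consistency of feature rankings is a Spearman rank correlation between the mean |SHAP| rankings of two models. The sketch below is a minimal pure-Python version that assumes both models share the same features, there are at least two features, and there are no tied importances. The importance numbers are invented for illustration.

```python
def rank_features(mean_abs_shap):
    """Map each feature to its rank (1 = most important)."""
    ordered = sorted(mean_abs_shap, key=mean_abs_shap.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman_rank_correlation(importance_a, importance_b):
    """Spearman correlation of two rankings (no ties, same feature set)."""
    ranks_a = rank_features(importance_a)
    ranks_b = rank_features(importance_b)
    n = len(ranks_a)
    squared_diffs = sum((ranks_a[name] - ranks_b[name]) ** 2 for name in ranks_a)
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Made-up mean |SHAP| values for two models trained on the same features.
model_a = {"Age": 0.30, "Income": 0.25, "Education": 0.10, "Tenure": 0.05}
model_b = {"Age": 0.28, "Income": 0.12, "Education": 0.20, "Tenure": 0.02}
rho = spearman_rank_correlation(model_a, model_b)
print(f"rank correlation: {rho:.2f}")
```

A value near 1 suggests the models agree on which features matter most; a low or negative value is a cue to compare individual predictions.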
Workflow 5: Fairness and Bias Analysis
Goal: Detect and analyze model bias across demographic groups
Steps:
- Identify protected attributes (gender, race, age, etc.)
- Compute SHAP values
- Compare feature importance across groups
- Check protected attribute SHAP importance
- Identify proxy features
- Implement mitigation strategies if bias found
See references/workflows.md for detailed fairness analysis workflow.
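The step "compare feature importance across groups" can be sketched without any plotting: average the absolute SHAP values per feature within each group. The function name and the tiny dataset below are assumptions for illustration; in practice the rows would come from the computed SHAP values and the group labels from the protected attribute.

```python
from collections import defaultdict


def mean_abs_shap_by_group(shap_rows, groups, feature_names):
    """Return {group: {feature: mean |SHAP|}} for rows labeled by group."""
    sums = defaultdict(lambda: [0.0] * len(feature_names))
    counts = defaultdict(int)
    for row, group in zip(shap_rows, groups):
        counts[group] += 1
        for j, value in enumerate(row):
            sums[group][j] += abs(value)
    return {
        group: {
            name: sums[group][j] / counts[group]
            for j, name in enumerate(feature_names)
        }
        for group in sums
    }


# Made-up SHAP values: one row per sample, one column per feature.
feature_names = ["Gender", "Income"]
shap_rows = [[0.25, -0.5], [-0.25, 0.5], [0.0, 0.75], [0.0, -0.25]]
groups = ["A", "A", "B", "B"]
by_group = mean_abs_shap_by_group(shap_rows, groups, feature_names)
for group, importances in by_group.items():
    print(group, importances)
```

In this toy data the protected attribute carries weight only for group A, which is the kind of asymmetry the workflow is meant to surface.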
Workflow 6: Production Deployment
Goal: Integrate SHAP explanations into production systems
Steps:
- Train and save model
- Create and save explainer
- Build explanation service
- Create API endpoints for predictions with explanations
- Implement caching and optimization
- Monitor explanation quality
See references/workflows.md for detailed production deployment workflow.
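The caching step might look roughly like the sketch below. ExplanationService and fake_explain are hypothetical names introduced here; fake_explain stands in for a real explainer call and returns invented numbers. The cache is keyed on the exact feature tuple, which only helps when identical inputs recur.

```python
class ExplanationService:
    """Hypothetical wrapper that caches explanations per input row."""

    def __init__(self, explain_fn, max_entries=1024):
        self._explain_fn = explain_fn
        self._max_entries = max_entries
        self._cache = {}
        self.cache_hits = 0

    def explain(self, features):
        key = tuple(features)
        if key in self._cache:
            self.cache_hits += 1
            return self._cache[key]
        result = self._explain_fn(features)
        if len(self._cache) >= self._max_entries:
            # Evict the oldest entry (dicts preserve insertion order).
            self._cache.pop(next(iter(self._cache)))
        self._cache[key] = result
        return result


def fake_explain(features):
    # Stand-in for an explainer call; the contributions are made up.
    return {"base_value": 0.0, "contributions": [0.1 * v for v in features]}


service = ExplanationService(fake_explain)
first = service.explain([1.0, 2.0])
second = service.explain([1.0, 2.0])  # served from the cache
print(service.cache_hits)
```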
Key Concepts
SHAP Values
Definition: SHAP values quantify each feature's contribution to a prediction, measured as the deviation from the expected model output (baseline).
Properties:
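- Additivity: SHAP values sum to the difference between the prediction and the baseline
- Fair attribution: Credit is divided using Shapley values from cooperative game theory, so features that contribute equally receive equal values
- Consistency: If a model changes so that a feature's marginal contribution increases or stays the same, its SHAP value does not decrease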
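To make these properties concrete, the sketch below computes exact Shapley values for a three-feature toy function by enumerating every subset of features. It is an illustration of the underlying definition, not how the shap library computes values internally. The toy model, the all-zeros baseline, and the choice to replace "missing" features with baseline values are assumptions made for this example.

```python
from itertools import combinations
from math import factorial


def toy_model(a, b, c):
    # One additive term plus one interaction term.
    return 2 * a + b * c + 1


def exact_shapley(model, x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)

    def value(subset):
        args = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(*args)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


x = (1.0, 2.0, 3.0)
baseline = (0.0, 0.0, 0.0)
phi = exact_shapley(toy_model, x, baseline)
base_value = toy_model(*baseline)
prediction = toy_model(*x)
print(phi, base_value, prediction)
```

The values sum to prediction minus base_value (additivity), and the interaction term b * c is split evenly between b and c because they contribute symmetrically to it.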
Interpretation:
- Positive SHAP value → Feature pushes prediction higher
- Negative SHAP value → Feature pushes prediction lower
- Magnitude → Strength of feature's impact
- Sum of SHAP values → Total prediction change from baseline
Example:
Baseline (expected value): 0.30
Feature contributions (SHAP values):
Age: +0.15
Income: +0.10
Education: -0.05
Final prediction: 0.30 + 0.15 + 0.10 - 0.05 = 0.50
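The same arithmetic, written out as a short script that reuses the numbers from the example above. The variable names are illustrative. It rebuilds the prediction from the baseline and ranks the contributions by magnitude.

```python
base_value = 0.30
contributions = {"Age": 0.15, "Income": 0.10, "Education": -0.05}

# Additivity: baseline plus all contributions gives the prediction.
prediction = base_value + sum(contributions.values())

# Rank by magnitude; the sign gives the direction of the push.
ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranked:
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {direction} the prediction by {abs(value):.2f}")
print(f"prediction = {prediction:.2f}")
```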
Background Data / Baseline
Purpose: Represents "typical" input to establish baseline expectations
Selection:
- Random sample from training data (50-1000 samples)
- Or use kmeans to select representative samples
- For DeepExplainer/KernelExplainer: 100-1000 samples balances accuracy and speed
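Impact: The baseline sets the expected value that SHAP values are measured against. Changing it shifts SHAP value magnitudes and, when the background distribution differs substantially, can also change feature rankings.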
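The effect of the background data is easiest to see for a linear model with independent features, where the SHAP value of feature i reduces to w_i * (x_i - mean of feature i over the background). The sketch below applies that closed form to one input under two invented background sets. The weights, bias, and data are assumptions for illustration, and the code does not call the shap library.

```python
from statistics import mean

weights = [2.0, -1.0]
bias = 0.5


def linear_model(row):
    return bias + sum(w * v for w, v in zip(weights, row))


def linear_shap(row, background):
    """Closed-form SHAP values for a linear model with independent features."""
    feature_means = [mean(column) for column in zip(*background)]
    base_value = mean(linear_model(r) for r in background)
    contributions = [
        w * (v - m) for w, v, m in zip(weights, row, feature_means)
    ]
    return base_value, contributions


x = [3.0, 4.0]
background_a = [[1.0, 1.0], [3.0, 3.0]]  # feature means: 2.0, 2.0
background_b = [[3.0, 0.0], [3.0, 2.0]]  # feature means: 3.0, 1.0

base_a, phi_a = linear_shap(x, background_a)
base_b, phi_b = linear_shap(x, background_b)
print(base_a, phi_a)
print(base_b, phi_b)
```

Both runs explain the same prediction, but the expected value and the individual contributions differ, which is why the background set should be chosen deliberately and kept fixed when comparing explanations.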
Model Output Types
Critical Consideration: Understand what your model outputs
- Raw output: For regression or tree margins
- Probability: For classification probability
- Log-odds: For logistic regression (before sigmoid)
Example: XGBoost classifiers explain margin output (log-odds) by default. To explain probabilities, pass model_output="probability" to TreeExplainer; this mode also needs a background dataset and feature_perturbation="interventional".
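When SHAP values are in log-odds units, they add up in log-odds space, and only the total is converted to a probability through the sigmoid. The sketch below shows that conversion with invented numbers; the base value and contributions are assumptions, not output from a real model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Made-up explanation in log-odds (margin) units.
base_log_odds = -0.85
shap_log_odds = {"Age": 0.60, "Income": 0.45, "Education": -0.20}

# Contributions are additive in log-odds space.
margin = base_log_odds + sum(shap_log_odds.values())
probability = sigmoid(margin)
baseline_probability = sigmoid(base_log_odds)

print(f"margin = {margin:.2f}, probability = {probability:.2f}")
print(f"baseline probability = {baseline_probability:.2f}")
```

Because the sigmoid is nonlinear, an individual log-odds contribution cannot be read directly as a change in probability; only the summed margin maps cleanly to one.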
Common Patterns
Pattern 1: Complete Model Analysis
import numpy as np

explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)
# Global importance
shap.plots.beeswarm(shap_values)
shap.plots.bar(shap_values)
# Five features with the largest mean |SHAP| (ascending order)
top_features = X_test.columns[np.abs(shap_values.values).mean(0).argsort()[-5:]]
for feature in top_features:
    shap.plots.scatter(shap_values[:, feature])
# Local explanations for the first five samples
for i in range(5):
    shap.plots.waterfall(shap_values[i])
Pattern 2: Cohort Comparison
cohort1_mask = X_test['Group'] == 'A'
cohort2_mask = X_test['Group'] == 'B'
shap.plots.bar({
    "Group A": shap_values[cohort1_mask],
    "Group B": shap_values[cohort2_mask]
})
Pattern 3: Debugging Errors
errors = model.predict(X_test) != y_test
error_indices = np.where(errors)[0]
# Explain the first five misclassified samples
for idx in error_indices[:5]:
    print(f"Sample {idx}:")
    shap.plots.waterfall(shap_values[idx])
shap.plots.scatter(shap_values[:, "Suspicious_Feature"])
Performance Optimization
Speed Considerations
Explainer Speed (fastest to slowest):
LinearExplainer - Nearly instantaneous
TreeExplainer - Very fast
DeepExplainer - Fast for neural networks
GradientExplainer -