SHAP Instability Demonstrator
Perturb the feature values of a near-threshold welfare claimant and watch SHAP attributions shift while the predicted score stays approximately constant. Demonstrates that SHAP explanations are properties of a local approximation algorithm — not the model's internal representations.
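The effect described above can be reproduced on a toy model. The following is a minimal sketch, not the demonstrator's own code: the `model` function, its coefficients, the `reference` point and both claimant records are hypothetical, and the attributions are exact Shapley values computed against a single reference point rather than output from the `shap` library.

```python
from itertools import combinations
from math import factorial

FEATURES = ["age", "income_score", "benefit_duration", "household_size"]

def model(x):
    # Hypothetical risk score with an income x duration interaction.
    return (0.1 * x["age"]
            + 0.4 * x["income_score"] * x["benefit_duration"]
            + 0.05 * x["household_size"])

def shapley(x, ref):
    """Exact Shapley values of model(x) relative to one reference point."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` take the claimant's value, the rest the reference's.
        return model({f: (x[f] if f in subset else ref[f]) for f in FEATURES})

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {f}) - value(set(s)))
        phi[f] = total
    return phi

# Hypothetical reference point and two claimants with the same score.
reference = {"age": 0.0, "income_score": 0.5, "benefit_duration": 0.5, "household_size": 0.0}
claimant = {"age": 1.0, "income_score": 1.0, "benefit_duration": 1.0, "household_size": 1.0}
perturbed = {"age": 1.0, "income_score": 2.0, "benefit_duration": 0.5, "household_size": 1.0}

phi_before = shapley(claimant, reference)
phi_after = shapley(perturbed, reference)

print("score before:", model(claimant), "score after:", model(perturbed))
for f in FEATURES:
    print(f, round(phi_before[f], 4), "->", round(phi_after[f], 4))
```

In this sketch the two records receive the same score because the product of income score and benefit duration is unchanged, yet the attribution for benefit duration collapses while the one for income score grows. Choosing a different reference point would change both sets of attributions again, which is the sense in which the explanation depends on the explanation procedure as well as on the model.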
Predicted risk score: 0.500 (= baseline 0.500)
FLAGGED (≥ τ)
SHAP attributions
Legend: Current | Baseline
Sum = 0.177 (= score − baseline)
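The sum shown above follows from the additivity (efficiency) property of Shapley values. As a worked check using the four attributions reported in the page's text description, with $\phi_i$ denoting the attribution of feature $i$ and $\phi_0$ the explainer's base value (symbols introduced here for illustration):

```latex
\begin{align*}
\sum_{i=1}^{4} \phi_i &= 0.029 + 0.057 + 0.079 + 0.012 = 0.177, \\
f(x) - \phi_0 &= \sum_{i=1}^{4} \phi_i .
\end{align*}
```

This reading assumes that "baseline" in the sum refers to the explainer's base value $\phi_0$ and not to the claimant's unperturbed score of 0.500. If that assumption holds, the implied base value would be $0.500 - 0.177 = 0.323$; the page does not report this number directly.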
Instability
Δ Score: 0.0000
Δ SHAP (max feature): 0.0000
No significant change yet. Adjust a feature to see instability.
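One plausible way to compute the two readouts above is sketched here. It assumes that both are absolute differences against the baseline state and that "max feature" means the feature whose attribution moved the most; the page does not confirm either. The baseline attributions are taken from the page's text description, while the `perturbed_shap` values are hypothetical.

```python
def instability(score_base, score_now, shap_base, shap_now):
    """Return (delta_score, most_changed_feature, max_delta_shap)."""
    delta_score = abs(score_now - score_base)
    # Absolute attribution change per feature; keep the largest.
    changes = {f: abs(shap_now[f] - shap_base[f]) for f in shap_base}
    feature = max(changes, key=changes.get)
    return delta_score, feature, changes[feature]

# Baseline attributions as reported in the page's text description.
baseline_shap = {
    "Age": 0.029,
    "Income score": 0.057,
    "Benefit duration": 0.079,
    "Household size": 0.012,
}

# Hypothetical attributions after a perturbation that leaves the score at 0.500.
perturbed_shap = {
    "Age": 0.034,
    "Income score": 0.087,
    "Benefit duration": 0.044,
    "Household size": 0.012,
}

d_score, feature, d_shap = instability(0.500, 0.500, baseline_shap, perturbed_shap)
print(f"delta score = {d_score:.4f}, max delta SHAP = {d_shap:.4f} ({feature})")
```

A large maximum attribution change paired with a near-zero score change is the pattern the demonstrator is built to expose.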
Perturb the claimant's features
Predicted score: 0.500 (flagged). Baseline score: 0.500. SHAP values: Age: 0.029, Income score: 0.057, Benefit duration: 0.079, Household size: 0.012. Score change: 0.0000. Maximum SHAP change: 0.0000.