SHAP Instability Demonstrator

Perturb the feature values of a near-threshold welfare claimant and watch the SHAP attributions shift while the predicted score stays approximately constant. The demonstration shows that SHAP explanations are properties of a local approximation algorithm, not of the model's internal representations.
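The effect described above can be reproduced offline with a minimal sketch. It assumes a hypothetical toy_score model (not the demonstrator's actual model), an all-zero reference input, and made-up feature values. Shapley values are computed exactly by enumerating feature subsets rather than through the shap library.

```python
from itertools import combinations
from math import factorial


# Hypothetical toy model, NOT the demonstrator's model: linear terms plus
# one income x duration interaction. Feature order: age, income score,
# benefit duration, household size.
def toy_score(x):
    age, income, duration, household = x
    return 0.2 + 0.1 * age + 0.2 * income + 0.3 * income * duration + 0.05 * household


def shapley_values(f, x, reference):
    """Exact Shapley values of f at x relative to a single reference point."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the claimant's value; the rest take the reference's.
        return f([x[i] if i in subset else reference[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that excludes feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


reference = [0.0, 0.0, 0.0, 0.0]  # assumed reference input
claimant = [1.0, 0.5, 0.5, 1.0]   # made-up claimant
perturbed = [0.5, 0.5, 0.5, 2.0]  # age lowered, household size raised

phi_claimant = shapley_values(toy_score, claimant, reference)
phi_perturbed = shapley_values(toy_score, perturbed, reference)

delta_score = abs(toy_score(perturbed) - toy_score(claimant))
delta_shap = max(abs(a - b) for a, b in zip(phi_claimant, phi_perturbed))
print(f"delta score: {delta_score:.4f}, max delta SHAP: {delta_shap:.4f}")
```

In this toy setup the two inputs receive the same score, yet the attributions for age and household size each move by about 0.05. The attribution vector changes even though the model's output does not.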

Predicted risk score: 0.500, FLAGGED (≥ τ); unchanged from baseline 0.500

SHAP attributions (legend: Current, Baseline)
Age: +0.029
Income score: +0.057
Benefit duration: +0.079
Household size: +0.012
Sum = 0.177 (= score − baseline)
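
The sum line reflects SHAP's additivity property: the attributions add up to the gap between the prediction and a reference value. The following is a quick check of the displayed numbers. The implied reference value is an inference, not something the page shows. It assumes that "baseline" on the sum line means SHAP's base value (the expected model output) rather than the baseline claimant's score of 0.500.

```python
# Attributions exactly as displayed in the panel.
attributions = {
    "Age": 0.029,
    "Income score": 0.057,
    "Benefit duration": 0.079,
    "Household size": 0.012,
}
score = 0.500

total = sum(attributions.values())

# ASSUMPTION: "baseline" in the sum line is SHAP's base value, so it can be
# recovered as score - total. The page does not display this number.
implied_base_value = score - total

print(f"sum of attributions: {total:.3f}")
print(f"implied base value:  {implied_base_value:.3f}")
```

Under that assumption, the two uses of "baseline" on the page refer to different quantities: the unperturbed claimant's score (0.500) and the SHAP reference value that the attributions are measured against.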

Instability

Δ Score: 0.0000
Δ SHAP (max feature): 0.0000

No significant change yet. Adjust a feature to see instability.
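
The two instability readouts could be computed as in the following sketch. It assumes that Δ Score is the absolute change in predicted score and that Δ SHAP (max feature) is the largest absolute per-feature change in attribution. The page does not spell out either definition, and both the instability_metrics helper and the perturbed attribution values are hypothetical.

```python
def instability_metrics(score_base, score_now, shap_base, shap_now):
    """Return (delta_score, delta_shap_max, feature_with_max_change).

    Assumed definitions: absolute score change, and the largest absolute
    per-feature attribution change.
    """
    delta_score = abs(score_now - score_base)
    changes = {name: abs(shap_now[name] - shap_base[name]) for name in shap_base}
    top_feature = max(changes, key=changes.get)
    return delta_score, changes[top_feature], top_feature


# Baseline attributions as displayed on the page.
baseline = {"Age": 0.029, "Income score": 0.057,
            "Benefit duration": 0.079, "Household size": 0.012}

# Unperturbed state: both readouts are zero, matching the panel.
d_score, d_shap, _ = instability_metrics(0.500, 0.500, baseline, baseline)
print(f"{d_score:.4f} {d_shap:.4f}")

# HYPOTHETICAL perturbed attributions (illustrative only, not from the page).
perturbed = {"Age": 0.010, "Income score": 0.095,
             "Benefit duration": 0.060, "Household size": 0.012}
p_score, p_shap, p_feature = instability_metrics(0.500, 0.500, baseline, perturbed)
print(f"{p_score:.4f} {p_shap:.4f} ({p_feature})")
```

Reporting the maximum over features, rather than a sum or mean, highlights the single attribution that moved most, which is the one a reader of the explanation would be most likely to notice.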

Perturb the claimant's features

All four features (Age, Income score, Benefit duration, Household size): 0 from baseline

Predicted score: 0.500 (flagged). Baseline score: 0.500. SHAP values: Age: 0.029, Income score: 0.057, Benefit duration: 0.079, Household size: 0.012. Score change: 0.0000. Maximum SHAP change: 0.0000.