Decision Threshold Explorer

Adjust the decision threshold τ of a logistic regression welfare scoring model and observe the resulting TPR, FPR, precision, and F1. A sub-group comparison panel reveals how the same τ produces different error rates for majority (Group A) and minority (Group B) claimants — the sociotechnical mechanism of algorithmic disparity.
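The thresholding step itself is simple: the model outputs a risk score in [0, 1], and a claimant is flagged when the score meets or exceeds τ. A minimal sketch of the metric computation (the example scores and labels here are illustrative, not the explorer's dataset):

```python
# Apply a decision threshold tau to model risk scores and compute
# TPR, FPR, precision, and F1 against ground-truth labels (1 = high-risk).
def metrics_at_threshold(scores, labels, tau):
    flagged = [s >= tau for s in scores]
    tp = sum(f and y for f, y in zip(flagged, labels))
    fp = sum(f and not y for f, y in zip(flagged, labels))
    fn = sum((not f) and y for f, y in zip(flagged, labels))
    tn = sum((not f) and (not y) for f, y in zip(flagged, labels))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tpr * precision / (tpr + precision) if tpr + precision else 0.0
    return tpr, fpr, precision, f1

# Illustrative data, not the explorer's 300-claimant dataset:
scores = [0.9, 0.7, 0.6, 0.4, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(metrics_at_threshold(scores, labels, 0.5))
```

Raising τ trades recall for precision: fewer low-risk claimants are wrongly flagged, but more genuinely high-risk claimants slip through.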

Overall metrics at τ = 0.50:

- True positive rate: 80.0% (64 of 80 high-risk claimants caught)
- False positive rate: 17.7% (39 of 220 low-risk claimants wrongly flagged)
- Precision: 62.1% (share of flagged claimants who are truly high-risk)
- F1 score: 69.9% (harmonic mean of TPR and precision)
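These headline figures can be reproduced directly from the confusion counts the cards report (64 true positives out of 80 high-risk claimants, 39 false positives out of 220 low-risk claimants):

```python
# Confusion counts reported by the explorer at tau = 0.50.
tp, fn = 64, 80 - 64    # high-risk claimants: caught vs missed
fp, tn = 39, 220 - 39   # low-risk claimants: wrongly flagged vs correctly cleared

tpr = tp / (tp + fn)                          # 0.800
fpr = fp / (fp + tn)                          # 39/220 ~ 0.177
precision = tp / (tp + fp)                    # 64/103 ~ 0.621
f1 = 2 * tpr * precision / (tpr + precision)  # ~ 0.699
print(f"TPR {tpr:.1%}  FPR {fpr:.1%}  Precision {precision:.1%}  F1 {f1:.1%}")
```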

ROC curve (figure: TPR vs FPR as τ sweeps from 0 to 1)

Group A vs Group B

Group B has a higher base rate of genuinely high-risk claimants (40% vs 20%) but worse model calibration, reflecting label bias.

                     Group A    Group B
TPR (sensitivity)      92.5%      67.5%
FPR (false alarms)     13.1%      30.0%
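The per-group rates are consistent with the overall numbers. Group A has 200 claimants at a 20% base rate (40 high-risk, 160 low-risk) and Group B has 100 claimants at a 40% base rate (40 high-risk, 60 low-risk); the confusion counts implied by the reported rates (rounded to whole claimants) recombine into the overall TPR and FPR:

```python
# Per-group confusion counts implied by the reported rates.
# Group A: 200 claimants, 20% base rate -> 40 high-risk, 160 low-risk.
# Group B: 100 claimants, 40% base rate -> 40 high-risk, 60 low-risk.
groups = {
    "A": {"tp": 37, "fn": 3,  "fp": 21, "tn": 139},  # TPR 92.5%, FPR ~13.1%
    "B": {"tp": 27, "fn": 13, "fp": 18, "tn": 42},   # TPR 67.5%, FPR 30.0%
}
tp = sum(g["tp"] for g in groups.values())  # 37 + 27 = 64
fp = sum(g["fp"] for g in groups.values())  # 21 + 18 = 39
fn = sum(g["fn"] for g in groups.values())  # 3 + 13 = 16
tn = sum(g["tn"] for g in groups.values())  # 139 + 42 = 181
print(tp / (tp + fn))  # overall TPR: 0.8
print(fp / (fp + tn))  # overall FPR: 39/220
```

The same τ = 0.50 catches 37 of Group A's 40 high-risk claimants but only 27 of Group B's 40, which is how a single "neutral" threshold produces unequal error rates across groups.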

Score distribution — green = genuinely high-risk · blue = genuinely low-risk


Decision threshold τ = 0.50. Dataset: 300 synthetic claimants (200 Group A, 100 Group B). Model AUC = 0.909. Overall: TPR 80.0%, FPR 17.7%, Precision 62.1%, F1 69.9%. Group A: TPR 92.5%, FPR 13.1%. Group B: TPR 67.5%, FPR 30.0%.