Model Drift and Data Quality Dashboard (Credit Card Default, UCI Dataset, 2005)

This dashboard visualizes real model performance and data quality metrics using the UCI Default of Credit Card Clients dataset (30,000 clients, Taiwan, 2005)[1][2][4][6]. It is designed to monitor model health and data integrity in credit risk modeling.
  • Model Accuracy: Logistic regression achieves 0.819 accuracy (10-fold cross-validation)[2][4].
  • Feature Drift: Shows the difference in mean values of selected features between clients who defaulted and those who did not.
  • Pipeline Data Quality: Reports actual missing values and outlier rates in key variables.
Source: UCI ML Repository (Yeh & Lien, 2009)
Model Accuracy (Logistic Regression):
  • 10-fold cross-validation: 0.819 (Yeh & Lien, 2009)[2][4].
  • Other models: SVM (0.817), Decision Tree (0.818), Random Forest (0.819)[4].
Source: Yeh & Lien, 2009; GitHub analysis[4][6]
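
For readers who want to reproduce a 10-fold cross-validated accuracy figure like the one above, the procedure can be sketched roughly as follows. This is a minimal standard-library sketch, not the pipeline behind the reported numbers: the helper names (k_fold_indices, cross_val_accuracy, fit_majority, predict_majority) are hypothetical, and a majority-class baseline stands in for logistic regression so the example stays self-contained. Any model that supplies a fit and a predict function could be passed in instead.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle 0..n_samples-1 and split the indices into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(X, y, fit, predict, k=10, seed=0):
    """Average held-out accuracy over k folds (each fold is held out once)."""
    scores = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Hypothetical stand-in model: always predict the most common training label.
# It replaces logistic regression here purely to keep the sketch dependency-free.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model
```

Calling cross_val_accuracy(X, y, fit_majority, predict_majority) on the real data would give a baseline to compare against the reported model accuracies, since a classifier is only informative to the extent that it beats the majority-class rate.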
Feature Means by Default Status (UCI Dataset):
  • LIMIT_BAL: Credit limit is much lower among defaulters.
  • AGE: Defaulters are slightly younger on average.
  • PAY_0: Defaulters have longer recent payment delays (PAY_0 is the most recent repayment status, September 2005).
Source: UCI dataset summary statistics[2][4][6]
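
The comparison of feature means by default status described above might be computed along these lines. This is only a sketch: the function name mean_by_default and the target key "default" are assumptions (the dataset's own target column has a longer name), and the rows are made-up illustrative values, not figures from the UCI data.

```python
from statistics import mean


def mean_by_default(rows, feature, target="default"):
    """Return (mean for non-defaulters, mean for defaulters, difference)."""
    non_default = [r[feature] for r in rows if r[target] == 0]
    default = [r[feature] for r in rows if r[target] == 1]
    m0, m1 = mean(non_default), mean(default)
    return m0, m1, m1 - m0


# Made-up rows for illustration only; these are NOT values from the dataset.
rows = [
    {"LIMIT_BAL": 200000, "AGE": 40, "PAY_0": 0, "default": 0},
    {"LIMIT_BAL": 100000, "AGE": 30, "PAY_0": -1, "default": 0},
    {"LIMIT_BAL": 50000, "AGE": 25, "PAY_0": 2, "default": 1},
    {"LIMIT_BAL": 30000, "AGE": 35, "PAY_0": 1, "default": 1},
]

for feature in ("LIMIT_BAL", "AGE", "PAY_0"):
    m0, m1, diff = mean_by_default(rows, feature)
    print(f"{feature}: non-default={m0}, default={m1}, diff={diff}")
```

A negative difference means the feature is lower on average among defaulters, and a positive one means it is higher. Raw differences are in each feature's own units, so they are not directly comparable across features without scaling.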
Data Quality: Missing and Outlier Rates
  • Missing values: none found (0% in all key columns).
  • Outlier rates (share of values more than 3 standard deviations from the mean) are shown for selected variables.
Source: UCI dataset EDA[4][6]
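
The two data quality metrics above can be sketched as simple per-column functions. This is an illustrative sketch under a few assumptions: missing entries are represented as None, the standard deviation is the population standard deviation, each column is non-empty, and the function names missing_rate and outlier_rate are hypothetical rather than taken from the dashboard's code.

```python
from statistics import mean, pstdev


def missing_rate(values):
    """Fraction of entries that are None (treated here as missing)."""
    return sum(v is None for v in values) / len(values)


def outlier_rate(values, n_sigma=3.0):
    """Fraction of non-missing values more than n_sigma std devs from the mean."""
    present = [v for v in values if v is not None]
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        # A constant column has no spread, so nothing counts as an outlier.
        return 0.0
    return sum(abs(v - mu) > n_sigma * sigma for v in present) / len(present)
```

Because the mean and standard deviation are themselves pulled by extreme values, the 3-sigma rule tends to undercount outliers in heavily skewed columns such as credit limits or bill amounts, so the resulting rates are best read as a rough indicator.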
All figures are computed from the actual UCI Default of Credit Card Clients dataset. Use this dashboard to benchmark model and data pipeline health in real-world credit risk modeling.
Data: UCI ML Repository (Yeh & Lien, 2009)[2][1]. See also: GitHub Analysis[4].

Model Drift and Data Quality Dashboard (2025)