Journal of Advances in Developmental Research

E-ISSN: 0976-4844     Impact Factor: 9.71

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal


Operationalizing Imbalance: Boundary vs. Density-Based Sampling in Extreme and Moderate Credit Risk Scenarios

Author(s) Sai Prashanth Pathi
Country United States
Abstract Machine learning models in credit risk are frequently compromised by class imbalance, where fraud cases or defaults represent a negligible fraction of the population. While the Synthetic Minority Over-sampling Technique (SMOTE) is the de facto standard for addressing this problem, recent literature suggests it may introduce noise and computational overhead without operational gain. This study conducts a comparative analysis of six strategies applied to Gradient Boosted Decision Trees (XGBoost): Cost-Sensitive Learning (Baseline), Random Undersampling (RUS), Vanilla SMOTE, Borderline-SMOTE, ADASYN, and SMOTE-ENN. The evaluation uses three datasets with varying imbalance ratios (from 0.17% to 6.9%) to test robustness across different financial contexts. Our experiments reveal two critical insights. First, we identify a "False Positive Trap" in extreme imbalance scenarios: while Vanilla SMOTE achieved the highest Area Under the Precision-Recall Curve (AUPRC: 0.825), it degraded the F1-score to 0.282, rendering it operationally unviable. In contrast, Borderline-SMOTE maintained a comparable AUPRC (0.818) while achieving a far higher F1-score (0.690). Second, we detect a "Threshold of Necessity": in scenarios with moderate imbalance (>5%), no sampling technique outperformed the cost-sensitive baseline, suggesting that synthetic sampling is counter-productive when sufficient minority examples exist.
Keywords Credit Risk, Class Imbalance, Fraud Detection, XGBoost, SMOTE, Anomaly Detection.
Field Engineering
Published In Volume 16, Issue 2, July-December 2025
Published On 2025-12-12
Cite This Operationalizing Imbalance: Boundary vs. Density-Based Sampling in Extreme and Moderate Credit Risk Scenarios - Sai Prashanth Pathi - IJAIDR Volume 16, Issue 2, July-December 2025. DOI 10.71097/IJAIDR.v16.i2.1691
DOI https://doi.org/10.71097/IJAIDR.v16.i2.1691
Short DOI https://doi.org/hbphzs
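
Note on the metrics in the abstract: AUPRC depends only on how a model ranks cases, whereas F1 depends on a fixed decision threshold, so the two can diverge. The sketch below illustrates this on invented toy data. It is not the paper's code or data. The 0.5 threshold, the function names, and the "shifted" scores are assumptions made for illustration, and the abstract does not state that a score shift is the cause of the reported "False Positive Trap".

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Assumes scores are distinct (tied scores are not grouped).
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank cases from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / n_pos


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1-score after flagging every case with score >= threshold."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        flagged = s >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged and y == 0:
            fp += 1
        elif not flagged and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (invented): 2 positives among 10 cases.
y_true = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
raw_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1]

# Hypothetical upward shift that preserves ranking: every score lands above 0.5.
shifted_scores = [0.5 + s / 2 for s in raw_scores]

if __name__ == "__main__":
    for name, scores in [("raw", raw_scores), ("shifted", shifted_scores)]:
        ap = average_precision(y_true, scores)
        f1 = f1_at_threshold(y_true, scores)
        print(f"{name}: AUPRC={ap:.3f}  F1@0.5={f1:.3f}")
```

Because the shift keeps the ordering of cases intact, the ranking-based metric is unchanged while the thresholded metric drops as more negatives are flagged. This is why a high AUPRC alone does not guarantee an operationally usable classifier.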
