
Credit Scoring Final Project

1. Credit Scoring Model

In Microfinance Case
Data Science 4.0
Final Project
13.02.2026
Murataliev Ilias
Bishkek, Kyrgyzstan

2. Project Goal

• Build a credit scoring model to predict default
• Optimize cut-off level

3. Dataset Overview

• ~10,000 credit contracts
• Target: Default (1) / Non-default (0)
• Demographic, geographic and loan features
• Target balance: 20/80 (default / non-default)

4. Feature Engineering

• Age instead of birth date
• Dates converted to months
• Seasonality: issue month and quarter
Columns: birth_date, gender, age, issue_date, end_date, repayment_type, loan_amount, interest_rate, loan_purpose, credit_product, collateral_type, region, district, issue_month, issue_quarter, term_months, target
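
The three transformations on this slide can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `engineer_features`, the ISO date-string format, and the choice to measure age at the issue date are all assumptions.

```python
from datetime import date


def engineer_features(row: dict) -> dict:
    """Derive age, term_months, issue_month and issue_quarter from raw dates.

    Assumes dates are ISO strings (YYYY-MM-DD) and that age is measured
    at the loan issue date; neither detail is stated on the slide.
    """
    birth = date.fromisoformat(row["birth_date"])
    issue = date.fromisoformat(row["issue_date"])
    end = date.fromisoformat(row["end_date"])

    # Whole years between birth and issue; subtract 1 if the birthday
    # has not yet occurred in the issue year.
    had_birthday = (issue.month, issue.day) >= (birth.month, birth.day)
    age = issue.year - birth.year - (0 if had_birthday else 1)

    # Loan term expressed in calendar months.
    term_months = (end.year - issue.year) * 12 + (end.month - issue.month)

    # Keep every non-date column and add the derived ones.
    raw_dates = ("birth_date", "issue_date", "end_date")
    features = {k: v for k, v in row.items() if k not in raw_dates}
    features.update(
        age=age,
        term_months=term_months,
        issue_month=issue.month,
        issue_quarter=(issue.month - 1) // 3 + 1,
    )
    return features


example = engineer_features({
    "birth_date": "1990-05-20",
    "issue_date": "2024-03-15",
    "end_date": "2025-03-15",
    "loan_amount": 50000,
})
print(example)
```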

5. Models and Metrics

Models tested:
• Logistic Regression
• LightGBM
• CatBoost (tuned with Optuna)
• XGBoost (tuned with Optuna)
Metrics:
• AUC, Recall, Precision
Data transform:
• One-hot encoding
• Standard scaling
Train/test split:
• 0.7/0.3
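
For reference, the metrics listed above can be computed from true labels and predicted default probabilities as in the sketch below. It uses only the standard library for clarity; the function names are illustrative, and the project itself most likely used library implementations.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random defaulter (1) gets a higher
    score than a random non-defaulter (0); ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(y_true, scores, cutoff, positive=1):
    """Precision and recall for one class at a given probability cut-off.

    A contract is predicted as default (1) when its score >= cutoff.
    """
    pred = [1 if s >= cutoff else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == positive and p == positive)
    predicted = sum(1 for p in pred if p == positive)
    actual = sum(1 for y in y_true if y == positive)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y, p))
print(precision_recall(y, p, cutoff=0.5, positive=1))
print(precision_recall(y, p, cutoff=0.5, positive=0))
```

Reporting precision and recall for both classes, as the results slides do, only requires calling `precision_recall` once with `positive=0` and once with `positive=1`.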

6. Optimal cut off by KS

Metric            Logistic Regression   XGBoost     CatBoost    LightGBM    Ensemble
AUC               0.8143                0.8729      0.8736      0.8656      0.8747
Precision (0/1)   0.94/0.48             0.95/0.55   0.95/0.56   0.93/0.57   0.94/0.57
Recall (0/1)      0.77/0.81             0.83/0.83   0.84/0.81   0.82/0.81   0.85/0.80
F1 (0/1)          0.85/0.60             0.88/0.66   0.89/0.66   0.88/0.64   0.89/0.67
KS                0.58                  0.65        0.65        0.62        0.65
Cut-off           0.568                 0.218       0.223       0.408       0.300
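
A KS-based cut-off is usually taken as the threshold where the gap between the share of defaulters flagged and the share of non-defaulters flagged is largest. The sketch below shows that standard definition under the assumption that a score at or above the threshold means "predicted default"; it is not taken from the project code, and `ks_cutoff` is an illustrative name.

```python
def ks_cutoff(y_true, scores):
    """Return (KS statistic, cut-off) maximizing TPR - FPR.

    TPR: share of defaulters with score >= threshold.
    FPR: share of non-defaulters with score >= threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_ks, best_cut = 0.0, None
    for t in sorted(set(scores)):
        tpr = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t) / n_pos
        fpr = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t) / n_neg
        if tpr - fpr > best_ks:
            best_ks, best_cut = tpr - fpr, t
    return best_ks, best_cut


# Toy data: seven contracts, three of them defaults.
labels = [0, 0, 0, 1, 0, 1, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9]
print(ks_cutoff(labels, probs))
```

This loop is quadratic in the number of contracts, which is acceptable for roughly 10,000 rows; a sorted single pass would be the usual optimization.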

7. Optimal cut off by Profit

Assumptions:
• Loss on default = 100% of loan amount
• Income on a repaid loan = sum of all annuity payments minus loan amount
• Cut-off selected by maximizing profit
Metric          Logistic Regression   XGBoost   CatBoost   LightGBM   Ensemble
Max Profit      3.578                 3.633     3.585      3.593      3.642
Cut-off         0.570                 0.332     0.332      0.693      0.399
Approval rate   0.657                 0.756     0.754      0.785      0.758
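
The profit-based selection can be sketched under the assumptions stated on this slide. Several details are not given in the source and are assumed here: the standard annuity formula with a monthly rate equal to the annual rate divided by 12, approval when the predicted probability is strictly below the cut-off, and the tuple layout of `loans`. The function names are illustrative.

```python
def annuity_income(amount, annual_rate, term_months):
    """Total annuity payments minus principal for a fully repaid loan.

    Assumes a nominal annual rate compounded monthly (an assumption,
    not stated in the source).
    """
    r = annual_rate / 12
    if r == 0:
        return 0.0
    payment = amount * r / (1 - (1 + r) ** -term_months)
    return payment * term_months - amount


def best_profit_cutoff(loans, cutoffs):
    """Pick the cut-off with the highest total profit.

    loans: list of (pd, defaulted, amount, income) tuples.
    A loan is approved when pd < cutoff. An approved default loses
    100% of the amount; an approved good loan earns `income`.
    Returns (profit, cutoff, approval_rate).
    """
    best = None
    for c in cutoffs:
        approved = [loan for loan in loans if loan[0] < c]
        profit = sum(-amt if dflt else inc for _, dflt, amt, inc in approved)
        rate = len(approved) / len(loans)
        if best is None or profit > best[0]:
            best = (profit, c, rate)
    return best


toy_loans = [
    (0.05, 0, 1000, 200),
    (0.20, 0, 1000, 200),
    (0.30, 1, 1000, 200),
    (0.50, 0, 1000, 200),
    (0.80, 1, 1000, 200),
]
print(best_profit_cutoff(toy_loans, [0.1, 0.25, 0.4, 0.6, 0.9]))
```

Because one default wipes out the income from several good loans, the profit-maximizing cut-off depends on interest rates and terms as well as on ranking quality, which is why it can differ from the KS cut-off.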

8. Ensemble cut-off profit dependency

9. Ensemble profit approval rate dependency

10. Credit score visualization by probability of default

Probability of default   Grade   Score       Action             Description
0.0 – 0.05               A       750 – 850   Auto-approval      Ideal clients. Minimal risk.
0.05 – 0.15              B       650 – 749   Approval           Good clients, minor issues.
0.15 – 0.35              C       550 – 649   Additional check   Grey zone. Additional check.
0.35 – 0.60              D       450 – 549   Likely refusal     High risk. Can be approved at the maximum allowed interest rate.
> 0.60                   E       300 – 449   Auto-refusal       Clients with a high probability of default.
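
The banding above can be expressed as a simple lookup. The slide's ranges share their boundary values (0.05 appears in both A and B), so this sketch assumes each upper bound is inclusive, which is consistent with grade E starting strictly above 0.60. The names `BANDS` and `grade_for` are illustrative.

```python
# (upper PD bound, grade, score range) taken from the table on this slide.
# Treating the upper bound as inclusive is an assumption.
BANDS = [
    (0.05, "A", (750, 850)),
    (0.15, "B", (650, 749)),
    (0.35, "C", (550, 649)),
    (0.60, "D", (450, 549)),
    (1.00, "E", (300, 449)),
]


def grade_for(pd):
    """Map a predicted probability of default to (grade, score range)."""
    if not 0.0 <= pd <= 1.0:
        raise ValueError("probability of default must be in [0, 1]")
    for upper, grade, score_range in BANDS:
        if pd <= upper:
            return grade, score_range
    # Unreachable for valid input because the last bound is 1.0.
    raise AssertionError("no band matched")


for p in (0.03, 0.20, 0.60, 0.61):
    print(p, grade_for(p))
```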

11. What can be improved

• More features in the dataset: overdue history, credit history, employment information, etc.
• More detailed financial analysis of the cut-off criteria

12. Key Takeaways

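• Gradient boosting models showed the best performance (AUC 0.8656 to 0.8736 vs 0.8143 for logistic regression)
• Business-driven cut-off differs from KS cut-off (ensemble: 0.399 by profit vs 0.300 by KS)
• The model is applicable to microfinance organizations (MFOs) in Kyrgyzstan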
• Solution can be scaled and deployed

13. Thank You
