Aim: Credit risk classification is a critical aspect of financial decision-making, influencing loan approvals and risk management strategies. Traditional credit scoring models rely on statistical techniques, which often fail to capture complex patterns in financial data. This study presents a machine learning-based approach for credit risk classification using the German Credit Dataset. By applying feature selection techniques, we aim to enhance predictive accuracy and computational efficiency, addressing the challenges posed by high-dimensional data.
Methods: The study employs Decision Tree and Naïve Bayes classifiers to evaluate creditworthiness. Feature selection based on ANOVA and Chi-Square tests was used to remove attributes that are only weakly associated with the credit risk label. The dataset, comprising 1,000 records and 21 attributes, was preprocessed to improve model efficiency. An 80-20 train-test split and 5-fold cross-validation were used to obtain a robust estimate of performance.
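The evaluation protocol described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual code: it uses scikit-learn, a synthetic stand-in for the German Credit Dataset (so the accuracies it produces are not the reported ones), 20 predictors of which 16 are kept, and ANOVA scoring only. For non-negative encoded categorical attributes, `chi2` could be passed to `SelectKBest` in place of `f_classif`. The helper name `evaluate` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the German Credit data
# (1,000 records, 20 predictors, roughly 70/30 class balance).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)

# 80-20 train-test split, stratified on the class label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def evaluate(model, k):
    """Return (mean 5-fold CV accuracy, held-out test accuracy).

    k is the number of features kept by the ANOVA F-test;
    k="all" keeps every feature (the no-selection baseline).
    """
    # Selection sits inside the pipeline so it is refit on each CV fold.
    pipe = make_pipeline(SelectKBest(f_classif, k=k), model)
    cv_acc = cross_val_score(
        pipe, X_train, y_train, cv=5, scoring="accuracy"
    ).mean()
    test_acc = pipe.fit(X_train, y_train).score(X_test, y_test)
    return cv_acc, test_acc


results = {}
for name, make_model in {
    "decision_tree": lambda: DecisionTreeClassifier(random_state=0),
    "naive_bayes": lambda: GaussianNB(),
}.items():
    # ASSUMPTION: dropping four attributes leaves 16 of 20 predictors.
    for k in ("all", 16):
        results[(name, k)] = evaluate(make_model(), k)

for (name, k), (cv_acc, test_acc) in results.items():
    print(f"{name:14s} k={k!s:>3}  cv={cv_acc:.3f}  test={test_acc:.3f}")
```

Placing the selector inside the pipeline means the features are chosen from the training folds only, which avoids leaking information from the validation folds into the selection step.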
Results: Feature selection improved the accuracy of both classifiers, though unevenly. The Decision Tree classifier's accuracy rose from 70% to 74% after removing four weakly associated attributes, while the Naïve Bayes classifier showed only a modest increase from 63% to 64%. Evaluation metrics, including precision, recall, and F1-score, confirmed that refining the set of predictors enhances classification reliability. Comparative analysis highlights the Decision Tree's robustness in handling feature dependencies.
Conclusion: This study demonstrates that feature selection enhances the predictive accuracy of credit risk classification models. The Decision Tree classifier benefited the most from dimensionality reduction, while Naïve Bayes showed limited improvement due to its assumption of feature independence. Future research should explore ensemble methods, such as Random Forest and XGBoost, to further enhance credit risk assessment models. Addressing dataset imbalance and integrating fairness-aware algorithms could contribute to more equitable financial decision-making.
Key words: Credit Risk, Machine Learning, Naïve Bayes, Decision Trees