Credit Card Approval Prediction: A comparative analysis between Logistic Regression, KNN, Decision Trees, Random Forest, XGBoost
2024 (English) Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]
Background: In recent years, the financial industry has experienced significant growth in credit card applications. The process of evaluating these applications is critical as it determines the creditworthiness of applicants, affecting both the financial stability of the lending institutions and the economic opportunities for individuals. Traditional methods often fall short in effectively handling the complexity and volume of modern credit data, suggesting a need for more sophisticated analytical tools.
Objectives: The primary goal is to assess the impact of Gaussian noise on the robustness and performance of various machine learning models in predicting credit card approval. This study aims to identify which model maintains the highest accuracy and generalization capability in the presence of noise-induced complexities.
Method: The study utilized a dataset of credit card applications, augmented with Gaussian noise to model real-world uncertainties. Key machine learning models were evaluated: Logistic Regression, K-Nearest Neighbors (KNN), Decision Trees, Random Forest, and XGBoost. The data were processed through stages of normalization, class-imbalance handling, and categorical feature encoding, so that they are fit for robust analysis.
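The normalization and noise-augmentation steps named in the Method paragraph can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the min-max scaling choice, the noise level `sigma=0.05`, and the sample values are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def min_max_normalize(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]


def add_gaussian_noise(column, sigma, seed=None):
    """Return a copy of `column` with zero-mean Gaussian noise (std `sigma`) added."""
    rng = random.Random(seed)  # local generator keeps the result reproducible
    return [x + rng.gauss(0.0, sigma) for x in column]


# Hypothetical numeric feature (for example, annual income); values are made up.
income = [20000.0, 35000.0, 50000.0, 120000.0]
normalized = min_max_normalize(income)

# Assumed noise level; the abstract does not state the variance that was used.
noisy = add_gaussian_noise(normalized, sigma=0.05, seed=0)
```

Normalizing before adding noise keeps `sigma` on the same scale for every feature, which is one plausible design choice; the abstract does not say in which order the steps were applied.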
Results: The addition of Gaussian noise showed varying impacts across different models. The Random Forest and XGBoost models exhibited superior resilience, maintaining higher accuracy, precision, recall, and F1-scores compared to other models under noise conditions. Specifically, the Random Forest model achieved the highest accuracy of 53.79% and 52.80% on unmodified and modified datasets, respectively.
Conclusion: The findings suggest that integrating Gaussian noise into the dataset can effectively enhance the robustness of credit scoring models against real-world data variations. Random Forest and XGBoost were identified as the most effective models, offering a promising approach for financial institutions aiming to improve their credit decision processes.
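For reference, the four metrics reported in the Results paragraph can be computed from binary predictions as in the sketch below. The function name and the label vectors are hypothetical and are not taken from the thesis; label 1 is assumed to mean "approved".

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Running the same function on predictions from the unmodified and the noise-augmented test sets would give the kind of side-by-side comparison the Results paragraph describes.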
Place, publisher, year, edition, pages
2024. p. 39
Keywords [en]
Machine Learning, Credit Card, Approval Prediction, Prediction Model, Logistic Regression, KNN, Decision Trees, Random Forest, XGBoost
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:bth-26717
OAI: oai:DiVA.org:bth-26717
DiVA, id: diva2:1883598
Subject / course
DV1478 Bachelor Thesis in Computer Science
Educational program
DVGDT Bachelor Qualification Plan in Computer Science 60.0 hp
Supervisors
Examiners
2024-08-06 2024-07-10 2025-09-30 Bibliographically approved