Performance Analysis of Ensemble Learning and Feature Selection Methods in Loan Approval Prediction at Banks

Authors

  • Iqbal Muhammad, Universitas Bina Sarana Informatika
  • Rizka Dahlia, Universitas Bina Sarana Informatika
  • Muhammad Ifan Rifani Ihsan, Universitas Nusa Mandiri
  • Lisnawanty, Universitas Bina Sarana Informatika
  • Rabiatus Sa’adah, Universitas Bina Sarana Informatika

DOI:

https://doi.org/10.59934/jaiea.v3i2.426

Keywords:

Ensemble learning, Feature selection, Prediction, Loan Approval

Abstract

A bank loan application is assessed against applicant data and credit scores to determine whether the borrower is eligible for the loan. Machine learning can support this assessment and so reduce the risk the bank takes on. This research aims to obtain the best accuracy on the Loan Approval Prediction dataset, an open dataset published on Kaggle. The dataset has 14 features and 4,269 records. After analysis, 4,173 records were usable (2,599 approved and 1,574 rejected). A feature importance evaluation over the 14 features shows that loan amount is the most important feature, while bank asset value has the lowest influence. Several decision-tree ensemble models were tested on the dataset: Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Gradient Boosting, Random Forest, Adaptive Boosting (AdaBoost), and Extra Trees. The comparison shows that XGBoost is the best model, with accuracy 0.9974, AUC 0.9998, recall 0.9963, precision 0.9969, and F1 0.9966.
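
As a reading aid for the figures in the abstract, the sketch below shows how accuracy, precision, recall, and F1 follow from binary confusion-matrix counts, and what a majority-class baseline would score on the reported class split (2,599 approved, 1,574 rejected). The confusion-matrix counts, the `ConfusionCounts` class, and the `classification_metrics` function are hypothetical and chosen only for illustration; they are not taken from the paper, and AUC is omitted because it needs per-record scores rather than counts.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, treating 'approved' as the positive class."""
    tp: int  # approved loans predicted as approved
    fp: int  # rejected loans predicted as approved
    fn: int  # approved loans predicted as rejected
    tn: int  # rejected loans predicted as rejected


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only; NOT results from the paper.
counts = ConfusionCounts(tp=90, fp=5, fn=10, tn=95)
metrics = classification_metrics(counts)

# Record counts as stated in the abstract.
total_records = 4269
usable_records = 4173
approved, rejected = 2599, 1574

# Records dropped during dataset analysis.
removed_records = total_records - usable_records

# Accuracy of always predicting the majority class ("approved").
majority_baseline = max(approved, rejected) / (approved + rejected)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
    print(f"removed records: {removed_records}")
    print(f"majority-class baseline accuracy: {majority_baseline:.4f}")
```

The baseline matters because the classes are unequal: a model that always answers "approved" is already right for roughly 62% of the usable records, so the reported accuracy is best read against that floor and alongside precision and recall.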

Published

2024-02-15

How to Cite

Muhammad, I., Dahlia, R., Muhammad Ifan Rifani Ihsan, Lisnawanty, & Rabiatus Sa’adah. (2024). Performance Analysis of Ensemble Learning and Feature Selection Methods in Loan Approval Prediction at Banks. Journal of Artificial Intelligence and Engineering Applications (JAIEA), 3(2), 557–564. https://doi.org/10.59934/jaiea.v3i2.426