IMPLEMENTATION OF SMOTE AND RANDOM OVER-SAMPLING METHODS IN MACHINE LEARNING ALGORITHMS FOR CUSTOMER CHURN PREDICTION IN THE BANKING SECTOR
Abstract
Anticipating which customers will leave is a challenge in the competitive banking industry, where retaining customers is more efficient than attracting new ones. The purpose of this study is to improve the effectiveness of churn prediction by addressing class imbalance with SMOTE (Synthetic Minority Over-sampling Technique) and Random Over-sampling. The data set comprises 10,000 bank customer records with 12 attributes, including a churn indicator as the target. The machine learning algorithms used are Random Forest and Naive Bayes, evaluated on accuracy, precision, recall, and F1 score. The experiments showed that the highest accuracy, 87.13%, was achieved by the Random Forest algorithm without oversampling, but its effectiveness in detecting churning customers was somewhat limited. Applying SMOTE and Random Over-sampling improved the model's ability to identify churn patterns, although accuracy fell to 86.20% with Random Over-sampling and 81.47% with SMOTE. The Naive Bayes algorithm reached its highest accuracy, 79.20%, without oversampling, although it was likewise somewhat weak at handling churn. The study underscores the importance of oversampling methods for improving prediction balance on the minority class, which conventional models often overlook. The results are intended to serve as a guide for developing more current and efficient strategies for maintaining customer trust.
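The two oversampling ideas and the four evaluation metrics named in the abstract can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`random_oversample`, `smote_sketch`, `scores`), the toy data, and the choice of `k` are all assumptions made for the example, the SMOTE variant is simplified and binary-only, and no classifier is trained. It also shows why accuracy alone can hide poor churn detection, using a hypothetical baseline that never predicts churn.

```python
import random
from collections import Counter

def _minority_rows_and_deficit(X, y, minority):
    """Return the minority rows and how many more are needed for a 1:1 balance."""
    rows = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(y) - len(rows)) - len(rows)
    return rows, max(deficit, 0)

def random_oversample(X, y, minority=1, seed=0):
    """Random over-sampling: duplicate randomly chosen minority rows."""
    rng = random.Random(seed)
    rows, deficit = _minority_rows_and_deficit(X, y, minority)
    extra = [list(rng.choice(rows)) for _ in range(deficit)]
    return X + extra, y + [minority] * deficit

def smote_sketch(X, y, minority=1, k=2, seed=0):
    """Simplified SMOTE: interpolate between a minority row and one of its
    k nearest minority neighbours (needs at least two minority rows)."""
    rng = random.Random(seed)
    rows, deficit = _minority_rows_and_deficit(X, y, minority)
    synthetic = []
    for _ in range(deficit):
        base = rng.choice(rows)
        # Nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (r for r in rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return X + synthetic, y + [minority] * deficit

def scores(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive (churn) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical toy data: two numeric features, 8 retained (0), 2 churned (1).
X = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.7],
     [1.3, 1.0], [1.0, 0.8], [0.7, 1.2], [3.0, 3.0], [3.4, 2.8]]
y = [0] * 8 + [1] * 2

X_ros, y_ros = random_oversample(X, y)
X_sm, y_sm = smote_sketch(X, y)
print("original:", Counter(y))
print("random over-sampling:", Counter(y_ros))
print("SMOTE sketch:", Counter(y_sm))

# A baseline that never predicts churn looks accurate but finds no churners.
baseline = scores(y, [0] * len(y))
print("always-retained baseline:", baseline)
```

In practice, oversampling is usually applied only to the training split so that the reported metrics still reflect the original class distribution; the sketch leaves the train/test split out for brevity.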