Optimization of the K-Nearest Neighbors (KNN) Algorithm in Imbalanced Dataset Classification Using the SMOTE Technique
DOI:
https://doi.org/10.59934/jaiea.v4i2.756
Keywords:
KNN, SMOTE, sentiment analysis, Twitter, naturalized players
Abstract
The naturalization of players for Indonesia's national football team has sparked diverse reactions on Twitter, ranging from support to opposition, which makes public opinion on the policy difficult to interpret through sentiment analysis. A significant challenge is class imbalance: neutral tweets outnumber positive and negative ones. This research investigates the effect of class imbalance on sentiment classification accuracy by combining the KNN algorithm with the SMOTE oversampling technique. The study takes a quantitative, experimental approach that follows the stages of the KDD process. KNN without SMOTE achieved an accuracy of 54.77%, with a precision of 0.65, recall of 0.57, and F1-score of 0.44. Applying SMOTE before KNN raised accuracy to 81.49%, with a precision of 0.87, recall of 0.80, and F1-score of 0.80. These results indicate that oversampling with SMOTE is effective at mitigating class imbalance and improving classification performance in sentiment analysis, especially for underrepresented classes.
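The idea behind combining SMOTE with KNN can be illustrated with a minimal sketch. This is not the authors' pipeline: the 2-D points below are made-up stand-ins for tweet feature vectors, the function names (`smote`, `knn_predict`) and all parameter values are assumptions chosen for illustration, and only the Python standard library is used. SMOTE is reduced here to its core step, interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, candidates, k):
    """Return up to k candidates closest to point."""
    return sorted(candidates, key=lambda c: euclidean(point, c))[:k]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbour = rng.choice(nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: euclidean(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: "neutral" is the majority class, "positive" the minority.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.2)]
query = (1.0, 1.1)  # sits in the middle of the minority cluster

# Without oversampling: the 7 nearest neighbours hold 3 positive, 4 neutral votes.
train_x = majority + minority
train_y = ["neutral"] * len(majority) + ["positive"] * len(minority)
before = knn_predict(train_x, train_y, query, k=7)

# With oversampling: add synthetic minority points until the classes are balanced.
synthetic = smote(minority, n_new=len(majority) - len(minority))
balanced_x = train_x + synthetic
balanced_y = train_y + ["positive"] * len(synthetic)
after = knn_predict(balanced_x, balanced_y, query, k=7)

print("without SMOTE:", before)
print("with SMOTE:", after)
```

On this toy data the prediction for the query point changes from `neutral` to `positive` once the minority class is oversampled, because the majority class can no longer outvote it simply by being more numerous. In a real experiment, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.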
License
Copyright (c) 2025 Journal of Artificial Intelligence and Engineering Applications (JAIEA)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.