Optimizing Naïve Bayes Algorithm Through Principal Component Analysis To Improve Dengue Fever Patient Classification Model
DOI:
https://doi.org/10.59934/jaiea.v4i2.798
Keywords:
Naïve Bayes, Principal Component Analysis, Classification, Dengue Fever
Abstract
Dengue fever is an infectious disease with a significant public health impact in tropical regions, including Indonesia. Early detection and accurate classification of dengue hemorrhagic fever (DHF) patients are essential to reduce severity and mortality, so methods that improve diagnostic accuracy are needed. Principal Component Analysis (PCA) and Naïve Bayes (NB) are two techniques commonly used in medical data analysis: PCA reduces the dimensionality of the data to lower its complexity, while Naïve Bayes classifies data based on probability. This study aims to optimize the combined use of PCA and Naïve Bayes to improve the accuracy of a dengue patient classification model. The method involves processing a medical dataset of dengue patients that contains various clinically relevant attributes. PCA is first applied to reduce dimensionality and identify the key features that affect classification; Naïve Bayes is then applied to classify the data based on the resulting features. The study compares the performance of models that combine PCA with Naïve Bayes against models that use Naïve Bayes alone, without dimensionality reduction. The results show that applying PCA yields higher accuracy than Naïve Bayes alone, and that the combined model is more efficient at identifying patients at risk of DHF. The application of PCA and Naïve Bayes to the classification of DHF patients may therefore be a useful tool for assisting the medical diagnosis process, which in turn could reduce misdiagnosis and improve patient recovery rates.
This research contributes to the development of artificial intelligence technology in the medical field, particularly for improving the accuracy of dengue diagnosis, and serves as a basis for further research on machine learning techniques in healthcare. The study analyzes the performance of the Naïve Bayes algorithm in classifying dengue fever patient data by comparing models with and without PCA as a dimensionality reduction method. The Naïve Bayes model without PCA achieves an accuracy of 49.96%, which is close to the random-guess rate and indicates that the model is not effective at recognizing patterns in the data. With PCA applied, accuracy rises to 50.03%, a gain of 0.07 percentage points.
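The comparison described in the abstract (Naïve Bayes alone versus PCA followed by Naïve Bayes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the dataset is not available here, so the data are synthetic, a Gaussian variant of Naïve Bayes is assumed, and every function name (`fit_pca`, `fit_gaussian_nb`, `make_synthetic`, and so on) is hypothetical. Only NumPy is used.

```python
import numpy as np

def fit_pca(X, n_components):
    """Learn the PCA mean and top principal directions from training data."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def apply_pca(X, mean, components):
    """Project data onto the principal directions learned on the training set."""
    return (X - mean) @ components.T

def fit_gaussian_nb(X, y, var_smoothing=1e-9):
    """Estimate per-class feature means, variances, and class priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes])
    variances = variances + var_smoothing * X.var(axis=0).max()  # avoid zero variance
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, variances, priors

def predict_gaussian_nb(model, X):
    """Pick the class maximizing log prior + sum of per-feature Gaussian log-likelihoods."""
    classes, means, variances, priors = model
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None] + diff**2 / variances[None]).sum(axis=2)
    return classes[np.argmax(np.log(priors)[None] + log_lik, axis=1)]

def accuracy(y_true, y_pred):
    return float((y_true == y_pred).mean())

def make_synthetic(n=600, n_features=10, seed=0):
    """Synthetic stand-in for the patient data (assumption, not the real dataset)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    latent = rng.normal(size=(n, 2)) + 2.0 * y[:, None]  # class signal lives in 2 latent dims
    mixing = rng.normal(size=(2, n_features))
    X = latent @ mixing + 0.5 * rng.normal(size=(n, n_features))
    return X, y

X, y = make_synthetic()
split = 400
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

# Baseline: Naïve Bayes on the original features.
nb_only = fit_gaussian_nb(X_train, y_train)
acc_nb = accuracy(y_test, predict_gaussian_nb(nb_only, X_test))

# PCA fitted on the training split only, then Naïve Bayes on the projected features.
mean, comps = fit_pca(X_train, n_components=2)
nb_pca = fit_gaussian_nb(apply_pca(X_train, mean, comps), y_train)
acc_pca = accuracy(y_test, predict_gaussian_nb(nb_pca, apply_pca(X_test, mean, comps)))

print(f"Naive Bayes only: {acc_nb:.4f}")
print(f"PCA + Naive Bayes: {acc_pca:.4f}")
```

Fitting PCA on the training split and reusing the same mean and components for the test split keeps test information out of the projection. The number of retained components (2 here) is an arbitrary choice for the sketch; in practice it would be selected from the explained variance or by cross-validation.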
License
Copyright (c) 2025 Journal of Artificial Intelligence and Engineering Applications (JAIEA)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.