Integration of the Naive Bayes Algorithm in Website-Based Detection of Hoaxes Related to Nutritious Food Health Information

Khairul Reza Bakara; Didi Febrian; Yulita Molliq Rangkuti; Zulfahmi Indra; Kana Saputra S

doi:10.59934/jaiea.v5i1.1794

Authors

Khairul Reza Bakara, Universitas Negeri Medan
Didi Febrian, Universitas Negeri Medan
Yulita Molliq Rangkuti, Universitas Negeri Medan
Zulfahmi Indra, Universitas Negeri Medan
Kana Saputra S, Universitas Negeri Medan

Keywords:

Naive Bayes, TF-IDF, Hoax Detection, Health News, Text Classification, Website

Abstract

The growing number of health-related hoaxes, especially those concerning nutritious food, calls for an intelligent solution that detects them automatically. This study integrates the Multinomial Naive Bayes algorithm into a website-based hoax detection system focused on online health information about nutritious food. A quantitative method was employed, combining the Multinomial Naive Bayes algorithm with Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. The dataset consists of 1,000 Indonesian-language news articles collected from five platforms: TurnBackHoax for hoaxes, and Detik, Kompas, Tempo, and the Ministry of Health for valid news. The data were divided into 800 training samples and 200 testing samples. The model achieves a weighted-average Precision of 0.9717, Recall of 0.9712, and F1-Score of 0.9712, where the weighted average accounts for the number of instances in each class. Overall accuracy, the proportion of correctly classified articles, is 0.97125. These findings indicate that the system can identify distinctive linguistic patterns that differentiate hoaxes from valid information, and suggest that probabilistic techniques such as Naive Bayes are well suited to text-based detection of false information, particularly health information about nutritious food.
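The combination of TF-IDF weighting and Multinomial Naive Bayes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, the Laplace smoothing value `alpha=1.0`, and the four English toy sentences are all assumptions made for illustration, and the study's Indonesian dataset and preprocessing are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace only.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every training term."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one document; terms unseen in training are ignored."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    class_docs = Counter(labels)
    weight = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        weight[y].update(tfidf(doc, idf))  # accumulate per-class term weights
    vocab_size = len(idf)
    model = {}
    for y in class_docs:
        total = sum(weight[y].values())
        log_prior = math.log(class_docs[y] / len(docs))
        log_like = {
            t: math.log((weight[y][t] + alpha) / (total + alpha * vocab_size))
            for t in idf
        }
        model[y] = (log_prior, log_like)
    return model


def predict(doc, model, idf):
    """Return the class with the highest log posterior score."""
    vec = tfidf(doc, idf)
    scores = {
        y: prior + sum(w * like[t] for t, w in vec.items())
        for y, (prior, like) in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch only.
train_docs = [
    "miracle fruit cures diabetes instantly share now",
    "secret vegetable juice cures cancer doctors hide it",
    "ministry of health recommends balanced meals with vegetables",
    "nutritionists advise eating fruit and vegetables daily",
]
train_labels = ["hoax", "hoax", "valid", "valid"]

idf = fit_idf(train_docs)
model = train_nb(train_docs, train_labels, idf)

print(predict("miracle juice cures cancer share now", model, idf))
print(predict("health ministry recommends eating vegetables daily", model, idf))
```

On this toy data the first query shares vocabulary only with the hoax examples and the second only with the valid examples, so the two classes are separated by their term weights alone. A real system would add the preprocessing, the 800/200 train-test split, and the weighted-average evaluation reported above.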

