The Impact of Principal Component Analysis Dimensionality Reduction on Sentiment Classification Performance Using Support Vector Machine

Authors

  • Azzahra Moudy Fajria, STMIK IKMI Cirebon
  • Ahmad Faqih, STMIK IKMI Cirebon
  • Gifthera Dwilestari, STMIK IKMI Cirebon

DOI:

https://doi.org/10.59934/jaiea.v4i2.744

Keywords:

Principal Component Analysis, Support Vector Machine, Sentiment Analysis, Dimension Reduction

Abstract

This study investigates whether Principal Component Analysis (PCA) improves sentiment classification with the Support Vector Machine (SVM) algorithm. User reviews of the ChatGPT application were collected from the Play Store, preprocessed, and classified by sentiment (positive, negative, or neutral). The research follows the Knowledge Discovery in Databases (KDD) framework: it begins with data selection, preprocessing, and transformation, then applies PCA for dimensionality reduction. PCA was used to reduce the complexity of the high-dimensional text data so that the SVM could classify sentiment more efficiently. With PCA, accuracy rose from 72.65% to 73.20%, precision from 71.58% to 72.24%, recall from 71.77% to 72.66%, and F1-score from 71.56% to 72.32%. Although these gains are modest (less than one percentage point on every metric), the findings indicate that PCA simplifies a complex dataset and improves SVM performance in sentiment classification, which is useful when processing high-dimensional text data.
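
The comparison described in the abstract (SVM alone versus SVM after PCA) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name a library, a text representation, a kernel, or a number of components, so scikit-learn, TF-IDF features, a linear kernel, and 5 components are all assumed here, and the reviews are hypothetical toy data rather than the study's Play Store dataset.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC

# Hypothetical toy reviews; the study's dataset is not reproduced here.
reviews = [
    "great app, very helpful answers",
    "love it, fast and accurate",
    "helpful and easy to use",
    "terrible, keeps crashing",
    "bad answers and slow responses",
    "slow, buggy, and unhelpful",
    "it is okay, nothing special",
    "average app, works sometimes",
    "okay for simple questions",
]
labels = ["positive"] * 3 + ["negative"] * 3 + ["neutral"] * 3


def to_dense(X):
    """Convert the sparse TF-IDF matrix to a dense array for PCA."""
    return X.toarray()


def build_model(n_components=None):
    """Build TF-IDF -> (optional PCA) -> SVM; TF-IDF and the kernel are assumptions."""
    steps = [TfidfVectorizer()]
    if n_components is not None:
        steps.append(FunctionTransformer(to_dense, accept_sparse=True))
        steps.append(PCA(n_components=n_components, random_state=0))
    steps.append(SVC(kernel="linear"))
    return make_pipeline(*steps)


baseline = build_model().fit(reviews, labels)               # SVM on full TF-IDF features
reduced = build_model(n_components=5).fit(reviews, labels)  # SVM on 5 principal components

n_tfidf_features = len(baseline.named_steps["tfidfvectorizer"].vocabulary_)
n_pca_features = reduced.named_steps["pca"].n_components_
predictions = reduced.predict(["helpful and fast", "slow and buggy"])

print(f"features: {n_tfidf_features} -> {n_pca_features}")
print("predictions:", list(predictions))
```

On a real dataset, both models would be scored on a held-out test split (for example with accuracy, precision, recall, and F1-score, as in the abstract) instead of being fitted and queried on a handful of sentences.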

Published

2025-02-15

How to Cite

Fajria, A. M., Faqih, A., & Dwilestari, G. (2025). The Impact of Principal Component Analysis Dimensionality Reduction on Sentiment Classification Performance Using Support Vector Machine. Journal of Artificial Intelligence and Engineering Applications (JAIEA), 4(2), 764–770. https://doi.org/10.59934/jaiea.v4i2.744