Exploring Job Vacancy Topics and Trends in Indonesia Using Latent Dirichlet Allocation (LDA) and Exploratory Data Analysis (EDA)

Authors

  • Yoga Prasetyo Wibowo, Universitas Bina Sarana Informatika

DOI:

https://doi.org/10.59934/jaiea.v5i1.1715

Keywords:

Exploratory Data Analysis, Latent Dirichlet Allocation, Job Vacancies, Jobstreet, Job Trends

Abstract

The Indonesian labor market is evolving rapidly, and the volume and variety of online job postings make it hard to see patterns and trends in vacancies without systematic analysis. This study identifies dominant topics and vacancy trends to give job seekers, companies, and other stakeholders a basis for targeted recruitment strategies. Data were collected by web scraping the Jobstreet website and cover company, location, position, classification, subclassification, salary, and posting time. Two methods were applied: Exploratory Data Analysis (EDA) to describe vacancy distributions, and Latent Dirichlet Allocation (LDA) to extract thematic structure from the text data. The EDA results show that Sales Executive is the most in-demand position, Greater Jakarta is the location with the most vacancies, and the most common salary range is IDR 4–5 million. LDA, validated using the Coherence Score, identified two distinct topics: (1) Sales, Marketing, Supply Chain, and Retail, and (2) Customer Service and Administration. The study's contribution is to combine descriptive and text-mining techniques on a recent job vacancy dataset obtained directly through web scraping, producing an analysis that both describes vacancy distribution and reveals latent topics reflecting Indonesian labor market trends.
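To illustrate how a coherence score can separate a well-formed topic from a mixed one, the following is a minimal sketch and not the study's implementation. The abstract does not say which coherence measure was used; this sketch assumes the UMass variant, averaged over word pairs. The function name `umass_coherence`, the tokenized documents, and the two candidate word lists are all hypothetical and chosen only for illustration.

```python
import math

def umass_coherence(top_words, documents):
    """Average UMass-style coherence for one topic (assumed measure).

    For each pair (w_i, w_j) with w_j ranked before w_i, the score is
    log((D(w_i, w_j) + 1) / D(w_j)), where D counts the documents that
    contain all the given words. Values closer to zero mean the topic's
    top words tend to appear together.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for i in range(1, len(top_words)):
        for j in range(i):
            denom = doc_freq(top_words[j])
            if denom == 0:
                continue  # skip words that never occur in the corpus
            co = doc_freq(top_words[i], top_words[j])
            scores.append(math.log((co + 1) / denom))
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical tokenized vacancy texts (toy data, not from the study).
docs = [
    ["sales", "executive", "retail"],
    ["sales", "marketing", "executive"],
    ["marketing", "retail", "sales"],
    ["customer", "service", "admin"],
    ["admin", "customer", "service"],
    ["service", "admin", "staff"],
]

# Two hypothetical candidate topics, given as their top words.
coherent_topic = ["sales", "marketing", "retail"]
mixed_topic = ["sales", "service", "retail"]

coherent_score = umass_coherence(coherent_topic, docs)
mixed_score = umass_coherence(mixed_topic, docs)
print("coherent topic:", coherent_score)
print("mixed topic:", mixed_score)
```

In a model-selection setting, a score of this kind would typically be computed for LDA models fitted with different numbers of topics, and the topic count with the best average coherence would be kept. Whether the study followed exactly that procedure is not stated in the abstract.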

 

Published

2025-10-15

How to Cite

Yoga Prasetyo Wibowo. (2025). Exploring Job Vacancy Topics and Trends in Indonesia Using Latent Dirichlet Allocation (LDA) and Exploratory Data Analysis (EDA). Journal of Artificial Intelligence and Engineering Applications (JAIEA), 5(1), 1778–1785. https://doi.org/10.59934/jaiea.v5i1.1715