Predicting Student Academic Performance Based on Learning Habits Using XGBoost and SHAP
DOI:
https://doi.org/10.59934/jaiea.v5i2.1860
Keywords:
XGBoost; study habits; academic achievement prediction; SHAP; model evaluation
Abstract
This study developed a model for predicting student academic achievement from learning habits, using the XGBoost algorithm and SHAP interpretability techniques. The secondary dataset contains 1,000 entries and 16 variables (for example, hours of study per day, mental health, frequency of exercise, social media use, and hours of sleep). The data were pre-processed through cleaning, imputation, encoding, and normalization, then divided into training and test sets (80:20) and validated using 5-fold cross-validation (CV). Three models were tested: Linear Regression, Random Forest, and XGBoost. Evaluation using RMSE, MAE, and R² showed that XGBoost achieved RMSE = 0.335, MAE = 0.266, and R² = 0.882, while Linear Regression showed the best performance according to R² in certain configurations (R² = 0.888; RMSE = 0.326). SHAP analysis revealed that the most influential features were hours of study per day, mental health scores, exercise frequency, duration of social media use, and hours spent watching Netflix. The findings confirm that students' study habits and psychological conditions are the main determinants of variation in academic achievement; the interpretable features also make the model easier for education stakeholders to read. Research recommendations include testing the model on longitudinal datasets, integrating socioeconomic factors, and implementing data privacy procedures before institutional-scale deployment.
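The evaluation protocol described in the abstract (an 80:20 split, 5-fold CV on the training portion, and scoring with RMSE, MAE, and R²) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's implementation: the abstract does not state which tooling was used, and the function names, the shuffled split, the fixed seed, and the toy values are all assumptions made for the example.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def train_test_indices(n, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out `test_fraction` of them (assumed scheme)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists; each fold is validation exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, val


# 1,000 rows as in the abstract: 800 for training, 200 held out for testing.
train_idx, test_idx = train_test_indices(1000)
folds = list(k_fold_indices(train_idx, k=5))

# Toy targets and predictions (made-up values) to exercise the metrics.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(f"RMSE={rmse(y_true, y_pred):.3f}  MAE={mae(y_true, y_pred):.3f}  R2={r2(y_true, y_pred):.3f}")
```

In a setup like this, each candidate model would be fitted on the training part of every fold and scored on the validation part, with the 200 held-out rows used only for the final comparison. RMSE weights large errors more heavily than MAE, and R² is relative to a mean-only baseline, so the three metrics can rank closely matched models differently.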
License
Copyright (c) 2026 Journal of Artificial Intelligence and Engineering Applications (JAIEA)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.








