The Role of Study Habits, Parental Involvement, and School Environment in Predicting Student Achievement: A Machine Learning Perspective
DOI:
https://doi.org/10.60084/jeml.v3i2.350
Keywords:
Machine learning, Student achievement, Predictive analytics, Educational data mining
Abstract
This study applies machine learning techniques to predict student achievement from study habits, parental involvement, and school environment. Using a Kaggle dataset comprising academic, behavioral, and contextual variables, four machine learning algorithms, namely K-Nearest Neighbors (KNN), Naïve Bayes, Support Vector Machine (SVM), and Random Forest, were implemented and compared. Model performance was assessed using accuracy, precision, recall, F1-score, and ROC and Precision–Recall curves. Results show that all models effectively classified students into low- and high-achievement categories, with SVM achieving the highest accuracy (94.02%) and the strongest overall performance. The findings highlight the potential of machine learning-driven predictive analytics in educational settings, enabling early identification of at-risk students and supporting evidence-based interventions. By integrating diverse factors that influence academic performance, this study demonstrates how data-driven approaches can enhance educational management, inform policy, and promote equitable learning outcomes.
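The threshold-based metrics named in the abstract (accuracy, precision, recall, and F1-score) can all be derived from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch of that calculation in plain Python, not the authors' implementation. The function name `binary_metrics`, the toy label lists, and the choice of label 1 as the positive ("high achievement") class are assumptions made for illustration; the source does not state which class was treated as positive.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for labels in {0, 1}.

    Label 1 is treated as the positive class (an assumption for this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

ROC and Precision–Recall curves extend the same counts by sweeping the decision threshold over a model's scores instead of fixing a single set of predicted labels.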
License
Copyright (c) 2025 Teuku Rizky Noviandy, Maria Paristiowati, Illyas Md Isa, Rinaldi Idroes

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.




















