Machine Learning Approach for Diabetes Detection Using Fine-Tuned XGBoost Algorithm
DOI:
https://doi.org/10.60084/ijds.v1i1.72
Keywords:
Classification, Feature importance, Pima Indian, Hyperparameter tuning, Supervised learning
Abstract
Diabetes is a chronic condition characterized by elevated blood glucose levels, which lead to organ dysfunction and an increased risk of premature death. The global prevalence of diabetes has been rising, making accurate and timely diagnosis essential for effective management. Recent advances in machine learning have opened new possibilities for improving diabetes detection and management. In this study, we propose a fine-tuned XGBoost model for diabetes detection. We use the Pima Indian Diabetes dataset and tune the model's hyperparameters with random search. The fine-tuned XGBoost model is compared with six other popular machine learning models and achieves the highest accuracy, precision, sensitivity, and F1-score among them. These results demonstrate the potential of the fine-tuned XGBoost model as a robust and efficient tool for diabetes detection, and the insights of this study support medical diagnostics aimed at efficient and personalized management of diabetes.
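The two ingredients named in the abstract, random search over hyperparameters and the four reported metrics, can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The search space values are illustrative assumptions, and `score_fn` is a hypothetical placeholder for whatever validation score one would obtain by fitting an XGBoost classifier with the sampled settings. The keys `max_depth`, `learning_rate`, `n_estimators`, and `subsample` are real XGBoost parameter names, but the candidate values are not taken from the study.

```python
import random


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall), and F1 for binary labels (1 = diabetic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


def random_search(score_fn, space, n_iter=20, seed=0):
    """Sample n_iter random configurations from `space` and keep the best-scoring one.

    `score_fn` is a hypothetical callable: it receives a dict of hyperparameters
    and returns a score where higher is better (for example, a cross-validated
    accuracy of a model trained with those settings).
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative search space (assumed values, not those used in the study).
SEARCH_SPACE = {
    "max_depth": [3, 4, 5, 6, 8],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "n_estimators": [100, 200, 400],
    "subsample": [0.6, 0.8, 1.0],
}
```

In practice, `score_fn` would train a classifier on the training folds with the sampled settings and return its mean validation score, and `classification_metrics` would then be applied to the held-out test predictions of the best configuration. Unlike an exhaustive grid, random search evaluates only `n_iter` configurations regardless of how many values each hyperparameter can take.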
License
Copyright (c) 2023 Aga Maulana, Farassa Rani Faisal, Teuku Rizky Noviandy, Tatsa Rizkia, Ghazi Mauer Idroes, Trina Ekawati Tallei, Mohamed El-Shazly, Rinaldi Idroes

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.