Machine Learning Approach for Diabetes Detection Using Fine-Tuned XGBoost Algorithm

Authors

  • Aga Maulana, Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Farassa Rani Faisal, Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Teuku Rizky Noviandy, Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Tatsa Rizkia, General Practitioner, School of Medicine, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Ghazi Mauer Idroes, Department of Occupational Health and Safety, Faculty of Health Sciences, Universitas Abulyatama, Aceh Besar 23372, Indonesia
  • Trina Ekawati Tallei, Department of Biology, Faculty of Mathematics and Natural Sciences, Sam Ratulangi University, Manado 95115, North Sulawesi, Indonesia
  • Mohamed El-Shazly, Department of Pharmacognosy, Faculty of Pharmacy, Ain-Shams University, Cairo 11566, Egypt
  • Rinaldi Idroes, Department of Chemistry, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia

DOI:

https://doi.org/10.60084/ijds.v1i1.72

Keywords:

Classification, Feature importance, Pima Indian, Hyperparameter tuning, Supervised learning

Abstract

Diabetes is a chronic condition characterized by elevated blood glucose levels, which lead to organ dysfunction and an increased risk of premature death. The global prevalence of diabetes has been rising, making accurate and timely diagnosis essential for effective management. Recent advances in machine learning have opened new possibilities for improving diabetes detection and management. In this study, we propose a fine-tuned XGBoost model for diabetes detection. We use the Pima Indian Diabetes dataset and tune the model's hyperparameters with random search. The fine-tuned XGBoost model is compared with six other popular machine learning models and achieves the highest accuracy, precision, sensitivity, and F1-score among them. These results demonstrate the potential of the fine-tuned XGBoost model as a robust and efficient tool for diabetes detection, and they contribute to medical diagnostics aimed at efficient and personalized diabetes management.
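
For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are conventionally computed for a binary classifier, assuming label 1 means "diabetic" and label 0 means "non-diabetic". The names `ConfusionCounts`, `count_outcomes`, and `classification_metrics` are hypothetical helpers introduced for illustration, and the labels in the usage lines are made up; none of this is taken from the authors' code or results.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    tn: int  # predicted negative, actually negative
    fn: int  # predicted negative, actually positive


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally confusion-matrix cells, assuming 1 = positive and 0 = negative."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return accuracy, precision, sensitivity (recall), and F1-score.

    A metric whose denominator is zero is reported as 0.0 here; that is a
    convention chosen for this sketch, not something stated in the paper.
    """
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    sensitivity = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Illustrative labels only (not data from the study).
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 0, 1, 0, 1, 0, 0, 1]
example_metrics = classification_metrics(count_outcomes(example_true, example_pred))
```

In this made-up example there are 3 true positives, 1 false positive, 3 true negatives, and 1 false negative, so all four metrics equal 0.75. Sensitivity is the share of actual positive cases that the model identifies, which is why it is usually reported alongside precision in diagnostic settings.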

Published

2023-08-22

How to Cite

Maulana, A., Faisal, F. R., Noviandy, T. R., Rizkia, T., Idroes, G. M., Tallei, T. E., El-Shazly, M., & Idroes, R. (2023). Machine Learning Approach for Diabetes Detection Using Fine-Tuned XGBoost Algorithm. Infolitika Journal of Data Science, 1(1), 1–7. https://doi.org/10.60084/ijds.v1i1.72