Advanced Anemia Classification Using Comprehensive Hematological Profiles and Explainable Machine Learning Approaches
DOI:
https://doi.org/10.60084/ijds.v2i2.237
Keywords:
Hematological analysis, Data imbalance, Predictive algorithms, Clinical diagnostics, Health informatics
Abstract
Anemia is a common health issue with serious clinical effects, making timely and accurate diagnosis essential to prevent complications. This study explores the use of machine learning (ML) methods to classify anemia and its subtypes using detailed hematological data. Six ML models were tested: Gradient Boosting, Random Forest, Naive Bayes, Logistic Regression, Support Vector Machine, and K-Nearest Neighbors. The dataset was preprocessed using feature standardization and the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance. Gradient Boosting delivered the highest accuracy, sensitivity, and F1-score, establishing itself as the top-performing model. SHapley Additive exPlanations (SHAP) analysis was applied to enhance model interpretability, identifying key predictive features. This study highlights the potential of explainable ML to develop efficient, accurate, and scalable tools for anemia diagnosis, fostering improved healthcare outcomes globally.
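The abstract names SMOTE as the class-balancing step. As a rough illustration of the idea only, and not the authors' implementation, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The function name `smote`, its parameters, and the sample values are hypothetical; in practice a maintained implementation such as `imblearn.over_sampling.SMOTE` would normally be used, applied to training data only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature lists, all from one class.
    (Hypothetical helper for illustration; not the study's code.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Made-up standardized values for two features of a rare subtype (assumption).
rare_subtype = [[-1.2, -0.8], [-1.0, -1.1], [-1.4, -0.9], [-0.9, -0.7]]
new_points = smote(rare_subtype, n_new=6, k=3)
```

Because each synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies rather than duplicating existing rows.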
License
Copyright (c) 2024 Teuku Rizky Noviandy, Ghifari Maulana Idroes, Rivansyah Suhendra, Tedy Kurniawan Bakri, Rinaldi Idroes
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.