Predicting AXL Tyrosine Kinase Inhibitor Potency Using Machine Learning with Interpretable Insights for Cancer Drug Discovery

Teuku Rizky Noviandy; Ghifari Maulana Idroes; Essy Harnelly; Irma Sari; Fazlin Mohd Fauzi; Rinaldi Idroes

doi:10.60084/hjas.v3i1.270

Authors

Teuku Rizky Noviandy, Department of Information Systems, Faculty of Engineering, Universitas Abulyatama, Aceh Besar 23372, Indonesia
Ghifari Maulana Idroes, Department of Nuclear Engineering and Engineering Physics, Universitas Gadjah Mada, Yogyakarta 55281, Indonesia
Essy Harnelly, Department of Biology, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
Irma Sari, Department of Pharmacy, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
Fazlin Mohd Fauzi, Faculty of Pharmacy, Universiti Teknologi MARA Selangor, Puncak Alam Campus, 42300 Bandar Puncak Alam, Selangor, Malaysia
Rinaldi Idroes, Department of Pharmacy, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia; School of Mathematics and Applied Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia

Keywords:

AXL tyrosine kinase, Machine learning, Drug discovery, SHAP analysis, Cancer therapeutics

Abstract

AXL tyrosine kinase plays a critical role in cancer progression, metastasis, and therapy resistance, making it a promising therapeutic target. However, traditional drug discovery methods for developing AXL inhibitors are resource-intensive and time-consuming, and they often provide little insight into the molecular determinants of potency. To address this gap, we applied machine learning techniques, including Random Forest, Gradient Boosting, Support Vector Regression, and Decision Tree models, to predict the potency (pIC₅₀) of AXL inhibitors using a dataset of 972 compounds described by 550 molecular descriptors. The Random Forest model outperformed the others, with an R² of 0.703, a mean absolute error (MAE) of 0.553, a root mean square error (RMSE) of 0.720, and a Pearson correlation coefficient (PCC) of 0.841. SHAP analysis identified molecular features such as RNCG and TopoPSA(NO) as key contributors to inhibitor potency, providing interpretable insights into structure-activity relationships. These findings highlight the potential of machine learning to accelerate the identification and optimization of AXL inhibitors, linking computational predictions to rational drug design for cancer therapeutics.
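The workflow summarized in the abstract (train several regressors on descriptor vectors, then compare R², MAE, RMSE, and PCC) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the data are synthetic stand-ins generated with scikit-learn, the 80/20 split and default hyperparameters are assumptions, and the helper names `ic50_nm_to_pic50`, `regression_metrics`, and `compare_models` are hypothetical. The conversion helper assumes IC₅₀ values reported in nanomolar.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def ic50_nm_to_pic50(ic50_nm):
    """pIC50 = -log10(IC50 in molar); with IC50 in nM this is 9 - log10(IC50)."""
    return 9.0 - np.log10(np.asarray(ic50_nm, dtype=float))


def regression_metrics(y_true, y_pred):
    """Compute the four metrics named in the abstract."""
    return {
        "R2": float(r2_score(y_true, y_pred)),
        "MAE": float(mean_absolute_error(y_true, y_pred)),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "PCC": float(np.corrcoef(y_true, y_pred)[0, 1]),
    }


def compare_models(X, y, seed=0):
    """Fit four regressors on one train/test split and score each on the test set."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed  # split ratio is an assumption
    )
    models = {
        "Random Forest": RandomForestRegressor(random_state=seed),
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
        "Support Vector Regression": SVR(),
        "Decision Tree": DecisionTreeRegressor(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        results[name] = regression_metrics(y_te, model.predict(X_te))
    return results


# Synthetic stand-in for a descriptor matrix and pIC50 targets (not real data).
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)
y = 6.5 + (y - y.mean()) / y.std()  # rescale to a pIC50-like range

results = compare_models(X, y)
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On real data, the descriptor matrix would replace `X` and the measured pIC₅₀ values would replace `y`; the numbers printed for the synthetic data have no bearing on the results reported above.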
