Backpropagation Neural Network-Based Prediction of Kovats Retention Index for Essential Oil Compounds


  • Aulia Al-Jihad Safhadi Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Teuku Rizky Noviandy Interdisciplinary Innovation Research Unit, Graha Primera Saintifika, Aceh Besar 23771, Indonesia
  • Irvanizam Irvanizam Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Rivansyah Suhendra Department of Information Technology, Faculty of Engineering, Universitas Teuku Umar, Aceh
  • Taufiq Karma Department of Occupational Health and Safety, Faculty of Health Sciences, Universitas Abulyatama, Aceh Besar 23372, Indonesia
  • Rinaldi Idroes Department of Chemistry, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia



Keywords: ANN, Machine learning, Supervised learning, Gas chromatography


Identifying the chemical compounds in essential oils is crucial in industries such as pharmaceuticals, perfumery, and food. Kovats Retention Index (RI) values are essential for compound identification using gas chromatography-mass spectrometry (GC-MS). Traditional RI determination methods are time-consuming, labor-intensive, and susceptible to experimental variability. Recent advances in data science suggest that artificial intelligence (AI) can improve the accuracy and efficiency of RI prediction. However, the potential of AI, particularly artificial neural networks (ANN), for predicting RI values remains underexplored. This study develops a backpropagation neural network (BPNN) model to predict the Kovats RI values of essential oil compounds from five molecular descriptors: ATSc1, VCH-7, SP-1, Kier1, and MLogP. We trained the BPNN on a dataset of 340 essential oil compounds and optimized it through hyperparameter tuning. The optimized BPNN, trained for 100 epochs with a learning rate of 0.1, a hidden layer of 10 neurons, and the ReLU activation function, achieves an R² value of 0.934 and a Root Mean Squared Error (RMSE) of 76.98. These results indicate a high correlation between predicted and actual RI values and a low average prediction error. Our findings demonstrate that BPNNs can improve the efficiency and accuracy of compound identification and reduce reliance on traditional experimental methods.
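The configuration reported in the abstract (five input descriptors, a hidden layer of 10 ReLU neurons, a learning rate of 0.1, and 100 epochs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The data below is synthetic and stands in for the 340-compound dataset. The 80/20 split, the weight initialization, full-batch gradient descent, the half mean squared error loss, and the target standardization are all assumptions made for illustration. The descriptors ATSc1, VCH-7, SP-1, Kier1, and MLogP are not computed here, so the resulting metrics will not match the reported values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (ASSUMPTION, not the paper's dataset):
# 340 "compounds" x 5 standardized "descriptors" and an RI-like target.
n_samples, n_features, n_hidden = 340, 5, 10
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
y = 1200.0 + 150.0 * (X @ true_w) + rng.normal(scale=20.0, size=n_samples)

# Hypothetical 80/20 train/test split (the paper's split is not stated here).
n_train = int(0.8 * n_samples)
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# Standardize the target using training statistics (assumption) so that
# a learning rate of 0.1 stays numerically stable.
y_mean, y_std = y_train.mean(), y_train.std()
t_train = ((y_train - y_mean) / y_std).reshape(-1, 1)

# Weights: 5 inputs -> 10 hidden (ReLU) -> 1 linear output.
W1 = rng.normal(scale=np.sqrt(2.0 / n_features), size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=np.sqrt(1.0 / n_hidden), size=(n_hidden, 1))
b2 = np.zeros(1)


def forward(inputs):
    """Return hidden pre-activations, hidden activations, and outputs."""
    z1 = inputs @ W1 + b1
    a1 = np.maximum(z1, 0.0)  # ReLU
    return z1, a1, a1 @ W2 + b2


learning_rate, epochs = 0.1, 100
losses = []
for _ in range(epochs):
    # Forward pass and half mean squared error on the scaled target.
    z1, a1, pred = forward(X_train)
    err = pred - t_train
    losses.append(float(0.5 * np.mean(err ** 2)))

    # Backward pass: propagate the error gradient layer by layer.
    d_pred = err / n_train
    dW2 = a1.T @ d_pred
    db2 = d_pred.sum(axis=0)
    d_z1 = (d_pred @ W2.T) * (z1 > 0)  # ReLU derivative
    dW1 = X_train.T @ d_z1
    db1 = d_z1.sum(axis=0)

    # Gradient descent update.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

# Evaluate on the held-out split, back in the original target units.
y_pred = forward(X_test)[2].ravel() * y_std + y_mean
rmse = float(np.sqrt(np.mean((y_test - y_pred) ** 2)))
r2 = float(1.0 - np.sum((y_test - y_pred) ** 2)
           / np.sum((y_test - y_test.mean()) ** 2))
print(f"RMSE = {rmse:.2f}, R^2 = {r2:.3f}")
```

RMSE is expressed in the same units as the target, so on real data it reads as a typical prediction error in retention index units, while R² measures the fraction of variance in the held-out values that the predictions explain.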








How to Cite

Safhadi, A. A.-J., Noviandy, T. R., Irvanizam, I., Suhendra, R., Karma, T., & Idroes, R. (2024). Backpropagation Neural Network-Based Prediction of Kovats Retention Index for Essential Oil Compounds. Infolitika Journal of Data Science, 2(1), 28–33.