Urban Air Quality Classification Using Machine Learning Approach to Enhance Environmental Monitoring

Authors

  • Ghazi Mauer Idroes, Graduate School of Mathematics and Applied Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia; Department of Occupational Health and Safety, Faculty of Health Sciences, Universitas Abulyatama, Aceh Besar 23372, Indonesia
  • Teuku Rizky Noviandy, Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Aga Maulana, Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Zahriah Zahriah, Department of Architecture and Urban Planning, Faculty of Engineering, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Suhendrayatna Suhendrayatna, Department of Chemical Engineering, Faculty of Engineering, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Eko Suhartono, Department of Medical Chemistry/Biochemistry, Faculty of Medicine, Lambung Mangkurat University, Banjarbaru 70124, Indonesia
  • Khairan Khairan, Department of Pharmacy, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia; Department of Chemistry, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia
  • Fitranto Kusumo, Centre for Technology in Water and Wastewater, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
  • Zuchra Helwani, Department of Chemical Engineering, Universitas Riau, Pekanbaru 28293, Indonesia
  • Sunarti Abd Rahman, Faculty of Chemical & Process Engineering Technology, Universiti Malaysia Pahang, Lebuhraya Persiaran Tun Khalil Yaakob, 26300 Gambang, Kuantan, Pahang, Malaysia

DOI:

https://doi.org/10.60084/ljes.v1i2.99

Keywords:

Air quality index, Artificial intelligence, Data analysis, Pollutant

Abstract

Urban areas worldwide grapple with environmental challenges, notably air pollution. DKI Jakarta, Indonesia's capital city, exemplifies this struggle: rapid urbanization there contributes to rising pollutant levels. This study employed the CatBoost machine learning algorithm, known for its resistance to overfitting and its capability to handle missing data, to classify urban air quality from pollutant levels recorded between 2010 and 2021. The dataset, sourced from Jakarta's air quality monitoring stations, includes the pollutants PM10, SO2, CO, O3, and NO2. After preprocessing, we used 80% of the data for training and 20% for testing. The model achieved high accuracy (0.9781), precision (0.9722), and recall (0.9728). The feature importance chart revealed ozone (O3) as the most influential feature in the air quality predictions, followed by PM10. Our findings highlight the dominant pollutants affecting urban air quality in Jakarta, Indonesia, and emphasize the need for targeted strategies to reduce their concentrations and ensure a cleaner and healthier urban environment.
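
As an illustration of the evaluation protocol summarized in the abstract, the following minimal Python sketch shows an 80/20 split and the computation of accuracy, precision, and recall for a multi-class problem. It is not the authors' code: the function names, the toy class labels ("good", "moderate", "unhealthy"), the random seed, and the choice of macro averaging for precision and recall are all assumptions made for the example, and no CatBoost model is trained here.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall.

    Macro averaging is an assumption; the abstract does not state
    how the per-class scores were aggregated.
    """
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples that truly belong to each class
    pred_count = Counter(y_pred)   # samples predicted as each class
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(tp.values()) / len(y_true)
    precision = sum(
        tp[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    ) / len(labels)
    recall = sum(
        tp[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    ) / len(labels)
    return accuracy, precision, recall


# Hypothetical labels and predictions, for illustration only.
y_true = ["good", "good", "moderate", "moderate", "unhealthy", "unhealthy"]
y_pred = ["good", "moderate", "moderate", "moderate", "unhealthy", "unhealthy"]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

In a full pipeline the predictions would come from a classifier fitted on the training portion only, so that the reported scores reflect performance on the held-out 20%.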

Published

2023-11-06

How to Cite

Idroes, G. M., Noviandy, T. R., Maulana, A., Zahriah, Z., Suhendrayatna, S., Suhartono, E., Khairan, K., Kusumo, F., Helwani, Z., & Abd Rahman, S. (2023). Urban Air Quality Classification Using Machine Learning Approach to Enhance Environmental Monitoring. Leuser Journal of Environmental Studies, 1(2), 62–68. https://doi.org/10.60084/ljes.v1i2.99