https://heca-analitika.com/ijds/issue/feed
Infolitika Journal of Data Science
2025-12-02T22:50:13+07:00
Editorial Office (editorial-office@heca-analitika.com)
Open Journal Systems
<p><strong>Infolitika Journal of Data Science</strong> is a peer-reviewed international scientific publication dedicated to showcasing exceptional original research articles and review papers in the field of data science. Infolitika Journal of Data Science centers its focus on fostering interdisciplinary research endeavors that bridge scientific and technological advancements with real-world applications and their societal implications. The journal maintains a biannual publication schedule (May and November).</p> <p>Infolitika Journal of Data Science cordially invites submissions from a diverse array of researchers, practitioners, and scholars worldwide. The journal enthusiastically encourages the submission of pioneering research that unveils novel insights and propels the data science field forward. With an unwavering commitment to excellence, pertinence, and influence, Infolitika Journal of Data Science is devoted to disseminating articles that not only uphold the highest quality standards but also facilitate knowledge dissemination and collaboration among the global research community.</p>

https://heca-analitika.com/ijds/article/view/347
Comparison of Spatial Interpolation Methods: Inverse Distance Weighted and Kriging for Earthquake Intensity Mapping in Aceh, Indonesia
2025-12-02T22:50:13+07:00
Latifah Rahayu (latifah.rahayu@usk.ac.id), Cut Chairilla Yolanda Utami (chairilla@mhs.usk.ac.id), Rahmatul Fauzi (rahmatulfz@mhs.usk.ac.id), Novi Reandy Sasmita (novireandys@usk.ac.id)
<p>Aceh Province, located in the Sumatra megathrust zone of Indonesia, is one of the most seismically active regions in Southeast Asia. Understanding the spatial distribution of earthquake magnitudes is essential for disaster mitigation and risk management.
This study compares two spatial interpolation methods, Inverse Distance Weighted (IDW) and Kriging, to determine the most accurate approach for mapping earthquake intensity in Aceh Province. A total of 2,255 earthquake events with magnitudes of 2.5 M and above, recorded between 1990 and 2024 by the United States Geological Survey (USGS), were analyzed. IDW was tested using five power parameters (p = 1–5), while Kriging applied three semivariogram models (spherical, exponential, and Gaussian). The interpolation accuracy was assessed through Root Mean Square Error (RMSE), Mean Square Error (MSE), and Mean Absolute Percentage Error (MAPE). Results indicated that Kriging with the exponential semivariogram achieved the highest accuracy, with RMSE = 0.0848, MSE = 0.0072, and MAPE = 1.14%, outperforming IDW (RMSE = 0.2288, MSE = 0.0523, MAPE = 1.24%). The Kriging model effectively represented the gradual spatial decay of seismic energy, identifying Aceh Singkil and northern Simeulue as the most earthquake-prone zones, consistent with regional tectonic patterns. These findings confirm that incorporating spatial autocorrelation enhances interpolation accuracy and geophysical interpretation.
The study establishes Kriging as a reliable tool for seismic hazard mapping and provides valuable insights for disaster preparedness, infrastructure planning, and future geostatistical applications in earthquake risk assessment.</p>
2025-11-28T00:00:00+07:00
Copyright (c) 2025 Latifah Rahayu, Cut Chairilla Yolanda Utami, Rahmatul Fauzi, Novi Reandy Sasmita

https://heca-analitika.com/ijds/article/view/364
An Interpretable Machine Learning Framework for Predicting Advanced Tumor Stages
2025-12-02T22:50:12+07:00
Teuku Rizky Noviandy (rizky_si@abulyatama.ac.id), Mohsina Patwekar (mohsina.patwekar@gmail.com), Faheem Patwekar (ifaheemp@gmail.com), Rinaldi Idroes (rinaldi.idroes@usk.ac.id)
<p>Accurate identification of advanced tumor stages is essential for timely clinical decision-making and personalized treatment planning. This study proposes an explainable ensemble learning framework for predicting advanced tumor stage using a dataset containing 10,000 samples with 18 clinical and radiological features. Four machine learning models, namely Logistic Regression, Naïve Bayes, AdaBoost, and LightGBM, were evaluated using stratified train–test splits along with standard performance metrics. LightGBM achieved the highest performance, with an accuracy of 86.05% and an F1-score of 76.61%, outperforming linear and probabilistic classifiers. ROC–AUC and precision–recall analyses further confirmed the superior discriminative ability of ensemble methods. SHAP explainability techniques highlighted mitotic count, Ki-67 index, enhancement, and necrosis as the most influential predictors of advanced stage. The proposed framework demonstrates strong predictive capability and provides clinically interpretable insights, underscoring its potential as a decision-support tool in oncological diagnostics.
Future work will involve external validation and integration of additional multimodal data to enhance generalizability.</p>
2025-11-29T00:00:00+07:00
Copyright (c) 2025 Teuku Rizky Noviandy, Mohsina Patwekar, Faheem Patwekar, Rinaldi Idroes

https://heca-analitika.com/ijds/article/view/361
Enhanced Thyroid Disorder Classification Through XGBoost-Based Machine Learning Techniques
2025-12-02T22:50:06+07:00
Aga Maulana (agamaulana@usk.ac.id)
<p>Thyroid disorders are common endocrine conditions whose diagnosis often requires integrating multiple clinical and laboratory indicators. This study proposes a machine learning framework for multiclass classification of thyroid diseases using XGBoost combined with an automated preprocessing and feature-engineering pipeline. A dataset of 9,167 patient records and 30 clinical and biochemical features was processed using a structured pipeline that included imputation, encoding, scaling, and hyperparameter optimization with RandomizedSearchCV and GridSearchCV. The optimized XGBoost model achieved 95.20% test accuracy, a high weighted F1-score (0.94), and consistent cross-validated performance. Classification results showed excellent discrimination for major thyroid conditions and reliable identification of healthy individuals. Feature importance analysis revealed that TBG-related measurements, thyroxine therapy status, and key hormone indices (TSH, TT4, FTI) were the most influential predictors.
Overall, the findings demonstrate that the proposed XGBoost-based framework provides accurate and robust support for multiclass thyroid disease diagnosis and can serve as a practical foundation for clinical decision-support applications.</p>
2025-11-30T00:00:00+07:00
Copyright (c) 2025 Aga Maulana

https://heca-analitika.com/ijds/article/view/359
A Convolutional Neural Network Model for Mushroom Toxicity Recognition
2025-12-02T22:50:10+07:00
Irvanizam Irvanizam (irvanizam.zamanhuri@usk.ac.id), Muhammad Subianto (subianto@usk.ac.id), Muhammad Salsabila Jamil (jamilsalsabila@gmail.com)
<p>Mushroom poisoning remains a public health concern, often caused by misidentifying toxic species that visually resemble edible ones. This study investigates the feasibility of using a Convolutional Neural Network (CNN) to classify five mushroom species, <em>Amanita caesarea</em>, <em>Amanita phalloides</em>, <em>Cantharellus cibarius</em>, <em>Omphalotus olearius</em>, and <em>Volvariella volvacea</em>, into toxic and non-toxic categories based on image data. A dataset of 137 images was collected and preprocessed through resizing, normalization, and data augmentation. A modified AlexNet-based CNN was trained and evaluated using accuracy, precision, recall, and F1-score. The best-performing model achieved a validation accuracy of 0.40, indicating limited discriminative capability. These findings highlight that the dataset size is insufficient for training a CNN from scratch and that the model cannot reliably distinguish species with subtle morphological differences.
The study concludes that larger datasets, improved image quality, and transfer learning approaches are essential for achieving practical and deployable mushroom classification performance.</p>
2025-11-30T00:00:00+07:00
Copyright (c) 2025 Irvanizam Irvanizam, Muhammad Subianto, Muhammad Salsabila Jamil

https://heca-analitika.com/ijds/article/view/360
Assessing the Performance of Ensemble and Regularized Models for Daily Rainfall Forecasting in Singapore
2025-12-02T22:50:08+07:00
Musliadi Musliadi (musliadi6@mhs.usk.ac.id), Muhammad Zulkarnaini (zul22@mhs.usk.ac.id), Asalul Musaffa (asalul@mhs.usk.ac.id), Yolanda Yolanda (mr.yolanda24@gmail.com)
<p>This study benchmarks ensemble and regularized machine learning models for daily rainfall forecasting using meteorological data from forty-four observation stations across Singapore. The country’s highly variable tropical climate and frequent short-duration rainfall events pose major challenges for urban flood mitigation and operational forecasting. To address this, three algorithms (Lasso Regression, XGBoost Regression, and Gradient Boosting Regression) were developed and evaluated through a systematic comparison of predictive performance. Each model was trained using data from 1980–2023 and validated on independent observations from 2024–2025. The input variables included sub-hourly rainfall intensity, temperature, and wind-related parameters processed through a standardized data-cleaning and imputation pipeline. Results show that XGBoost achieved the most consistent and accurate predictions, with superior performance under both normal and heavy rainfall conditions. Statistical tests confirmed that the improvement was significant compared to Lasso and Gradient Boosting.
These findings demonstrate the effectiveness of ensemble-based approaches for enhancing the reliability of data-driven rainfall forecasting in tropical urban environments and support their integration into early warning and hydrological risk management systems.</p>
2025-11-30T00:00:00+07:00
Copyright (c) 2025 Musliadi Musliadi, Muhammad Zulkarnaini, Asalul Musaffa, Yolanda Yolanda