J-Icon : Jurnal Komputer dan Informatika https://ejurnal.undana.ac.id/index.php/jicon <p style="text-align: right;"><strong>ISSN:&nbsp;<a href="https://issn.brin.go.id/terbit?search=26544091" target="_blank" rel="noopener">2337-7631 (Printed)</a></strong></p> <p style="text-align: right;"><strong>ISSN: <a href="https://issn.brin.go.id/terbit?search=26544091" target="_blank" rel="noopener">2654-4091 (Online)</a></strong></p> <p style="text-align: justify; line-height: 2em;">J-Icon : Jurnal Komputer dan Informatika <span class="VIiyi" lang="en"><span class="JLqJ4b ChMk0b" data-language-for-alternatives="en" data-language-to-translate-into="auto" data-phrase-index="0"> is published twice a year (March and October) by the Department of Computer Science, Faculty of Science and Engineering, Undana.</span> <span class="JLqJ4b ChMk0b" data-language-for-alternatives="en" data-language-to-translate-into="auto" data-phrase-index="1">This journal publishes previously unpublished research articles in the field of Computer Science.</span> <span class="JLqJ4b ChMk0b" data-language-for-alternatives="en" data-language-to-translate-into="auto" data-phrase-index="2">Contribution requirements are listed on the inside cover of each issue.</span></span></p> <p style="text-align: justify; line-height: 2em;">J-Icon : Jurnal Komputer dan Informatika has national accreditation <a title="Peringkat SINTA 4" href="https://sinta.kemdikbud.go.id/journals/profile/5852"><strong>Sinta 4</strong></a> based on Decree Number 225/E/KPT/2022 of the Director General of Higher Education, Research and Technology, Ministry of Education, Culture, Research, and Technology of Indonesia.</p> Universitas Nusa Cendana en-US J-Icon : Jurnal Komputer dan Informatika 2337-7631 <p>Authors submitting a manuscript must understand and agree that, if it is accepted for publication, they retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a&nbsp;<a 
href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution (CC-BY) 4.0 License</a>&nbsp;that allows others to share the work with an acknowledgment of the work’s authorship and initial publication in this journal.</p> CONSTRUCTING A DATASET FOR INFECTIOUS DISEASE PREDICTION AND SPATIAL CLUSTER ANALYSIS https://ejurnal.undana.ac.id/index.php/jicon/article/view/23729 <p>This study presents a structured methodology for constructing a custom dataset from patient visit records collected over a three-year period (January 1, 2019 to December 31, 2021) at a healthcare facility in Bandung Regency, Indonesia. The raw medical records were systematically transformed into a machine-learning-ready dataset through feature extraction, labeling, and geospatial enrichment. Key transformations included the removal of personally identifiable information, the standardization of clinical symptoms into structured variables, and the assignment of diagnostic and referral labels in accordance with ICD-10 classification standards.</p> <p>Additionally, the dataset was enriched with spatial coordinates (longitude and latitude) to enable geospatial analyses such as transmission radius estimation, proximity clustering, and identification of regional case densities. This structure supports both supervised and unsupervised learning methods, including classification, referral prediction, and spatial cluster detection.</p> <p>The resulting dataset has been used in several experiments: disease classification, referral status prediction, feature importance interpretation using SHAP and LIME, geospatial clustering, and synthetic data generation to mitigate challenges related to privacy and limited data availability. 
The methodology outlined in this study is expected to support future research in healthcare analytics and contribute to the development of decision support systems and public health policy planning tools.</p> Husni Iskandar Pohan http://creativecommons.org/licenses/by/4.0 2025-08-12 2025-08-12 13 2 60 67 10.35508/jicon.v13i2.23729 A COMPARATIVE STUDY OF SUPERVISED FEATURE SELECTION METHODS FOR PREDICTING UANG KULIAH TUNGGAL (UKT) GROUPS https://ejurnal.undana.ac.id/index.php/jicon/article/view/23893 <p>The manual classification of Uang Kuliah Tunggal (UKT, the single tuition fee) groups at Indonesian public universities is laborious, subjective, and error-prone, especially given the large volume of socio-economic data captured via online admission portals. In this study, we evaluate five feature selection techniques (Chi-Square filter, Random Forest importance, Recursive Feature Elimination, LASSO embedded selection, and Exploratory Factor Analysis) on a dataset of 9,369 applicants described by 53 socio-economic variables. Six classifiers (Decision Tree, Random Forest, SVM-RBF, K-Nearest Neighbor, and Naïve Bayes) were tuned via stratified 5-fold cross-validation within an 80:20 train-test split. Performance was measured by accuracy, macro-F1, and training time, and differences in weighted-average accuracy across feature-selection scenarios were assessed using the Friedman test (χ² = 15.06, p = 0.010). Results show that reducing to 13 features via LASSO (weighted-average accuracy 0.730) or Chi-Square (0.678) significantly outperforms both the full-feature baseline (0.624) and the EFA baseline (0.303), while cutting computational costs by over 40%. We conclude that supervised feature selection, particularly LASSO and Chi-Square, enables simpler, faster, and more transparent UKT prediction without sacrificing accuracy. 
The novelty of this study lies in comparing five feature-selection methods within a standardized preprocessing pipeline on real UKT data from UNESA, resulting in a 13-feature subset aligned with the current UKT policy. This subset can be integrated into an automated UKT verification system to enhance decision accuracy and efficiency.</p> Windy Chikita Cornia Putri Wiyli Yustanti Ervin Yohannes http://creativecommons.org/licenses/by/4.0 2025-08-31 2025-08-31 13 2 68 76 10.35508/jicon.v13i2.23893 Hyperparameter Optimization in Machine Learning Models on Sky Survey Data Classification https://ejurnal.undana.ac.id/index.php/jicon/article/view/18493 <p><em>Finding the optimal model remains an essential challenge as machine learning applications grow in popularity. Besides depending on the data, the performance of a classification model is affected by the choice of a suitable algorithm and its hyperparameter settings. This study performed hyperparameter optimization and compared the accuracy of various classification models on observational datasets. The data come from the Sloan Digital Sky Survey Data Release 18 (SDSS-DR18) and the Sloan Extension for Galactic Understanding and Exploration (SEGUE-IV). SDSS-DR18 and SEGUE-IV provide observational data on space objects, such as stellar spectra with the corresponding positions and magnitudes of galaxies or stars. The SDSS-DR18 dataset contains magnitude and redshift data of celestial objects, with stars, Quasi-Stellar Objects (QSOs), and galaxies as the target classes. The SEGUE-IV dataset contains equivalent-width parameters, inline indices, and other features to the radial velocity of the corresponding star spectrum. 
This study used several machine learning models: k-Nearest Neighbor (KNN), Gaussian Naive Bayes, eXtreme Gradient Boosting (XGBoost), Random Forest, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). Bayesian, grid, and random search approaches were used to find the hyperparameters that maximize the performance of each classification model. The results show that some classification models improved their accuracy scores under Bayesian-based hyperparameter optimization. After hyperparameter optimization, the XGBoost model achieved the highest classification accuracy of all models on both datasets, with average accuracies of 99.10% and 95.11%, respectively.</em></p> Efraim Kurniawan Dairo Kette http://creativecommons.org/licenses/by/4.0 2025-09-26 2025-09-26 13 2 77 84 10.35508/jicon.v13i2.18493