A Comparative Study of Multiclass Classification Method for Mapping COVID-19 Risk Zone in Java Island

  • Jesica Nauli Br. Siringo Ringo(1*)
    Politeknik Statistik STIS
  • Wahyu Joko Mursalin(2)
    Politeknik Statistik STIS
  • Nisrina Citra Nurfadilah(3)
    Politeknik Statistik STIS
  • Dwiky Rachmat Ramadhan(4)
    Politeknik Statistik STIS
  • Wa Ode Zuhayeni Madjida(5)
    Politeknik Statistik STIS
  • (*) Corresponding Author
Keywords: risk zone, classification, data mining, evaluation measure

Abstract

Various measures are needed to control the rise in COVID-19 cases in Indonesia, especially on Java Island. One effective measure is prevention through providing information about conditions in a region. Indonesia, through Satgas Penanganan COVID-19 (the COVID-19 Handling Task Force), has established risk zones at the district/city level as a warning system for the public and as a basis for policymaking by regional governments. The risk zones are built from three kinds of indicators using a conventional technique called score weighting. Given the importance of the risk zones for government policymaking, this study aims to build a risk zone classification model for districts/cities in Java using several data mining classification techniques and to determine the best model based on evaluation results. Four classification techniques are compared: naive Bayes, decision tree, k-nearest neighbor (k-NN), and neural network. Before modeling, the data are adjusted in a preprocessing stage, where missing values and class imbalance are identified. These problems are addressed with data imputation and oversampling. The results indicate that k-NN is the best of the four models: it has the highest accuracy and the highest macro-averaged sensitivity, specificity, and F1-measure.
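The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class labels ("low", "medium", "high") and the toy predictions are hypothetical placeholders, and the macro average is assumed to be the unweighted mean of the per-class one-vs-rest values.

```python
from typing import Dict, List


def confusion_matrix(y_true: List[str], y_pred: List[str],
                     labels: List[str]) -> List[List[int]]:
    """Build a matrix whose rows are true classes and columns predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def macro_metrics(matrix: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged sensitivity, specificity, and F1.

    Each class is scored one-vs-rest; the macro average is the plain mean
    over classes, so every class counts equally regardless of its size.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    sensitivities, specificities, f1_scores = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                        # true k, predicted other
        fp = sum(matrix[i][k] for i in range(n)) - tp   # other, predicted k
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        sensitivities.append(recall)
        specificities.append(specificity)
        f1_scores.append(f1)
    return {
        "accuracy": correct / total if total else 0.0,
        "macro_sensitivity": sum(sensitivities) / n,
        "macro_specificity": sum(specificities) / n,
        "macro_f1": sum(f1_scores) / n,
    }


# Hypothetical labels and predictions, for illustration only.
LABELS = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "low"]

scores = macro_metrics(confusion_matrix(y_true, y_pred, LABELS))
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Comparing models on macro averages rather than on accuracy alone is a reasonable choice when classes are imbalanced, since a model that ignores a small class is penalized on that class's sensitivity and F1.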

Published
2021-04-03
How to Cite
[1]
J. Ringo, W. Mursalin, N. Nurfadilah, D. Ramadhan, and W. Madjida, “A Comparative Study of Multiclass Classification Method for Mapping COVID-19 Risk Zone in Java Island”, jicon, vol. 9, no. 1, pp. 98-107, Apr. 2021.
Section
Articles