
Derwin Rony Sina
Nelci Dessy Rumlaklak


Comparative studies of machine learning methods aim to determine the best family of methods, judged by how accurately each one predicts against the true data. This study uses a labor dataset to extract information on whether agency employees choose to leave or stay. Three methods are compared: K-Nearest Neighbors (KNN), which is similarity-based; Naïve Bayes (NB), which is probability-based; and C4.5, which is decision-tree-based. The application takes labor data as input and divides the dataset into training data and test data; the training data is used to train the model, and the test data is then classified by the model. Classification is carried out on 14,999 records under two scenarios: supplied training and cross validation. The initial hypothesis is that C4.5 is the best method as measured by accuracy. The hypothesis is considered confirmed if C4.5 achieves the best accuracy in the majority of cases under both the supplied training and cross validation scenarios. Analysis of the classification results found that C4.5 had the highest accuracy for every data split in the supplied training scenario and for k-fold values of 3, 5, 7, and 9 in the cross validation scenario, so C4.5 was the best method for classifying non-active labor.
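The cross validation scenario described in the abstract can be illustrated with a minimal sketch. It is not the authors' application: the data is a synthetic two-class toy set (the 14,999-record labor dataset is not reproduced here), only a simple KNN classifier is shown, and the names `knn_predict`, `cross_val_accuracy`, and `make_toy_data` are hypothetical. A Naïve Bayes or C4.5 implementation with the same `predict(train, query)` signature could be passed in the same way to compare methods.

```python
import random
from collections import Counter


def knn_predict(train, query, n_neighbors=3):
    """Majority vote among the n_neighbors training rows closest to query."""
    by_distance = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k, predict, seed=0):
    """Mean accuracy over k folds; every row is used for testing exactly once."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]  # k roughly equal folds
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the one currently held out for testing.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct = sum(predict(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k


def make_toy_data(n=90, seed=1):
    """Synthetic stand-in data (an assumption): two well-separated clusters."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice(["stay", "exit"])
        center = 0.0 if label == "stay" else 5.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


data = make_toy_data()
# Same k-fold values as the abstract: 3, 5, 7, and 9.
results = {k: cross_val_accuracy(data, k, knn_predict) for k in (3, 5, 7, 9)}
for k, acc in results.items():
    print(f"k={k}: mean accuracy {acc:.3f}")
```

Passing the classifier as a function keeps the fold logic identical across methods, so any accuracy difference comes from the method rather than from how the data was split.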



Article Details

How to Cite
Sina, D., KUSRORONG, N. K., & Rumlaklak, N. (2019). KAJIAN MACHINE LEARNING DENGAN KOMPARASI KLASIFIKASI PREDIKSI DATASET TENAGA KERJA NON-AKTIF. J-Icon : Jurnal Komputer Dan Informatika, 7(1), 37-49. https://doi.org/10.35508/jicon.v7i1.880


