A MACHINE LEARNING STUDY COMPARING CLASSIFICATION PREDICTIONS ON A NON-ACTIVE WORKFORCE DATASET
Abstract
This comparative study of machine learning methods aims to determine the best classification method based on how accurately each one predicts the true labels. Applied to a labor dataset, the study extracts information on whether an agency's employees choose to leave or stay. Three methods are compared: K-Nearest Neighbors (KNN), which is similarity-based; Naïve Bayes (NB), which is probability-based; and C4.5, which is decision-tree-based. The application takes labor data as input and divides the dataset into training data and test data: the training data is used to train the model, and the test data is used when the model classifies. Classification is carried out on 14,999 records under two scenarios, supplied training and cross validation. The initial hypothesis is that C4.5 is the best method as measured by accuracy. The hypothesis is considered proven if C4.5 achieves the best accuracy in the majority of cases across the supplied training and cross-validation scenarios. Analysis of the classification results shows that C4.5 had superior accuracy for every data-split setting of the supplied training scenario and for k-fold values of 3, 5, 7, and 9 in the cross-validation scenario, so C4.5 is the best method for classifying non-active labor.
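The cross-validation scenario described above can be illustrated with a minimal sketch. It is not the authors' application: the toy two-cluster data, the helper names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`, `make_toy_data`), and the choice of 3 neighbors are all assumptions made for illustration. Only KNN is implemented here; an NB or C4.5 classifier could be passed to the same harness through the `predict` argument.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Predict the majority label among the n_neighbors closest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]

def cross_val_accuracy(X, y, k, predict):
    """Mean accuracy over k folds; each fold serves as the test set exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(predict(train_X, train_y, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

def make_toy_data(n=120, seed=1):
    """Hypothetical stand-in data: two Gaussian clusters labeled 0 (stay) and 1 (leave)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y

X, y = make_toy_data()
# Same k-fold values as in the abstract: 3, 5, 7, and 9.
results = {k: cross_val_accuracy(X, y, k, knn_predict) for k in (3, 5, 7, 9)}
for k, acc in results.items():
    print(f"{k}-fold mean accuracy: {acc:.3f}")
```

Comparing methods in this setup would amount to calling `cross_val_accuracy` once per classifier and per k, then checking which classifier has the highest mean accuracy in the majority of settings.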
Copyright (c) 2019 Jurnal Komputer dan Informatika
This work is licensed under a Creative Commons Attribution 4.0 International License.
The author submitting the manuscript must understand and agree that if accepted for publication, authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgment of the work’s authorship and initial publication in this journal.