NAZIEF-ADRIANI STEMMER WITH NON-STANDARD AFFIXES FOR NORMALIZING CONVERSATIONAL LANGUAGE ON SOCIAL MEDIA

  • Katarina N. Lakonawa(1)
    Universitas Nusa Cendana
  • Sebastianus A. S. Mola(2*)
    Universitas Nusa Cendana
  • Adriana Fanggidae(3)
    Universitas Nusa Cendana
  • (*) Corresponding Author
Keywords: non-standard word, non-standard affixes, Nazief-Adriani stemmer, Needleman-Wunsch string matching

Abstract

Non-standard language is increasingly prevalent in communication on social media. It appears not only at the level of sentences, clauses, and phrases but also in individual words. In this study, non-standard words (NSW) are normalized to standard Indonesian words (SW). The Nazief-Adriani stemmer (NAS) is extended into a non-standard stemmer (NSS) by adding the ability to detect non-standard affixes. The Needleman-Wunsch similarity algorithm is used to weight candidate matches. In Mean Reciprocal Rank (MRR) tests on 3,438 NSW with the number of queries set to 9 (Q = 9), NSS reached a highest MRR of 79.26% and an average of 50.48%, while NAS reached a highest MRR of 72.87% and an average of 47.23%. In both tests, the three initial letters with the highest stemming results were 'r', 'f', and 'j'. The largest increases in MRR occurred for words beginning with 'd', 'n', and 't', which are the initial letters of several non-standard affixes.
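As a rough illustration of the two measures named in the abstract, the sketch below scores word pairs with a Needleman-Wunsch global alignment and evaluates ranked candidates with MRR. It is not the authors' implementation. The scoring values (match = 1, mismatch = -1, gap = -1), the function names, and the reading of Q as the number of ranked candidates kept per non-standard word are all assumptions made for this example.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of strings a and b (scoring values are assumed)."""
    # prev holds the previous row of the dynamic-programming table.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def rank_candidates(word, dictionary, q=9):
    """Return the q dictionary words with the highest alignment score.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    ranked = sorted(dictionary, key=lambda c: (-needleman_wunsch_score(word, c), c))
    return ranked[:q]


def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the correct word; 0 when it is not in the list."""
    if not gold:
        return 0.0
    total = 0.0
    for candidates, target in zip(ranked_lists, gold):
        if target in candidates:
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(gold)
```

Under these assumptions, a non-standard word would first be stemmed, then `rank_candidates` would propose up to Q standard words, and `mean_reciprocal_rank` would summarize how high the correct standard word appears across all test words.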

Published
2021-03-24
How to Cite
[1]
K. Lakonawa, S. Mola, and A. Fanggidae, “NAZIEF-ADRIANI STEMMER DENGAN IMBUHAN TAK BAKU PADA NORMALISASI BAHASA PERCAKAPAN DI MEDIA SOSIAL”, jicon, vol. 9, no. 1, pp. 65-73, Mar. 2021.
Section
Articles
