STEMMING BALINESE TEXT DOCUMENTS USING THE RULE BASE APPROACH METHOD

  • Putu Gede Surya Cipta Nugraha, STMIK STIKOM Indonesia
  • Ni Wayan Wardani, STMIK STIKOM Indonesia
Keywords: Balinese language, Rule Base Approach, Stemming, Overstemming

Abstract

The Balinese are one of Indonesia's ethnic groups, and most of them live on the island of Bali. Their language, Balinese, follows three sor-singgih speech levels (tigang soroh): Basa Kasar, Basa Madia, and Basa Alus. Balinese also has three kinds of affixes: pangater (prefixes), seselan (infixes), and pangiring (suffixes). Finding the root of a Balinese word therefore requires stemming, the process of mapping and reducing a word form to its base form. Stemming is an important step in information retrieval systems. This study stems Balinese text with the Rule Base Approach method, using a data set of 376 Balinese root words. Its aim is to design an application that stems Balinese text correctly. Processing begins with input, preprocessing, filtering, case folding, and tokenization; each token is then stemmed to remove its pangater, seselan, and pangiring affixes. The results indicate that the Rule Base Approach can be used to stem Balinese text, reaching an accuracy of 77.82%. The failures that remain in testing are caused by overstemming errors introduced by the stemming process.
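The pipeline outlined in the abstract (case folding, tokenization, then affix removal checked against a list of root words) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the affix lists are a partial, assumed selection of common Balinese affixes, the root set is a tiny hypothetical sample standing in for the 376-word list, and the function names are made up for this example.

```python
import re

# Assumed, partial affix lists for illustration only (not the paper's rule set).
PREFIXES = ("maka", "pari", "pati", "pra", "ka", "ma", "pa", "pi", "sa", "a")  # pangater
INFIXES = ("in", "um", "el", "er")                                            # seselan
SUFFIXES = ("ang", "ing", "an", "in", "ne", "a", "e", "n")                    # pangiring

# Hypothetical sample of root words; the study uses a list of 376 roots.
ROOTS = {"jemak", "tulis", "surat", "jalan", "umah"}


def tokenize(text):
    """Case-fold the text and keep only alphabetic tokens (simple filtering)."""
    return re.findall(r"[a-z]+", text.lower())


def strip_prefix(word):
    """Yield the word itself, then each form with one listed prefix removed."""
    yield word
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 2:
            yield word[len(p):]


def strip_suffix(word):
    """Yield the word itself, then each form with one listed suffix removed."""
    yield word
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 2:
            yield word[:-len(s)]


def strip_infix(word):
    """Yield the word itself, then each form with an infix removed after the first letter."""
    yield word
    for i in INFIXES:
        if word[1:1 + len(i)] == i and len(word) - len(i) >= 2:
            yield word[0] + word[1 + len(i):]


def stem(word, roots=ROOTS):
    """Return the first candidate found in the root list, else the word unchanged."""
    for a in strip_prefix(word):
        for b in strip_suffix(a):
            for c in strip_infix(b):
                if c in roots:
                    return c
    return word


def stem_text(text, roots=ROOTS):
    """Run tokenization and stemming over a whole string."""
    return [stem(token, roots) for token in tokenize(text)]
```

For example, `stem_text("Katulisang, sinurat!")` strips the prefix `ka` and suffix `ang` from the first token and the infix `in` from the second. Because the unmodified word is always tried first, a root such as `jalan` is returned as-is even though it ends in the suffix-like string `an`; a stemmer that applied the rules without that lookup would be more exposed to the overstemming errors the abstract mentions.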

Published
2020-12-18