Build Synonym Set Al-Quran Vocabulary with WordNet Approach

  • Laras Gupitasari, Telkom University
Keywords: Quran, Synonym Set, Thesaurus, WordNet, Clustering


Computational linguistics research on the Quran is both interesting and useful, given the importance of the Quran to Muslims as their holy book. Because language resources for Quran research are currently lacking, this research aims to build a synonym set (synset) of Quran vocabulary that can serve as a prototype for a Quranic thesaurus. The study uses only nouns as its dataset and takes an approach based on WordNet and English translations of the Quran. Words are grouped into synonym sets using hierarchical clustering. Finally, the resulting synsets are evaluated using the F-measure.
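
The grouping and evaluation steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the similarity scores are invented for illustration (in practice they would presumably come from a WordNet-based measure), and the average-linkage rule, the 0.5 threshold, the pairwise form of the F-measure, and all function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical similarity scores in [0, 1] between English noun translations.
# These values are invented for illustration; they are not taken from WordNet.
SIMILARITY = {
    frozenset(("lord", "master")): 0.90,
    frozenset(("book", "scripture")): 0.80,
    frozenset(("lord", "book")): 0.10,
    frozenset(("lord", "scripture")): 0.10,
    frozenset(("master", "book")): 0.15,
    frozenset(("master", "scripture")): 0.10,
}


def sim(a, b):
    """Look up the similarity of two words; unknown pairs score 0."""
    return SIMILARITY.get(frozenset((a, b)), 0.0)


def average_linkage(c1, c2):
    """Mean similarity over all cross-cluster word pairs."""
    return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def cluster(words, threshold):
    """Agglomerative clustering: repeatedly merge the two most similar
    clusters until no pair reaches the similarity threshold."""
    clusters = [{w} for w in words]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if average_linkage(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def synonym_pairs(clusters):
    """All unordered word pairs that share a cluster."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}


def f_measure(predicted, gold):
    """Pairwise precision, recall, and F-measure against gold synsets."""
    p, g = synonym_pairs(predicted), synonym_pairs(gold)
    tp = len(p & g)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


words = ["lord", "master", "book", "scripture", "mountain"]
gold = [{"lord", "master"}, {"book", "scripture"}, {"mountain"}]
predicted = cluster(words, threshold=0.5)
print(predicted)
print(f_measure(predicted, gold))
```

Stopping the merges at a similarity threshold, rather than at a fixed number of clusters, lets the number of synsets emerge from the data; whether the study cuts the hierarchy this way is not stated in the abstract.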

