Building Related Words for Quran Vocabulary Based on Distributional Similarity

Fedy Fahron Guntara, Universitas Telkom
Keywords: Quran, cosine-similarity, word2vec, precision


The Quran, the holy book of Islam, contains a large vocabulary, which makes it difficult for ordinary readers to find connections between its words. For example, the word مَّعْرُوف is related to the word عَفَا: the two words often occur in the same verse, and they are connected in meaning, since forgiving is one of the good deeds. At present, no dictionary, encyclopedia, or thesaurus of Quranic vocabulary explains the relationships between words in the Quran. This study therefore builds sets of related words for Quran vocabulary, which can further assist in finding relationships between verses. The method is a distributional similarity approach based on the Continuous Bag of Words (CBOW) model. With CBOW, the system achieves a precision of 98%, measured by having linguists check the system output.
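The two measures named in the keywords, cosine similarity and precision, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-dimensional vectors are invented rather than trained CBOW embeddings, the transliterated keys `maruf`, `afa`, and `word_c` are hypothetical labels, and the helper functions are not taken from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def related_words(target, vectors, top_n=2):
    """Rank every other word by cosine similarity to the target word."""
    scores = [
        (word, cosine_similarity(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


def precision(predicted, judged_relevant):
    """Fraction of predicted words that the judges marked as related."""
    if not predicted:
        return 0.0
    hits = sum(1 for word in predicted if word in judged_relevant)
    return hits / len(predicted)


# Invented toy vectors (assumption): real vectors would come from a CBOW model.
toy_vectors = {
    "maruf": [0.9, 0.1, 0.3],
    "afa": [0.8, 0.2, 0.4],
    "word_c": [-0.5, 0.9, 0.0],
}

if __name__ == "__main__":
    for word, score in related_words("maruf", toy_vectors):
        print(word, round(score, 3))
```

In this sketch, the linguists' correction corresponds to the `judged_relevant` set: precision is the share of system-proposed related words that the judges accept.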

