Weighted inverse document frequency and vector space model for hadith search engine

Septya Egho Pratama; Wahyudin Darmalaksana; Dian Sa'adillah Maylawati; Hamdan Sugilar; Teddy Mantoro; Muhammad Ali Ramdhani

doi:10.11591/ijeecs.v18.i2.pp1004-1014

Abstract

Hadith is the second source of Islamic law after the Qur’an, so many types and references of hadith need to be studied. However, few Muslims are familiar with them, and many have difficulty studying hadiths. This study aims to build a hadith search engine from a reliable source using Information Retrieval techniques. The structured text representation used is Bag of Words (1-term), with the Weighted Inverse Document Frequency (WIDF) method used to weight the frequency of occurrence of each term before the text is converted to vector form with the Vector Space Model (VSM). In an experiment using 380 hadith texts, WIDF with VSM achieved a recall of 96%, while precision was only about 35.46%. The low precision arises because the bag-of-words (1-gram) representation cannot preserve the meaning of the text well.
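The weighting and ranking pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited WIDF definition, in which the weight of a term in a document is its frequency in that document divided by its total frequency across the collection, and it ranks documents by cosine similarity in the vector space. The whitespace tokenizer, the query weighting, and the function names (`tokenize`, `widf_vectors`, `cosine`, `search`) are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization (1-term bag of words).
    # The paper's actual preprocessing is not described in the abstract.
    return text.lower().split()


def widf_vectors(docs):
    # Term counts per document.
    counts = [Counter(tokenize(d)) for d in docs]
    # Total frequency of each term over the whole collection.
    totals = Counter()
    for c in counts:
        totals.update(c)
    # WIDF(d, t) = tf(d, t) / sum over all documents i of tf(i, t)
    vectors = [{t: tf / totals[t] for t, tf in c.items()} for c in counts]
    return vectors, totals


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, docs):
    vectors, totals = widf_vectors(docs)
    q_counts = Counter(tokenize(query))
    # Assumption: query terms are weighted like document terms, and
    # terms that never occur in the collection are ignored.
    q_vec = {t: tf / totals[t] for t, tf in q_counts.items() if t in totals}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    # Return (score, document index) pairs with non-zero similarity, best first.
    return sorted((s for s in scored if s[0] > 0), reverse=True)


if __name__ == "__main__":
    # Toy documents standing in for hadith texts (hypothetical data).
    toy_docs = ["prayer fasting", "fasting charity", "trade market"]
    for score, idx in search("fasting", toy_docs):
        print(idx, round(score, 3))
```

Because each term is treated independently, a query matches any document that shares a word with it regardless of context, which is consistent with the high recall and low precision reported above.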

Keywords

Classification; Convolutional neural network; Deep learning; Glove; Indonesian language process; Natural language processing; Text mining

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
