Semantic query translation for Quran content retrieval / Mohd Amin Mohd Yunus
Main Author: | Mohd Amin Mohd Yunus |
---|---|
Format: | Thesis |
Published: | 2014 |
Subjects: | |
Online Access: | http://studentsrepo.um.edu.my/9169/1/yunus_corrected_17.12.2014.pdf http://studentsrepo.um.edu.my/9169/ |
Summary: | Existing Quran retrieval systems suffer from low precision and recall, mainly because of
the query technique used, particularly in cross-language information retrieval (CLIR). Because the query
technique is one of the indicators of Quran retrieval performance in CLIR,
retrieval can be improved by enhancing the query technique with a combination
of semantic, stemming and translation techniques. The main purpose of the study is to
automate the retrieval of Quranic verses across a multilingual context (Malay,
English, Arabic). The specific objectives of the study are i) to develop a
query expansion algorithm based on stemming, semantic and translation queries in cross-language
Quranic retrieval, ii) to develop a prototype cross-language Quranic retrieval system
for Malay, English and Arabic based on the Stemmed Semantic Translated Query (StSTQ), and
iii) to evaluate the performance of the prototype cross-language Quranic
retrieval system. This research introduced conflated methods that combine query
translation, a lexical semantic technique and a stemming algorithm. An appropriate
stemming approach is applied while processing the translated query and its lexical semantics to
retrieve relevant results. Procedures that conflate these query expansion techniques
for cross-language information retrieval were formulated. Experiments were conducted on Malay, Arabic
and English collections of Quran content, and the retrieval
performance of the Quran retrieval (QR) system was evaluated. In the first stage, QR
translates the query words into the required languages and looks up their synonyms and
stemmed synonyms to expand the single query with relevant stemmed synonyms.
Given the expanded query, QR presents the relevant results in a SpaceTree
visualization that displays each word together with its results. This
study involved tests to improve how the 36 single translated queries are linked
to their matched relevant synonyms and stemmed synonyms. Query expansion
through the semantic technique generates comprehensive Quran information. Three principal empirical experiments were conducted on retrieving
relevant verses from the Quran text in Malay, Arabic and English. The experiments
consisted of StSTQ (Stemmed Semantic Translated Query), StTQ (Stemmed
Translated Query) and STQ (Semantic Translated Query) for English, Arabic and
Malay. In the analysis of the English verse collection, StSTQ achieved an average
precision of 77.52% (Keyword, K) and 84.12% (Queryword, Q), with
an average recall of 97.83% (K) and 96.19% (Q). For the
Arabic verse collection, StSTQ achieved an average precision of 80.03%
(K) and 85.84% (Q) and an average recall of 95.24% (K) and 96.94% (Q).
For the Malay verse collection, StSTQ achieved an average precision
of 80.96% (K) and 85.26% (Q) and an average recall of 98.68%
(K) and 97.27% (Q). The significance of the study lies in developing a QR
system that improves CLIR performance and evaluation through a better query
expansion technique. |
---|
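
The summary describes a pipeline that translates a query word, expands it with synonyms, stems the resulting terms, and is then evaluated by precision and recall. The following is a minimal Python sketch of that general idea, not the thesis's StSTQ algorithm. The dictionaries, the suffix list, the function names and the sample documents are all hypothetical placeholders chosen for illustration, and the stemmer is a deliberately crude suffix stripper rather than any stemmer used in the study.

```python
# Toy resources: every entry is a made-up placeholder, not data from the thesis.
TRANSLATIONS = {("doa", "en"): ["prayer"]}           # (word, target language) -> translations
SYNONYMS = {"prayer": ["supplication", "prayers"]}   # word -> synonyms
SUFFIXES = ("ing", "es", "s")                         # crude English suffix list


def stem(word):
    """Strip one suffix if at least three characters remain (stand-in for a real stemmer)."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(word, target_lang):
    """Translate the query word, add synonyms of each translation, then stem every term."""
    translated = TRANSLATIONS.get((word, target_lang), [word])
    terms = set(translated)
    for term in translated:
        terms.update(SYNONYMS.get(term, []))
    return {stem(term) for term in terms}


def retrieve(terms, documents):
    """Return the ids of documents that contain at least one stemmed query term."""
    hits = set()
    for doc_id, text in documents.items():
        tokens = {stem(tok.strip(".,;:").lower()) for tok in text.split()}
        if tokens & terms:
            hits.add(doc_id)
    return hits


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant (0.0 when the denominator is empty)."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented sample sentences (not Quran text) and an assumed relevance judgment.
documents = {
    "d1": "He answers the prayers of the faithful.",
    "d2": "A supplication made at night.",
    "d3": "The mountains stand firm.",
    "d4": "Praying together is encouraged.",
}
relevant = {"d1", "d2", "d4"}

terms = expand_query("doa", "en")
retrieved = retrieve(terms, documents)
precision, recall = precision_recall(retrieved, relevant)
print(sorted(terms), sorted(retrieved), precision, recall)
```

In this toy run the crude stemmer reduces "Praying" to "pray", which does not match the stem "prayer", so `d4` is missed and recall falls below 1.0 while precision stays at 1.0. This is one way to see why the choice of stemming approach matters for the reported precision and recall figures.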