Automated semantic query formulation for Quranic verse translation retrieval

With the exponential growth in the amount of data that is deposited on the web and in other data storage repositories daily, there is an increase in the global desire to retrieve that data in a more effective and efficient manner. There are quite a number of mechanisms through which this data is ret...


Bibliographic details
Main author: Yauri, Aliyu Rufai
Format: Thesis
Language: English
Published: 2014
Subjects: Semantic web; Cross-language information retrieval; Qurʼan - Translating
Online access: http://psasir.upm.edu.my/id/eprint/40795/7/FSKTM%202014%208%20IR.pdf
author Yauri, Aliyu Rufai
description With the exponential growth in the amount of data deposited daily on the web and in other data repositories, there is a growing global desire to retrieve that data more effectively and efficiently. Data is retrieved through a number of mechanisms, such as the search engines Yahoo and Google; however, most current information retrieval mechanisms on the web are based on keyword search. A keyword search often retrieves information that is not relevant to the query because of problems such as the semantic ambiguity of natural language, and the user needs to know the exact keyword to use in order to retrieve the relevant information. To overcome this problem, several approaches, such as query formulation, have been researched, but most are based on keyword and small-fragment queries. This thesis proposes a study of the automatic semantic formulation of natural language queries into structured queries. The proposed system is referred to as AutoSQuR, meaning Automated Semantic Quran Retrieval. AutoSQuR attempts to semantically formulate complex natural language queries into a triple representation and to retrieve relevant verses from the Holy Quran. The main contribution of this research is a method that automatically formulates natural language queries into structured semantic queries using a statistical machine learning technique. The contribution includes going beyond keyword and small-fragment queries to complex queries that can be a paragraph in length. Additionally, the proposed system supports both categories of users: those who prefer suggestions from the system and those who prefer to reformulate their query when the system fails to formulate it automatically. The system provides suggestions to the user whether or not concepts are identified in the query.
Another contribution is the use of ontology equivalent assertions to compensate for the limitations of WordNet in disambiguating Islamic-related words. Finally, an experimental evaluation of AutoSQuR was carried out. The evaluation compared the proposed statistical machine learning technique with the existing approach in FREyA in terms of the percentage of queries that are semantically formulated correctly and the effectiveness of the retrieved Quran verses. The proposed approach outperformed the existing approach in FREyA: it improved the correctness of query formulation by 17.4%, and in the effectiveness of the retrieved verses it improved precision by 0.06 and recall by 0.1.
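The description reports retrieval effectiveness as precision and recall. As a reading aid, the following is a minimal sketch of how these two standard metrics are computed for a single query. It is not the thesis's evaluation code; the function name `precision_recall` and the verse identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against division by zero when either set is empty.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical verse identifiers (chapter:verse), made up for illustration;
# they are not results from the thesis.
retrieved = ["2:43", "2:110", "2:177", "4:77"]
relevant = ["2:43", "2:110", "9:60"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Under this definition, an improvement of 0.06 in precision means that the fraction of retrieved verses that are relevant rose by six hundredths on the 0 to 1 scale.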
format Thesis
id oai:psasir.upm.edu.my:40795
institution Universiti Putra Malaysia
language English
publishDate 2014
record_format eprints
spelling oai:psasir.upm.edu.my:40795 2015-09-29T04:15:02Z http://psasir.upm.edu.my/id/eprint/40795/ 2014-08 Thesis NonPeerReviewed application/pdf en Yauri, Aliyu Rufai (2014) Automated semantic query formulation for Quranic verse translation retrieval. PhD thesis, Universiti Putra Malaysia.
title Automated semantic query formulation for Quranic verse translation retrieval
topic Semantic web
Cross-language information retrieval
Qurʼan - Translating
url http://psasir.upm.edu.my/id/eprint/40795/7/FSKTM%202014%208%20IR.pdf
url-record http://psasir.upm.edu.my/id/eprint/40795/