System for retrieving similar sentences from Malay translation of the Al-Quran / Nur Farhana Rasip
| Main Author: | Nur Farhana Rasip |
|---|---|
| Format: | Thesis |
| Language: | en |
| Published: | 2019 |
| Subjects: | |
| Online Access: | https://ir.uitm.edu.my/id/eprint/109972/1/109972.pdf https://ir.uitm.edu.my/id/eprint/109972/ |
| Summary: | Similar sentences can be computed using a bigram language model. Other researchers have computed similar sentences, but not on Malay documents, and studies on computing similar sentences in Malay documents are lacking. Using frequent words and topic words also produces a huge number of relevant documents. In addition, similar sentences are scattered throughout the Al-Quran, so users have a hard time finding what they want manually. Two processes are needed to retrieve similar sentences: pre-processing the data, and then using the bigram language model to retrieve the similar sentences. The bigram language model counts every two-word subsequence (bigram) that appears in the data and builds a probability distribution over the bigrams. As a result, the prototype displays the similar sentences that match the user's query. Precision and recall were high for every query. The benefit of this system is that, instead of manually searching the Al-Quran by flipping through the translated documents, a user can quickly enter a query word and retrieve all matching documents related to what they are looking for. |
|---|
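
The bigram counting and probability step described in the summary can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the function names, the simple tokenizer, the Jaccard overlap used for ranking, and the three placeholder sentences (ordinary Malay phrases, not text from the Al-Quran translation) are all assumptions made for the example.

```python
from collections import Counter

# Placeholder sentences for illustration only (not from the thesis corpus).
SENTENCES = [
    "saya suka makan nasi",
    "dia suka makan roti",
    "kami pergi ke pasar",
]


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [t for t in tokens if t]


def bigrams(tokens):
    """Return every 2-word-long subsequence of a token list."""
    return list(zip(tokens, tokens[1:]))


def build_bigram_model(sentences):
    """Estimate P(w2 | w1) = count(w1, w2) / count(w1 as first word of a bigram)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        context_counts.update(tokens[:-1])  # the last token starts no bigram
        bigram_counts.update(bigrams(tokens))
    return {bg: n / context_counts[bg[0]] for bg, n in bigram_counts.items()}


def rank_similar(query, sentences):
    """Rank sentences by Jaccard overlap of bigram sets (an assumed similarity measure)."""
    query_bigrams = set(bigrams(tokenize(query)))
    scored = []
    for sentence in sentences:
        sentence_bigrams = set(bigrams(tokenize(sentence)))
        union = query_bigrams | sentence_bigrams
        shared = query_bigrams & sentence_bigrams
        if shared:
            scored.append((sentence, len(shared) / len(union)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    model = build_bigram_model(SENTENCES)
    print(model[("makan", "nasi")])
    for sentence, score in rank_similar("suka makan nasi", SENTENCES):
        print(f"{score:.2f}  {sentence}")
```

The probabilities here are plain maximum-likelihood estimates with no smoothing, so unseen bigrams simply have no entry. How the thesis handles pre-processing, unseen bigrams, and scoring is not stated in the summary.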
