Semantic-based question answering framework for fuzzy factoid answer from Thai texts
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2024 |
Subjects: | |
Online Access: | https://etd.uum.edu.my/11490/1/depositpermission.pdf https://etd.uum.edu.my/11490/2/s900995_01.pdf https://etd.uum.edu.my/11490/ |
Summary: | Text is an important source of human knowledge. A question-answering system can retrieve facts from a knowledge source and provide answers to the user. Translating text into a knowledge base is a very challenging and complicated task. Thai text can take the form of a character stream written continuously, without any punctuation or marker to separate the words and sentences in a paragraph. This research aims to develop a semantic-based question-answering framework that can handle fuzzy factoids and that targets Thai text as its knowledge source. In building a Thai question-answering system, Thai morphological analysis is an important component for processing Thai text. Ellipsis and anaphora resolution is also needed to construct complete facts from Thai text. The Thai semantic parser is the core component that constructs the knowledge base by extracting facts from Thai text into a semantic frame structure. The methodology of this research is divided into four steps. The first is to build accurate Thai morphological analysis: Thai word segmentation and Thai elementary discourse unit (EDU) segmentation. The second is to develop ellipsis and anaphora resolution for Thai text, with the goal of creating a complete fact in each Thai EDU. The third is to develop the semantic parser that builds the knowledge base by transforming Thai text into a semantic frame representation. The fourth is to develop answer extraction for the question-answering system, using fuzzy matching to handle fuzzy factoids. With this pipeline of processes, the semantic-based question-answering system achieves high precision and recall of 0.9892 and 0.9484, respectively. In conclusion, anaphora and ellipsis resolution are crucial for achieving precise semantic construction, while fuzzy matching significantly enhances answer extraction recall. Together, these components are essential for building robust "What" and "How many" question-answering systems. |
---|
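
The summary describes answer extraction that fuzzy-matches a question against facts stored as semantic frames. The following is a minimal sketch of that idea only, not the thesis's implementation: the frame slots (`agent`, `action`, `object`, `quantity`), the English sample data, the `answer` function, and the 0.7 threshold are all hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever fuzzy-matching technique the thesis actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical semantic frames; illustrative English data, not taken from the thesis.
FRAMES = [
    {"agent": "farmer", "action": "grow", "object": "jasmine rice", "quantity": None},
    {"agent": "school", "action": "have", "object": "students", "quantity": "250"},
]


def similarity(a, b):
    """Return a string similarity ratio in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def answer(question_type, agent, action, threshold=0.7):
    """Pick the frame whose agent and action best match the question.

    question_type is "what" (answer from the object slot) or
    "how_many" (answer from the quantity slot). Returns None when no
    frame scores at or above the assumed threshold.
    """
    best_frame, best_score = None, 0.0
    for frame in FRAMES:
        # Average the fuzzy scores of the two slots the question constrains.
        score = (similarity(agent, frame["agent"])
                 + similarity(action, frame["action"])) / 2
        if score > best_score:
            best_frame, best_score = frame, score
    if best_frame is None or best_score < threshold:
        return None
    slot = "quantity" if question_type == "how_many" else "object"
    return best_frame[slot]
```

Under these assumptions, a "What" question such as `answer("what", "farmers", "grows")` still reaches the first frame even though the surface forms differ from the stored `farmer` and `grow`, which is the kind of recall gain the summary attributes to fuzzy matching; a question with no close frame returns `None`.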