Text-based tagging of Malay Hansard document / Mohd Razif Abd Jalil
| Main Author: | Mohd Razif Abd Jalil |
|---|---|
| Format: | Thesis |
| Language: | en |
| Published: | 2012 |
| Online Access: | https://ir.uitm.edu.my/id/eprint/98198/1/98198.pdf https://ir.uitm.edu.my/id/eprint/98198/ |
| Summary: | Part-of-speech tagging plays a vital role in natural language processing and is a prerequisite for processing a human language computationally. Before a part-of-speech tagger can be developed, a tag set must be defined for the language. This project presents a rule-based part-of-speech tagging system for the Malay language, applied to Malay Hansard documents, together with a tag set that supports the development of a parser for the language. The tagger's output is compared against a text in which each word has been tagged manually. A context-free grammar is applied to words that have more than one possible word class in order to improve the tagging result. The architecture is very simple yet gives reasonably good accuracy. The results show that the most frequent 1.37 percent of entries in the Hansard dictionary are enough to tag more than 55 percent of the words in the Hansard document. |
|---|
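
The summary describes dictionary lookup, a rule for words with more than one possible word class, and a coverage figure for the most frequent dictionary entries. The following is a minimal sketch of those three ideas, not the thesis's implementation. The lexicon, the tag names (`NOUN`, `ADJ`, and so on), the single context rule that stands in for the context-free grammar, and the function names `tag` and `coverage` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical mini-lexicon: word -> possible tags.
# Both the entries and the tag names are illustrative, not the thesis's tag set.
LEXICON = {
    "yang": ["REL"],
    "dan": ["CONJ"],
    "ini": ["DET"],
    "kerajaan": ["NOUN"],
    "akan": ["AUX"],
    "baru": ["ADJ", "ADV"],  # ambiguous: "new" (ADJ) or "just" (ADV)
}


def disambiguate(candidates, prev_tag):
    """Toy context rule standing in for the grammar (an assumption):
    an ADJ/ADV-ambiguous word directly after a noun is read as ADJ."""
    if prev_tag == "NOUN" and "ADJ" in candidates:
        return "ADJ"
    if "ADV" in candidates:
        return "ADV"
    return candidates[0]


def tag(tokens, lexicon=LEXICON):
    """Tag each token by dictionary lookup; unknown words get 'UNK'."""
    tagged = []
    for tok in tokens:
        candidates = lexicon.get(tok.lower())
        if not candidates:
            tagged.append((tok, "UNK"))
        elif len(candidates) == 1:
            tagged.append((tok, candidates[0]))
        else:
            prev_tag = tagged[-1][1] if tagged else None
            tagged.append((tok, disambiguate(candidates, prev_tag)))
    return tagged


def coverage(tokens, lexicon, top_n):
    """Fraction of tokens covered by the top_n most frequent lexicon words."""
    if not tokens:
        return 0.0
    counts = Counter(t.lower() for t in tokens)
    known = [w for w, _ in counts.most_common() if w in lexicon]
    top = set(known[:top_n])
    covered = sum(c for w, c in counts.items() if w in top)
    return covered / len(tokens)
```

For example, `tag(["Kerajaan", "baru", "ini"])` resolves the ambiguous word `baru` from the preceding noun, and `coverage(tokens, LEXICON, top_n)` shows how a small set of high-frequency entries can account for a large share of the tokens in a text.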
