Morphological segmentation and analysis of Bangla text
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Faculty of Computing, Universiti Teknologi Malaysia, 2016 |
Subjects: | |
Online Access: | http://repo.uum.edu.my/21406/1/IJIDM%204%203%202016%20%2015%2020.pdf http://repo.uum.edu.my/21406/ http://ijidm.org/wp-content/uploads/IJIDM-04-03-03.pdf |
Summary: | This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal is to develop a morphological segmentation algorithm that works well for Bangla and to address the problem of
unsupervised word segmentation across different languages. From a hand-corrected Bangla corpus, 5000 popular words were manually segmented into suffixes, prefixes and roots. These formed the sample lexicon used as the seed for the next step. A system was
developed in the C language to automate the segmentation process based on the handmade lexical database. The system was evaluated on several pages of Bangla text and achieved a success rate of about 83.05%. In our observation, the system
would work with full success if the volume of the lexicon database were doubled, and it may have a huge impact, particularly in helping people learn and use Bangla, which would greatly enhance their socio-economic life. |
---|
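The lexicon-driven segmentation described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: their system was written in C and its algorithm is not detailed in this record. The sketch below uses Python for brevity and assumes a simple longest-match affix lookup. The names `PREFIXES`, `SUFFIXES`, `ROOTS`, `segment` and `success_rate` are hypothetical, and the tiny transliterated lexicon is illustrative only, not taken from the paper's 5000-word database.

```python
# Hypothetical seed lexicon, transliterated for readability.
# These entries are illustrative and are NOT from the paper's database.
PREFIXES = {"o", "upo"}
SUFFIXES = {"ra", "der", "ke", "gulo", "ti"}
ROOTS = {"chhele", "boi", "kaj", "desh"}


def segment(word):
    """Split a word into (prefix, root, suffix) by lexicon lookup.

    Assumed strategy: try the longest prefix and suffix first and accept
    the first split whose remaining root is in the lexicon.
    Returns None when no split is found.
    """
    # The empty string stands for "no prefix" / "no suffix".
    for prefix in sorted(PREFIXES | {""}, key=len, reverse=True):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in sorted(SUFFIXES | {""}, key=len, reverse=True):
            if not rest.endswith(suffix):
                continue
            root = rest[:len(rest) - len(suffix)]
            if root in ROOTS:
                return prefix, root, suffix
    return None


def success_rate(words):
    """Percentage of words for which a segmentation was found."""
    if not words:
        return 0.0
    segmented = sum(1 for w in words if segment(w) is not None)
    return 100.0 * segmented / len(words)


if __name__ == "__main__":
    sample = ["chheleder", "boigulo", "okaj", "xyz"]
    for w in sample:
        print(w, segment(w))
    print("success rate:", success_rate(sample))
```

Under this assumed scheme, a word fails only when its root or affix is missing from the lexicon, which is consistent with the summary's observation that a larger lexical database should raise the success rate.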