Morphological segmentation and analysis of Bangla text
This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal in this paper is to develop a morphological segmentation algorithm that can work well for Bangla and to address the problem of unsupervised word segmentation across different languages. From a hand...
Main Authors: | Saha, G C; Saha, Hasi; Che Mat, Ruzinoor; Khan, Nur Hossain; Sarker, Bappa |
---|---|
Format: | Article |
Language: | English |
Published: | Faculty of Computing, Universiti Teknologi Malaysia, 2016 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://repo.uum.edu.my/21406/1/IJIDM%204%203%202016%20%2015%2020.pdf http://repo.uum.edu.my/21406/ http://ijidm.org/wp-content/uploads/IJIDM-04-03-03.pdf |
id |
my.uum.repo.21406 |
---|---|
record_format |
eprints |
spelling |
my.uum.repo.21406 2017-04-05T02:50:06Z http://repo.uum.edu.my/21406/ Morphological segmentation and analysis of Bangla text. Saha, G C; Saha, Hasi; Che Mat, Ruzinoor; Khan, Nur Hossain; Sarker, Bappa. QA75 Electronic computers. Computer science. Faculty of Computing, Universiti Teknologi Malaysia, 2016. Article, PeerReviewed, application/pdf, en. http://repo.uum.edu.my/21406/1/IJIDM%204%203%202016%20%2015%2020.pdf Saha, G C and Saha, Hasi and Che Mat, Ruzinoor and Khan, Nur Hossain and Sarker, Bappa (2016) Morphological segmentation and analysis of Bangla text. International Journal of Interactive Digital Media (IJIDM), 4 (3). pp. 15-20. ISSN 2289-4098 http://ijidm.org/wp-content/uploads/IJIDM-04-03-03.pdf |
institution |
Universiti Utara Malaysia |
building |
UUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Utara Malaysia |
content_source |
UUM Institutional Repository |
url_provider |
http://repo.uum.edu.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
description |
This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal in this paper is to develop a morphological segmentation algorithm that works well for Bangla and to address the problem of
unsupervised word segmentation across different languages. From a hand-corrected Bangla corpus, 5000 popular words were manually segmented into suffixes, prefixes and roots. These formed the sample lexicon used as the seed for the next step. A system was
developed in the C language to automate the segmentation process based on the hand-made lexical database. The system was evaluated on several pages of Bangla text and achieved a success rate of about 83.05%. In our observation, the system will
work with full success if the volume of the lexicon database is doubled, and this system may have a huge impact, particularly in helping people learn and use Bangla, which will greatly enhance their socio-economic life. |
format |
Article |
author |
Saha, G C Saha, Hasi Che Mat, Ruzinoor Khan, Nur Hossain Sarker, Bappa |
author_sort |
Saha, G C |
title |
Morphological segmentation and analysis of Bangla text |
publisher |
Faculty of Computing, Universiti Teknologi Malaysia |
publishDate |
2016 |
url |
http://repo.uum.edu.my/21406/1/IJIDM%204%203%202016%20%2015%2020.pdf http://repo.uum.edu.my/21406/ http://ijidm.org/wp-content/uploads/IJIDM-04-03-03.pdf |
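The description field in this record outlines a lexicon-driven approach: a hand-built list of prefixes, suffixes and roots drives an automated segmenter written in C. The paper's actual code and lexicon are not reproduced in this record, so the following is only a minimal sketch of that general idea under stated assumptions. The affix and root lists are small illustrative romanized entries, not the authors' 5000-word lexicon, and the names `Segmentation`, `is_root` and `segment` are hypothetical. The sketch tries each prefix/suffix pair and accepts the first split whose remainder is a known root.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))
#define MAX_ROOT 64

/* Hypothetical seed lexicon (romanized, illustrative only).
 * The empty string "" stands for "no affix". */
static const char *PREFIXES[] = {"", "o", "bi"};
static const char *SUFFIXES[] = {"", "ra", "der", "ke"};
static const char *ROOTS[]    = {"chele", "sukh", "boi"};

typedef struct {
    const char *prefix;   /* points into PREFIXES */
    char root[MAX_ROOT];  /* copy of the matched root */
    const char *suffix;   /* points into SUFFIXES */
} Segmentation;

/* Return 1 if the first len bytes of s are exactly an entry of ROOTS. */
static int is_root(const char *s, size_t len) {
    for (size_t i = 0; i < COUNT(ROOTS); i++) {
        if (strlen(ROOTS[i]) == len && strncmp(ROOTS[i], s, len) == 0)
            return 1;
    }
    return 0;
}

/* Try every prefix/suffix pair in lexicon order and accept the first
 * split whose middle part is a known root. Returns 1 on success and
 * fills *out; returns 0 if no split is found. */
static int segment(const char *word, Segmentation *out) {
    assert(word != NULL && out != NULL);
    size_t wlen = strlen(word);

    for (size_t p = 0; p < COUNT(PREFIXES); p++) {
        size_t plen = strlen(PREFIXES[p]);
        if (plen > wlen || strncmp(word, PREFIXES[p], plen) != 0)
            continue;

        for (size_t s = 0; s < COUNT(SUFFIXES); s++) {
            size_t slen = strlen(SUFFIXES[s]);
            if (plen + slen >= wlen)
                continue;  /* the root must be non-empty */
            if (strcmp(word + wlen - slen, SUFFIXES[s]) != 0)
                continue;

            size_t rlen = wlen - plen - slen;
            if (rlen >= MAX_ROOT || !is_root(word + plen, rlen))
                continue;

            out->prefix = PREFIXES[p];
            memcpy(out->root, word + plen, rlen);
            out->root[rlen] = '\0';
            out->suffix = SUFFIXES[s];
            return 1;
        }
    }
    return 0;
}
```

Under these assumptions, `segment("chelera", &seg)` would yield the root `chele` with the suffix `ra`, while a word with no known root returns 0. A success rate like the one reported in the abstract could then be computed as the share of input words for which the function returns 1. Because the matching is byte-wise, the same logic should also apply to UTF-8 Bangla strings, provided the lexicon entries are stored in the same encoding and normalization form as the text.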