Morphological segmentation and analysis of Bangla text

This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal is to develop a morphological segmentation algorithm that works well for Bangla and to address the problem of unsupervised word segmentation across different languages. From a hand...

Bibliographic Details
Main Authors: Saha, G C, Saha, Hasi, Che Mat, Ruzinoor, Khan, Nur Hossain, Sarker, Bappa
Format: Article
Language: English
Published: Faculty of Computing, Universiti Teknologi Malaysia 2016
Subjects: QA75 Electronic computers. Computer science
Online Access: http://repo.uum.edu.my/21406/1/IJIDM%204%203%202016%20%2015%2020.pdf
http://repo.uum.edu.my/21406/
http://ijidm.org/wp-content/uploads/IJIDM-04-03-03.pdf
id my.uum.repo.21406
record_format eprints
spelling my.uum.repo.21406 2017-04-05T02:50:06Z Article PeerReviewed application/pdf en Saha, G C and Saha, Hasi and Che Mat, Ruzinoor and Khan, Nur Hossain and Sarker, Bappa (2016) Morphological segmentation and analysis of Bangla text. International Journal of Interactive Digital Media (IJIDM), 4 (3). pp. 15-20. ISSN 2289-4098
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
description This paper deals with lexicon and system development for word segmentation in the Bangla language. Our goal is to develop a morphological segmentation algorithm that works well for Bangla and to address the problem of unsupervised word segmentation across different languages. From a hand-corrected Bangla corpus, 5000 popular words were manually segmented into prefixes, roots and suffixes. These served as the seed lexicon for the next step. A system was developed in C to automate the segmentation process based on this handmade lexical database. The system was evaluated on several pages of Bangla text and achieved a success rate of about 83.05%. In our observation, the system would work with full success if the lexicon database were doubled in volume, and it may have a large impact, particularly in helping people learn and use Bangla, which would greatly enhance their socio-economic life.
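The lexicon-driven segmentation described above can be illustrated with a minimal sketch in C. This is not the authors' implementation: the record does not give their algorithm or data structures, so the lexicon entries, the `Segmentation` struct, and the `segment_word` function below are all hypothetical. The sketch assumes UTF-8 input and simply tries each known prefix and suffix (including the empty one) until the remaining middle part is a known root.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy lexicon (assumption): a few illustrative UTF-8 entries,
 * not the 5000-word database described in the record. */
static const char *const PREFIXES[] = { "অ", "উপ" };
static const char *const SUFFIXES[] = { "দের", "রা", "কে", "টি" };
static const char *const ROOTS[]    = { "ছেলে", "মানুষ", "সুখ", "বই" };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical result type: prefix/suffix point into the lexicon
 * (or to "" when absent); root is copied out of the input word. */
typedef struct {
    const char *prefix;
    const char *suffix;
    char root[64];
} Segmentation;

/* Returns 1 if the len-byte string at s is a known root. */
static int is_root(const char *s, size_t len) {
    for (size_t i = 0; i < COUNT(ROOTS); i++) {
        if (strlen(ROOTS[i]) == len && memcmp(ROOTS[i], s, len) == 0)
            return 1;
    }
    return 0;
}

/* Tries every prefix/suffix pair, with index COUNT(...) standing for
 * "no affix". Returns 1 and fills *out on the first split whose middle
 * part is a known root; returns 0 if no split is found. */
int segment_word(const char *word, Segmentation *out) {
    assert(word != NULL && out != NULL);
    size_t wlen = strlen(word);

    for (size_t p = 0; p <= COUNT(PREFIXES); p++) {
        const char *pre = (p < COUNT(PREFIXES)) ? PREFIXES[p] : "";
        size_t plen = strlen(pre);
        if (plen > wlen || memcmp(word, pre, plen) != 0)
            continue; /* word does not start with this prefix */

        for (size_t s = 0; s <= COUNT(SUFFIXES); s++) {
            const char *suf = (s < COUNT(SUFFIXES)) ? SUFFIXES[s] : "";
            size_t slen = strlen(suf);
            if (plen + slen >= wlen)
                continue; /* nothing would be left for the root */
            if (memcmp(word + wlen - slen, suf, slen) != 0)
                continue; /* word does not end with this suffix */

            size_t rlen = wlen - plen - slen;
            if (rlen >= sizeof out->root || !is_root(word + plen, rlen))
                continue; /* middle part is not a known root */

            out->prefix = pre;
            out->suffix = suf;
            memcpy(out->root, word + plen, rlen);
            out->root[rlen] = '\0';
            return 1;
        }
    }
    return 0;
}
```

With this toy lexicon, calling `segment_word("ছেলেরা", &seg)` splits the word into the root `ছেলে` and the suffix `রা`, while a word whose root is missing from the lexicon is reported as unsegmented. That failure mode is consistent with the abstract's remark that accuracy depends on the size of the lexicon database.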
format Article
author Saha, G C
Saha, Hasi
Che Mat, Ruzinoor
Khan, Nur Hossain
Sarker, Bappa
title Morphological segmentation and analysis of Bangla text
publisher Faculty of Computing, Universiti Teknologi Malaysia
publishDate 2016
url http://repo.uum.edu.my/21406/1/IJIDM%204%203%202016%20%2015%2020.pdf
http://repo.uum.edu.my/21406/
http://ijidm.org/wp-content/uploads/IJIDM-04-03-03.pdf