Component-based stemming engine for Malay text / Juhari ljam

Word stemming is an important feature supported by present-day indexing and search systems. The idea is to improve recall by automatically handling word endings, reducing words to their roots at indexing and search time. Various algorithms for stemming have been developed for the...

Bibliographic Details
Main Author: Juhari, ljam
Format: Thesis
Published: 2003
Subjects:
Online Access: http://studentsrepo.um.edu.my/8925/4/juhari.pdf
http://studentsrepo.um.edu.my/8925/
_version_ 1831435211788779520
author Juhari, ljam
building UM Library
collection Institutional Repository
content_provider Universiti Malaya
content_source UM Student Repository
continent Asia
country Malaysia
description Word stemming is an important feature supported by present-day indexing and search systems. The idea is to improve recall by automatically handling word endings, reducing words to their roots at indexing and search time. Various stemming algorithms have been developed for English and other foreign languages, but the area is still new for Malay text. However, most of the existing work has contributed little to development or applications, because it cannot be reused by other applications. These projects are studied, and a new algorithm is proposed to improve the performance of the stemming process. The most important aim of this project is to propose a new technology that takes a component-based approach. Many applications can be derived from such a component, since the main reason for using a component-based approach is reusability. Those who want to build a system related to IR or word stemming therefore do not need to build the stemming engine again; the developer simply uses the component engine and obtains the output. However, this project is proposed for a specific domain that covers generic Malay words.
format Thesis
id my.um.stud-8925
institution Universiti Malaya
publishDate 2003
record_format eprints
spelling my.um.stud-8925 2019-08-26T19:25:20Z Component-based stemming engine for Malay text / Juhari ljam Juhari, ljam QA76 Computer software T Technology (General) 2003 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/8925/4/juhari.pdf Juhari, ljam (2003) Component-based stemming engine for Malay text / Juhari ljam. Undergraduate thesis, University of Malaya. http://studentsrepo.um.edu.my/8925/
title Component-based stemming engine for Malay text / Juhari ljam
topic QA76 Computer software
T Technology (General)
url http://studentsrepo.um.edu.my/8925/4/juhari.pdf
http://studentsrepo.um.edu.my/8925/
url_provider http://studentsrepo.um.edu.my/
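
The description above presents the stemming engine as a reusable component that other IR applications call at indexing and search time. The following is a minimal sketch of that idea only, not the algorithm proposed in the thesis. The names `Stemmer`, `AffixStemmer`, and `Index`, the short affix lists, and the `min_root` threshold are all hypothetical and chosen for illustration; a real Malay stemmer would need a root-word dictionary and spelling-variation rules that this toy omits.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Set, Tuple


class Stemmer(Protocol):
    """Interface a stemming component exposes to host applications."""

    def stem(self, word: str) -> str: ...


@dataclass
class AffixStemmer:
    """Toy affix stripper. The affix lists are illustrative assumptions,
    not the rule set from the thesis."""

    prefixes: Tuple[str, ...] = ("meng", "mem", "men", "me", "ber", "di", "ter")
    suffixes: Tuple[str, ...] = ("kan", "an", "i")
    min_root: int = 4  # never strip an affix if fewer letters would remain

    def stem(self, word: str) -> str:
        root = word.lower()
        # Strip at most one prefix, trying longer prefixes first.
        for prefix in self.prefixes:
            if root.startswith(prefix) and len(root) - len(prefix) >= self.min_root:
                root = root[len(prefix):]
                break
        # Strip at most one suffix.
        for suffix in self.suffixes:
            if root.endswith(suffix) and len(root) - len(suffix) >= self.min_root:
                root = root[: -len(suffix)]
                break
        return root


class Index:
    """Tiny inverted index that depends only on the Stemmer interface,
    so any stemming component can be plugged in."""

    def __init__(self, stemmer: Stemmer) -> None:
        self.stemmer = stemmer
        self.postings: Dict[str, Set[int]] = {}

    def add(self, doc_id: int, text: str) -> None:
        # Stem every token at indexing time.
        for token in text.split():
            self.postings.setdefault(self.stemmer.stem(token), set()).add(doc_id)

    def search(self, query: str) -> Set[int]:
        # Stem the query the same way, so word variants match.
        return self.postings.get(self.stemmer.stem(query), set())


index = Index(AffixStemmer())
index.add(1, "dia membaca buku")
index.add(2, "bacaan ringan")
result = index.search("baca")  # both documents reduce to the root "baca"
```

Because `Index` depends only on the `stem` method, a different stemming component could replace `AffixStemmer` without changes to the indexing code, which is the reuse argument the description makes.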