A language identifier for Indonesian and Malay text document

The number of online text documents on the Internet is growing rapidly. Documents written in languages from all over the world can be found with a single click, and this growth increases the amount of information available online. In fact th...

Bibliographic Details
Main Authors: Indra, Z., Jaafar, J., Zamin, N., Bakar, Z.A.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2016
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84995701431&doi=10.1109%2fISMSC.2015.7594040&partnerID=40&md5=d9715785f362d63c5eefd4f58185acc8
http://eprints.utp.edu.my/30800/
id my.utp.eprints.30800
record_format eprints
spelling my.utp.eprints.30800 2022-03-25T07:33:30Z A language identifier for Indonesian and Malay text document Indra, Z. Jaafar, J. Zamin, N. Bakar, Z.A. Institute of Electrical and Electronics Engineers Inc. 2016 Conference or Workshop Item NonPeerReviewed https://www.scopus.com/inward/record.uri?eid=2-s2.0-84995701431&doi=10.1109%2fISMSC.2015.7594040&partnerID=40&md5=d9715785f362d63c5eefd4f58185acc8 Indra, Z. and Jaafar, J. and Zamin, N. and Bakar, Z.A. (2016) A language identifier for Indonesian and Malay text document. In: UNSPECIFIED. http://eprints.utp.edu.my/30800/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description The number of online text documents on the Internet is growing rapidly. Documents written in languages from all over the world can be found with a single click, and this growth increases the amount of information available online. However, no one can understand every language in which these digital documents are written. Hence, there is a significant need for a language identifier to help users understand the information. To date, language identification research has focused mainly on European languages and remains limited for Asian languages, although the identification of similar languages among widely used languages has attracted the attention of many researchers. In this research, a new language identification method is proposed for two languages with similar typology, Malay and Indonesian. The algorithm was tested on a set of Indonesian and Malay text documents, contributing to the limited body of research on language identification for Asian languages. An experiment on 100 Indonesian and Malay text documents produced satisfactorily accurate results. © 2015 IEEE.
format Conference or Workshop Item
author Indra, Z.
Jaafar, J.
Zamin, N.
Bakar, Z.A.
title A language identifier for Indonesian and Malay text document
publisher Institute of Electrical and Electronics Engineers Inc.
publishDate 2016
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-84995701431&doi=10.1109%2fISMSC.2015.7594040&partnerID=40&md5=d9715785f362d63c5eefd4f58185acc8
http://eprints.utp.edu.my/30800/