CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
Cross-lingual annotation projection methods can draw on rich-resourced languages to improve the performance of Natural Language Processing (NLP) tasks in less-resourced languages. In this research, Malay is treated as the less-resourced language and English as the rich-re...
Main Author: | ZAMIN, NORSHUHANI |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2014 |
Subjects: | Q Science (General) |
Online Access: | http://utpedia.utp.edu.my/21305/1/2014%20-COMPUTER%20%26%20INFORMATION%20SCIENCES%20-%20CROSS-LINGUAL%20ANNOTATION%20PROJECTION%20FOR%20THE%20DEVELOPMENT%20OF%20MALAY%20CORPOS%20-%20NORSHUHANI%20ZAMIN.pdf http://utpedia.utp.edu.my/21305/ |
id |
my-utp-utpedia.21305 |
---|---|
record_format |
eprints |
spelling |
my-utp-utpedia.21305 2021-09-16T22:05:45Z http://utpedia.utp.edu.my/21305/ CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS ZAMIN, NORSHUHANI Q Science (General) Cross-lingual annotation projection methods can draw on rich-resourced languages to improve the performance of Natural Language Processing (NLP) tasks in less-resourced languages. In this research, Malay is treated as the less-resourced language and English as the rich-resourced language. The research is intended to ease the deadlock in Malay computational linguistics research, caused by the shortage of Malay tools and annotated corpora, by exploiting state-of-the-art English tools. The aim is to investigate a suitable cross-lingual annotation projection method based on word alignment between two languages with syntactic differences. A word alignment method known as MEWA (Malay-English Word Aligner), which integrates a Dice Coefficient and a bigram string similarity measure with little supervision, is proposed. 2014-05 Thesis NonPeerReviewed application/pdf en http://utpedia.utp.edu.my/21305/1/2014%20-COMPUTER%20%26%20INFORMATION%20SCIENCES%20-%20CROSS-LINGUAL%20ANNOTATION%20PROJECTION%20FOR%20THE%20DEVELOPMENT%20OF%20MALAY%20CORPOS%20-%20NORSHUHANI%20ZAMIN.pdf ZAMIN, NORSHUHANI (2014) CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS. PhD thesis, Universiti Teknologi PETRONAS. |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Electronic and Digitized Intellectual Asset |
url_provider |
http://utpedia.utp.edu.my/ |
language |
English |
topic |
Q Science (General) |
spellingShingle |
Q Science (General) ZAMIN, NORSHUHANI CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS |
description |
Cross-lingual annotation projection methods can draw on rich-resourced
languages to improve the performance of Natural Language Processing (NLP) tasks in
less-resourced languages. In this research, Malay is treated as the less-resourced
language and English as the rich-resourced language. The research is intended
to ease the deadlock in Malay computational linguistics research, caused by the
shortage of Malay tools and annotated corpora, by exploiting state-of-the-art
English tools. The aim is to investigate a suitable cross-lingual annotation
projection method based on word alignment between two languages with
syntactic differences. A word alignment method known as MEWA (Malay-English
Word Aligner), which integrates a Dice Coefficient and a bigram string similarity
measure with little supervision, is proposed. |
format |
Thesis |
author |
ZAMIN, NORSHUHANI |
author_facet |
ZAMIN, NORSHUHANI |
author_sort |
ZAMIN, NORSHUHANI |
title |
CROSS-LINGUAL ANNOTATION PROJECTION
FOR THE DEVELOPMENT OF MALAY CORPUS |
title_short |
CROSS-LINGUAL ANNOTATION PROJECTION
FOR THE DEVELOPMENT OF MALAY CORPUS |
title_full |
CROSS-LINGUAL ANNOTATION PROJECTION
FOR THE DEVELOPMENT OF MALAY CORPUS |
title_fullStr |
CROSS-LINGUAL ANNOTATION PROJECTION
FOR THE DEVELOPMENT OF MALAY CORPUS |
title_full_unstemmed |
CROSS-LINGUAL ANNOTATION PROJECTION
FOR THE DEVELOPMENT OF MALAY CORPUS |
title_sort |
cross-lingual annotation projection
for the development of malay corpus |
publishDate |
2014 |
url |
http://utpedia.utp.edu.my/21305/1/2014%20-COMPUTER%20%26%20INFORMATION%20SCIENCES%20-%20CROSS-LINGUAL%20ANNOTATION%20PROJECTION%20FOR%20THE%20DEVELOPMENT%20OF%20MALAY%20CORPOS%20-%20NORSHUHANI%20ZAMIN.pdf http://utpedia.utp.edu.my/21305/ |
_version_ |
1739832856329322496 |
score |
13.211869 |
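
The description above names two ingredients of the proposed aligner: a Dice Coefficient and a bigram string similarity measure. The record does not say how the thesis computes or combines them, so the following is only a minimal sketch of the two standard measures under stated assumptions. The function names, the toy sentence pairs, and the `alpha` weighting in `combined_score` are all hypothetical illustrations, not the MEWA method itself.

```python
from collections import Counter


def char_bigrams(word):
    """Return the multiset of adjacent character pairs in a lowercased word."""
    w = word.lower()
    return Counter(w[i:i + 2] for i in range(len(w) - 1))


def bigram_similarity(a, b):
    """Dice-style string similarity over character bigrams, in [0, 1]."""
    ba, bb = char_bigrams(a), char_bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    shared = sum((ba & bb).values())  # multiset intersection
    return 2.0 * shared / total


def dice_cooccurrence(src, tgt, sentence_pairs):
    """Dice coefficient 2*C(s,t) / (C(s) + C(t)) from sentence-level counts."""
    n_src = n_tgt = n_both = 0
    for src_tokens, tgt_tokens in sentence_pairs:
        in_src = src in src_tokens
        in_tgt = tgt in tgt_tokens
        n_src += in_src
        n_tgt += in_tgt
        n_both += in_src and in_tgt
    if n_src + n_tgt == 0:
        return 0.0
    return 2.0 * n_both / (n_src + n_tgt)


def combined_score(src, tgt, sentence_pairs, alpha=0.5):
    """Hypothetical linear mix of the two measures (alpha is an assumption)."""
    return (alpha * dice_cooccurrence(src, tgt, sentence_pairs)
            + (1.0 - alpha) * bigram_similarity(src, tgt))


# Toy Malay-English sentence pairs, invented purely for illustration.
pairs = [
    (["saya", "makan", "nasi"], ["i", "eat", "rice"]),
    (["saya", "minum", "kopi"], ["i", "drink", "coffee"]),
    (["dia", "minum", "teh"], ["he", "drinks", "tea"]),
]

print(bigram_similarity("komputer", "computer"))
print(dice_cooccurrence("minum", "drink", pairs))
print(combined_score("kopi", "coffee", pairs))
```

In this sketch the co-occurrence Dice score rewards word pairs that tend to appear in the same aligned sentences, while the bigram measure rewards orthographically similar pairs such as loanwords. Mixing them linearly is one simple design choice among many; the thesis may combine them differently.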