CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS

Cross-lingual annotation projection methods can benefit from rich-resourced languages to improve the performance of Natural Language Processing (NLP) tasks in less-resourced languages. In this research, Malay serves as the less-resourced language and English as the rich-re...

Bibliographic Details
Main Author: ZAMIN, NORSHUHANI
Format: Thesis
Language: English
Published: 2014
Subjects: Q Science (General)
Online Access: http://utpedia.utp.edu.my/21305/1/2014%20-COMPUTER%20%26%20INFORMATION%20SCIENCES%20-%20CROSS-LINGUAL%20ANNOTATION%20PROJECTION%20FOR%20THE%20DEVELOPMENT%20OF%20MALAY%20CORPOS%20-%20NORSHUHANI%20ZAMIN.pdf
http://utpedia.utp.edu.my/21305/
id my-utp-utpedia.21305
record_format eprints
spelling my-utp-utpedia.21305 2021-09-16T22:05:45Z http://utpedia.utp.edu.my/21305/ CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS ZAMIN, NORSHUHANI Q Science (General) Cross-lingual annotation projection methods can benefit from rich-resourced languages to improve the performance of Natural Language Processing (NLP) tasks in less-resourced languages. In this research, Malay serves as the less-resourced language and English as the rich-resourced language. The research is intended to ease the deadlock in Malay computational linguistics research, caused by the shortage of Malay tools and annotated corpora, by exploiting state-of-the-art English tools. The aim is to investigate a suitable cross-lingual annotation projection based on word alignment between two languages with syntactic differences. A word alignment method known as MEWA (Malay-English Word Aligner), which integrates a Dice coefficient and a bigram string similarity measure with little supervision, is proposed. 2014-05 Thesis NonPeerReviewed application/pdf en http://utpedia.utp.edu.my/21305/1/2014%20-COMPUTER%20%26%20INFORMATION%20SCIENCES%20-%20CROSS-LINGUAL%20ANNOTATION%20PROJECTION%20FOR%20THE%20DEVELOPMENT%20OF%20MALAY%20CORPOS%20-%20NORSHUHANI%20ZAMIN.pdf ZAMIN, NORSHUHANI (2014) CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS. PhD thesis, Universiti Teknologi PETRONAS.
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Electronic and Digitized Intellectual Asset
url_provider http://utpedia.utp.edu.my/
language English
topic Q Science (General)
spellingShingle Q Science (General)
ZAMIN, NORSHUHANI
CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
description Cross-lingual annotation projection methods can benefit from rich-resourced languages to improve the performance of Natural Language Processing (NLP) tasks in less-resourced languages. In this research, Malay serves as the less-resourced language and English as the rich-resourced language. The research is intended to ease the deadlock in Malay computational linguistics research, caused by the shortage of Malay tools and annotated corpora, by exploiting state-of-the-art English tools. The aim is to investigate a suitable cross-lingual annotation projection based on word alignment between two languages with syntactic differences. A word alignment method known as MEWA (Malay-English Word Aligner), which integrates a Dice coefficient and a bigram string similarity measure with little supervision, is proposed.
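The description names two ingredients for word alignment: a Dice coefficient and a bigram string similarity measure. The sketch below illustrates how such scores could be computed and combined on a toy parallel corpus. It is not the thesis's implementation: the function names, the sentence-level co-occurrence counting, the equal weighting of the two scores, the greedy linking step, and the toy sentences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def char_bigrams(word):
    """Return the multiset of character bigrams in a lowercased word."""
    w = word.lower()
    return Counter(w[i:i + 2] for i in range(len(w) - 1))


def bigram_similarity(a, b):
    """Dice-style overlap of character bigrams, in [0, 1]."""
    ba, bb = char_bigrams(a), char_bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    overlap = sum((ba & bb).values())  # multiset intersection
    return 2.0 * overlap / total


def dice_table(sentence_pairs):
    """Dice coefficient 2*C(m, e) / (C(m) + C(e)), counting each word
    type once per sentence pair (an assumed counting scheme)."""
    malay_count, english_count, joint = Counter(), Counter(), Counter()
    for malay, english in sentence_pairs:
        m_types, e_types = set(malay), set(english)
        malay_count.update(m_types)
        english_count.update(e_types)
        joint.update(product(m_types, e_types))
    return {
        (m, e): 2.0 * c / (malay_count[m] + english_count[e])
        for (m, e), c in joint.items()
    }


def align(malay, english, dice, weight=0.5):
    """Greedily link each Malay token to its best-scoring English token.
    The weighted sum of the two scores is a hypothetical combination."""
    links = []
    for m in malay:
        best = max(
            english,
            key=lambda e: weight * dice.get((m, e), 0.0)
            + (1 - weight) * bigram_similarity(m, e),
        )
        links.append((m, best))
    return links


# Toy, hand-made parallel corpus (hypothetical data).
pairs = [
    (["saya", "makan", "nasi"], ["i", "eat", "rice"]),
    (["saya", "minum", "kopi"], ["i", "drink", "coffee"]),
    (["kami", "makan", "roti"], ["we", "eat", "bread"]),
]
dice = dice_table(pairs)
links = align(pairs[0][0], pairs[0][1], dice)
```

On this toy corpus the co-occurrence score does most of the work, while the bigram similarity mainly helps with cognates and loanwords such as "komputer" and "computer".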
format Thesis
author ZAMIN, NORSHUHANI
author_facet ZAMIN, NORSHUHANI
author_sort ZAMIN, NORSHUHANI
title CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
title_short CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
title_full CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
title_fullStr CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
title_full_unstemmed CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS
title_sort cross-lingual annotation projection for the development of malay corpus
publishDate 2014
url http://utpedia.utp.edu.my/21305/1/2014%20-COMPUTER%20%26%20INFORMATION%20SCIENCES%20-%20CROSS-LINGUAL%20ANNOTATION%20PROJECTION%20FOR%20THE%20DEVELOPMENT%20OF%20MALAY%20CORPOS%20-%20NORSHUHANI%20ZAMIN.pdf
http://utpedia.utp.edu.my/21305/