NUWT: Jawi-specific Buckwalter corpus for Malay word tokenization
This paper describes the design and creation of a monolingual parallel corpus for the Malay language written in Jawi. It proposes a new corpus called the National University of Malaysia Word Tokenization (NUWT) corpus. To the best of our knowledge, currently, there is no sufficiently comprehe...
Format: | Article |
---|---|
Language: | English |
Published: | Universiti Utara Malaysia, 2016 |
Online Access: | http://repo.uum.edu.my/18485/1/JICT%2015%20%201%202016%20%20107%E2%80%93131.pdf http://repo.uum.edu.my/18485/ http://www.jict.uum.edu.my/images/pdf3/vol15no1/51jict1512016.pdf |