A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution
Low-resource languages, like Malay, face the threat of extinction when linguistic resources become scarce. This paper addresses the scarcity issue by contributing to the inventory of low-resource languages, specifically focusing on Malay-English, known as Manglish. Manglish speakers are primarily...
Main Authors: | Maskat, Ruhaila, Azman, Norazmiera Ayunie, Nulizairos, Nur Shaheera Shastera, Zahidin, Nurul Athirah, Mahadi, Adibah Humairah, Norshamsul, Siti Rubaya, Mohd Sharif, Mohd Mukhlis, Mahdin, Hairulnizam |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2024 |
Online Access: | http://eprints.uthm.edu.my/10920/1/J17377_a3b15f369ba6e61ca5517eaf40899173.pdf http://eprints.uthm.edu.my/10920/ https://doi.org/10.1016/j.dib.2024.110034 |
Similar Items
-
A comparison of Manglish and Singlish lexis in blogs / Nadhiya binti Norizam
by: Norizam, Nadhiya
Published: (2014) -
A Survey on Forms of Visualization and Tools Used in Topic Modelling
by: Maskat, Ruhaila, et al.
Published: (2023) -
Research and authorship responsibilities
by: Looi, L.M.
Published: (2012) -
An annotated image dataset for training mosquito species recognition system on human skin
by: Ong, Song Quan, et al.
Published: (2022)