Data augmentation approach for language identification in imbalanced bilingual code-mixed social media datasets
Addressing the problem of language identification in code-mixed datasets poses notable challenges due to data scarcity and high confusability in bilingual contexts. These challenges are further amplified by the associated imbalance and noise characteristic of social media data, complicating efforts...
Main Authors: | Mohd Suhairi, Md Suhaimin; Mohd Hanafi, Ahmad Hijazi; Moung, Ervin Gubin; Mohd Azwan, Mohamad Hamza |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc. 2023 |
Online Access: | http://umpir.ump.edu.my/id/eprint/40378/1/Data%20augmentation%20approach%20for%20language%20identification.pdf http://umpir.ump.edu.my/id/eprint/40378/2/Data%20augmentation%20approach%20for%20language%20identification%20in%20imbalanced%20bilingual%20code-mixed%20social%20media%20datasets_ABS.pdf http://umpir.ump.edu.my/id/eprint/40378/ https://doi.org/10.1109/IICAIET59451.2023.10292108 |
Similar Items
- Social Media Sentiment Analysis and Opinion Mining in Public Security: Taxonomy, Trend Analysis, Issues and Future Directions
  by: Mohd Suhairi, Md Suhaimin, et al.
  Published: (2023)
- Evolutionary deep belief networks with bootstrap sampling for imbalanced class datasets
  by: Amri, A’inur A’fifah, et al.
  Published: (2019)
- Social media sentiment analysis and opinion mining in public security: Taxonomy, trend analysis, issues and future directions
  by: Mohd Suhairi Md Suhaimin, et al.
  Published: (2023)
- An Improved Pheromone-Based Kohonen Self-Organising Map in Clustering and Visualising Balanced and Imbalanced Datasets
  by: Ahmad, Azlin, et al.
  Published: (2021)
- Learning with imbalanced datasets using fuzzy ARTMAP-based neural network models
  by: Tan, S. C., et al.
  Published: (2011)