Sentiment analysis and visualization of PADU from Malaysian X users using BERT
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | en |
| Published: | College of Computing, Informatics, and Mathematics, 2025 |
| Subjects: | |
| Online Access: | https://ir.uitm.edu.my/id/eprint/127573/1/127573.pdf https://ir.uitm.edu.my/id/eprint/127573/ https://fskmjebat.uitm.edu.my/pcmj/ |
| Summary: | On the surface, the introduction of PADU may be met with varying degrees of acceptance among Malaysians, but gauging the actual sentiment without bias is hard. Sentiment analysis of a given topic, which in this study is PADU, is a complex task that involves scraping datasets and classifying them accurately; doing this manually would inevitably introduce bias into the results. This project addresses the problem by developing a sentiment analysis model and visualising the data and results. The dataset was scraped from X using Tweet Harvest and consists of 88 data points, which were augmented to 440 data points. The model is built on Bidirectional Encoder Representations from Transformers (BERT) and trained on the gathered dataset. Development followed the waterfall software development methodology, and the model is released on a web platform. The model trained on the combined collected and augmented datasets achieved 87% accuracy, 87% precision, 87% recall, and an F1-score of 87%, compared with the model trained on the collected dataset alone. Future improvements will take the form of broader language support for the model and the collection of data from a wider variety of social media. |
|---|
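
The summary reports accuracy, precision, recall and F1-score for the trained classifier. As a rough illustration of how these four figures relate to one another, the sketch below computes them from a list of true and predicted labels. It is not the authors' evaluation code: the record does not say how the per-class scores were averaged, so macro-averaging is an assumption here, and the function name `classification_metrics`, the three label names and the toy data are all hypothetical.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro-averaging is an assumption; the record does not state
    which averaging scheme the study used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives and false negatives per label.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only; these are not data from the study.
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "neutral", "negative", "negative", "positive"]
metrics = classification_metrics(y_true, y_pred)
```

With a balanced test set and similar per-class performance, the four values tend to land close together, which is consistent with the identical 87% figures reported in the summary.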
