Adoption of machine learning algorithm for analysing supporters and non supporters feedback on political posts / Ogunfolajin Maruff Tunde
Main Author: | Ogunfolajin Maruff Tunde |
---|---|
Format: | Thesis |
Published: | 2022 |
Subjects: | |
Online Access: | http://studentsrepo.um.edu.my/15304/1/Ogunfolajin.pdf http://studentsrepo.um.edu.my/15304/2/Ogunfolajin.pdf http://studentsrepo.um.edu.my/15304/ |
Summary: | Sentiment analysis is the field concerned with identifying and extracting sentiment (or opinion) from data, particularly textual data. Studies have shown that user perception can strongly influence policies and decision-making processes in a place, society, or nation. This thesis applies sentiment classification algorithms to tweet data, with the goal of classifying messages by the polarity of their sentiment towards a particular topic (or subject matter). Political analysts often communicate with the public and exchange information through social media platforms. Their activities (otherwise termed cyber-trooping) can draw positive, negative, or neutral feedback (perceptions) in the public space. Thus, there is a need to automate the process of identifying these cyber-trooping data and predicting their class (positive, negative, or neutral). This work employs a machine learning approach. Four conventional classification algorithms, naïve Bayes (NB), support vector machines (SVM), k-nearest neighbor (k-NN), and decision trees (J48), are implemented to identify and categorize the tweet data of three political figures in Malaysia (Dato Seri Anwar, Dato Hadi Awang, and Lim Guang Eng) as positive, negative, or neutral perceptions. The method was implemented in Java, and the simulation results were evaluated using five standard performance metrics: accuracy, AUC, precision, recall, and F-measure. The SVM algorithm obtained the best overall results, with 94.5% accuracy, 91.8% precision, 91.7% recall, and 91.1% F-measure, while the NB algorithm obtained the best AUC score of 0.944 with the tweet data of Dato Seri Anwar. |
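
The summary names accuracy, precision, recall, and F-measure as evaluation metrics for a three-class (positive, negative, neutral) classifier. As a rough illustration of how such figures can be derived from a confusion matrix, here is a minimal Java sketch. It is not the thesis code: the class name `SentimentMetrics`, the row/column convention (rows are actual classes, columns are predicted classes), the macro-averaging of the F-measure, and the counts in `main` are all assumptions made for this example. AUC is left out because it needs per-instance classifier scores rather than a confusion matrix.

```java
import java.util.Locale;

/** Illustrative metrics for a multi-class sentiment classifier (hypothetical helper). */
final class SentimentMetrics {

    /** Accuracy: correctly classified instances (the diagonal) over all instances. */
    static double accuracy(int[][] cm) {
        long correct = 0;
        long total = 0;
        for (int i = 0; i < cm.length; i++) {
            for (int j = 0; j < cm[i].length; j++) {
                total += cm[i][j];
                if (i == j) {
                    correct += cm[i][j];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    /** Precision for class k: true positives over everything predicted as k (column sum). */
    static double precision(int[][] cm, int k) {
        long predicted = 0;
        for (int[] row : cm) {
            predicted += row[k];
        }
        return predicted == 0 ? 0.0 : (double) cm[k][k] / predicted;
    }

    /** Recall for class k: true positives over everything that actually is k (row sum). */
    static double recall(int[][] cm, int k) {
        long actual = 0;
        for (int count : cm[k]) {
            actual += count;
        }
        return actual == 0 ? 0.0 : (double) cm[k][k] / actual;
    }

    /** F-measure for class k: harmonic mean of precision and recall. */
    static double fMeasure(int[][] cm, int k) {
        double p = precision(cm, k);
        double r = recall(cm, k);
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }

    /** Macro-averaged F-measure: unweighted mean over classes (an assumed averaging scheme). */
    static double macroFMeasure(int[][] cm) {
        double sum = 0.0;
        for (int k = 0; k < cm.length; k++) {
            sum += fMeasure(cm, k);
        }
        return cm.length == 0 ? 0.0 : sum / cm.length;
    }

    public static void main(String[] args) {
        // Made-up counts for illustration only; these are NOT results from the thesis.
        // Class order (rows and columns): positive, negative, neutral.
        int[][] cm = {
            {8, 1, 1},
            {2, 6, 2},
            {0, 1, 9}
        };
        String[] labels = {"positive", "negative", "neutral"};
        System.out.printf(Locale.ROOT, "accuracy = %.3f%n", accuracy(cm));
        for (int k = 0; k < labels.length; k++) {
            System.out.printf(Locale.ROOT, "%s: precision = %.3f, recall = %.3f, F = %.3f%n",
                    labels[k], precision(cm, k), recall(cm, k), fMeasure(cm, k));
        }
        System.out.printf(Locale.ROOT, "macro F-measure = %.3f%n", macroFMeasure(cm));
    }
}
```

Per-class figures can also be combined with class-frequency weights instead of a plain mean. Which scheme the thesis used is not stated in the summary, so the macro average here is only one plausible choice.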