Sarcasm detection and classification to support sentiment analysis : A study in Malay social media
The classification of users' sentiment from social media data can be used to determine public opinion on certain issues. The presence of sarcasm in text may hamper the performance of sentiment analysis. This thesis presents research work conducted on sarcasm detection and classification to supp...
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: |
2017
|
Subjects: | |
Online Access: | https://eprints.ums.edu.my/id/eprint/37665/1/24%20PAGES.pdf https://eprints.ums.edu.my/id/eprint/37665/2/FULLTEXT.pdf https://eprints.ums.edu.my/id/eprint/37665/ |
id |
my.ums.eprints.37665 |
---|---|
record_format |
eprints |
spelling |
my.ums.eprints.37665 2023-11-29T02:00:37Z https://eprints.ums.edu.my/id/eprint/37665/ Sarcasm detection and classification to support sentiment analysis : A study in Malay social media Mohd Suhairi Md Suhaimin HM711-806 Groups and organizations The classification of users' sentiment from social media data can be used to determine public opinion on certain issues. The presence of sarcasm in text may hamper the performance of sentiment analysis. This thesis presents research work conducted on sarcasm detection and classification to support sentiment analysis. A Malay social media dataset, focused specifically on the economic and political domains, was acquired from public comments posted on Facebook. The proposed work consists of two phases: (i) sarcasm detection and (ii) sentiment analysis with sarcasm detection and classification. In the first phase, the development of a mechanism for detecting sarcasm on bilingual data was explored. To achieve this, a feature extraction process was proposed to identify sarcasm features. Five feature categories that can be extracted using natural language processing were considered: lexical, pragmatic, prosodic, syntactic and idiosyncratic. A non-linear Support Vector Machine classifier was employed to measure the performance of the features using the adopted evaluation metric, average F-measure. The best-performing features were then used as input for the second phase. In the second phase, a framework for sentiment analysis that considers sarcasm detection and classification was proposed. The framework consists of six modules, namely preprocessing, feature extraction, feature selection, sentiment classification, sarcasm detection and classification, and actual sentiment classification. The evaluation results demonstrate that the proposed features and framework improve the performance of sentiment analysis. The best performance for sarcasm detection was found using a combination of syntactic, pragmatic, and prosodic features, with an average F-measure score of 0.852. The best result of sentiment classification using the proposed framework, considering both sarcasm detection and classification, recorded an average F-measure score of 0.905, outperforming the baseline sentiment classification score of 0.839. 2017 Thesis NonPeerReviewed text en https://eprints.ums.edu.my/id/eprint/37665/1/24%20PAGES.pdf text en https://eprints.ums.edu.my/id/eprint/37665/2/FULLTEXT.pdf Mohd Suhairi Md Suhaimin (2017) Sarcasm detection and classification to support sentiment analysis : A study in Malay social media. Masters thesis, Universiti Malaysia Sabah. |
institution |
Universiti Malaysia Sabah |
building |
UMS Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Sabah |
content_source |
UMS Institutional Repository |
url_provider |
http://eprints.ums.edu.my/ |
language |
English |
topic |
HM711-806 Groups and organizations |
spellingShingle |
HM711-806 Groups and organizations Mohd Suhairi Md Suhaimin Sarcasm detection and classification to support sentiment analysis : A study in Malay social media |
description |
The classification of users' sentiment from social media data can be used to determine public opinion on certain issues. The presence of sarcasm in text may hamper the performance of sentiment analysis. This thesis presents research work conducted on sarcasm detection and classification to support sentiment analysis. A Malay social media dataset, focused specifically on the economic and political domains, was acquired from public comments posted on Facebook. The proposed work consists of two phases: (i) sarcasm detection and (ii) sentiment analysis with sarcasm detection and classification. In the first phase, the development of a mechanism for detecting sarcasm on bilingual data was explored. To achieve this, a feature extraction process was proposed to identify sarcasm features. Five feature categories that can be extracted using natural language processing were considered: lexical, pragmatic, prosodic, syntactic and idiosyncratic. A non-linear Support Vector Machine classifier was employed to measure the performance of the features using the adopted evaluation metric, average F-measure. The best-performing features were then used as input for the second phase. In the second phase, a framework for sentiment analysis that considers sarcasm detection and classification was proposed. The framework consists of six modules, namely preprocessing, feature extraction, feature selection, sentiment classification, sarcasm detection and classification, and actual sentiment classification. The evaluation results demonstrate that the proposed features and framework improve the performance of sentiment analysis. The best performance for sarcasm detection was found using a combination of syntactic, pragmatic, and prosodic features, with an average F-measure score of 0.852. The best result of sentiment classification using the proposed framework, considering both sarcasm detection and classification, recorded an average F-measure score of 0.905, outperforming the baseline sentiment classification score of 0.839. |
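The description above reports its results as an average F-measure. As a rough illustration only, the sketch below computes a per-class F-measure and its unweighted mean over classes. The averaging scheme is an assumption: the record does not say whether the thesis uses an unweighted or a class-weighted average. The function names and the toy labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def f_measure_per_class(y_true, y_pred):
    """Return {label: F-measure} from paired gold and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1  # correct prediction for this class
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gold] += 1  # gold class gains a false negative
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


def average_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class F-measures (assumed averaging)."""
    scores = f_measure_per_class(y_true, y_pred)
    if not scores:
        raise ValueError("no labels to score")
    return sum(scores.values()) / len(scores)


# Toy labels, invented for illustration only.
gold = ["sarcastic", "sarcastic", "not_sarcastic", "not_sarcastic"]
pred = ["sarcastic", "not_sarcastic", "not_sarcastic", "not_sarcastic"]
per_class = f_measure_per_class(gold, pred)
average = average_f_measure(gold, pred)
```

A class-weighted average would instead multiply each per-class score by that class's share of the gold labels, which can give a noticeably different figure on imbalanced data.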
format |
Thesis |
author |
Mohd Suhairi Md Suhaimin |
title |
Sarcasm detection and classification to support sentiment analysis : A study in Malay social media |
publishDate |
2017 |
url |
https://eprints.ums.edu.my/id/eprint/37665/1/24%20PAGES.pdf https://eprints.ums.edu.my/id/eprint/37665/2/FULLTEXT.pdf https://eprints.ums.edu.my/id/eprint/37665/ |