Sarcasm detection in Persian

Bibliographic Details
Main Authors: Nezhad, Zahra Bokaee; Deihimi, Mohammad Ali
Format: Article
Language: English
Published: Universiti Utara Malaysia 2021
Online Access: http://repo.uum.edu.my/28124/1/JICT%2020%201%202021%201-20.pdf
http://repo.uum.edu.my/28124/
http://jict.uum.edu.my/index.php/currentissue
Description
Summary: Sarcasm is a form of communication where the individual states the opposite of what is implied. Therefore, detecting a sarcastic tone is somewhat complicated due to its ambiguous nature. On the other hand, identification of sarcasm is vital to various natural language processing tasks such as sentiment analysis and text summarization. However, research on sarcasm detection in Persian is very limited. This paper investigated sarcasm detection on Persian tweets by combining deep learning-based and machine learning-based approaches. Four sets of features that cover different types of sarcasm were proposed, namely deep polarity, sentiment, part of speech, and punctuation features. These features were used to classify the tweets as sarcastic or non-sarcastic. In this study, the deep polarity feature was obtained by conducting a sentiment analysis using a deep neural network architecture. In addition, to extract the sentiment feature, a Persian sentiment dictionary was developed, which consisted of four sentiment categories. The study also used a new Persian proverb dictionary in the preparation step to enhance the accuracy of the proposed model. The performance of the model was analysed using several standard machine learning algorithms. The results of the experiment showed that the method outperformed the baseline method and reached an accuracy of 80.82%. The study also examined the importance of each proposed feature set and evaluated its added value to the classification.
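As an illustration only, two of the four feature sets named in the summary (punctuation and dictionary-based sentiment) could be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the summary does not name the four sentiment categories, list any dictionary entries, or say which punctuation marks were counted, so `TOY_LEXICON`, `CATEGORIES`, `PUNCTUATION`, and all function names are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical toy lexicon. The summary only says the real dictionary has
# four sentiment categories; these names and entries are placeholders.
TOY_LEXICON = {
    "عالی": "strong_positive",  # "excellent"
    "خوب": "positive",          # "good"
    "بد": "negative",           # "bad"
    "افتضاح": "strong_negative",  # "terrible"
}
CATEGORIES = ["strong_positive", "positive", "negative", "strong_negative"]

# Assumed set of marks to count, including the Persian question mark.
PUNCTUATION = ["!", "?", "؟", "."]


def punctuation_features(text):
    """Count how often each mark in PUNCTUATION occurs in the text."""
    return [text.count(mark) for mark in PUNCTUATION]


def sentiment_features(text, lexicon=TOY_LEXICON):
    """Count lexicon hits per sentiment category, in CATEGORIES order."""
    # Whitespace tokenization with punctuation stripped from token edges;
    # a real system would likely use a proper Persian tokenizer.
    tokens = [tok.strip("!?؟.،,") for tok in text.split()]
    counts = Counter(lexicon[tok] for tok in tokens if tok in lexicon)
    return [counts[cat] for cat in CATEGORIES]


def feature_vector(text):
    """Concatenate both feature sets into one flat list of counts."""
    return punctuation_features(text) + sentiment_features(text)
```

A vector built this way could be passed, together with the deep polarity and part-of-speech features, to any standard classifier. How the paper actually combines and weights the feature sets is not stated in the summary.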