Enhanced aspect level opinion mining knowledge extraction and representation
Main Author: | |
---|---|
Format: | Thesis |
Published: | 2015 |
Subjects: | |
Online Access: | http://eprints.utm.my/id/eprint/54773/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:94313 |
Summary: | More effective techniques are needed to extract, classify, represent and summarize customers’ online opinions on products and services for better sentiment analysis. The aim of this thesis is to enhance aspect-level opinion extraction and representation. The study uses SentiWordNet, a lexical resource built specifically for opinion mining and widely used in sentiment analysis. It introduces an approach based on adjectives, verbs, adverbs and nouns (AVAN) that analyzes all opinion word types for sentiment analysis, rather than only the adjectives and adverbs conventionally used. SentiWordNet is used to identify and analyze all word types for opinion extraction and representation. Opinion representation is enhanced by capturing the key elements of opinions in predicates that consist of opinion word, strength, score and category, which improves opinion representation and classification. The thesis further enhances the mining by introducing opinion accounting, which summarizes opinion scores at various group levels. In addition, it introduces a new concept called opinion strength, which classifies opinions into degrees; an enhanced score is assigned to each opinion based on the strength with which it is expressed. Furthermore, because opinions are fuzzy in nature, the study shows that fuzzy logic is an effective technique for addressing opinion vagueness, since human-like logic is fuzzy. This is important because opinions should not be categorized only as classical Boolean sentiments. The study identifies SentiWordNet, AVAN, opinion strength and fuzzy logic as classification features for classifying customer reviews with a 5-class prediction model (Excellent, Good, Fair, Poor and Very Poor). The results show an accuracy of 92% using the Sequential Minimal Optimization classifier with these features, outperforming previous works that implemented Support Vector Machine and Logistic Regression. Moreover, the combination of AVAN, opinion strength and fuzzy logic outperformed SentiWordNet alone by 30% in accuracy. |
---|
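
The summary names several building blocks: opinion predicates (word, strength, score, category), opinion strength, fuzzy categories and opinion accounting. The Python sketch below is one possible reading of how these pieces might fit together; it is not the thesis's implementation. The toy lexicon, the intensifier weights, the triangular membership functions and every function name are assumptions made for illustration, and a real system would take its word scores from SentiWordNet.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical toy lexicon: word -> (word type, polarity score in [-1, 1]).
# These values are invented for illustration; they are not SentiWordNet scores.
TOY_LEXICON = {
    "excellent": ("adjective", 0.9),
    "good": ("adjective", 0.6),
    "love": ("verb", 0.6),
    "poorly": ("adverb", -0.6),
    "problem": ("noun", -0.5),
}

# Hypothetical modifiers standing in for "opinion strength".
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}

# Assumed triangular fuzzy sets (left, peak, right) over the score range,
# one per class of the 5-class model named in the summary.
FUZZY_SETS = {
    "Very Poor": (-1.5, -1.0, -0.5),
    "Poor": (-1.0, -0.5, 0.0),
    "Fair": (-0.5, 0.0, 0.5),
    "Good": (0.0, 0.5, 1.0),
    "Excellent": (0.5, 1.0, 1.5),
}


@dataclass
class OpinionPredicate:
    """Key elements of one opinion: word, strength, score and category."""
    aspect: str
    word: str
    strength: float
    score: float
    category: str


def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_category(score):
    """Return the label whose fuzzy set gives the score the highest membership."""
    memberships = {
        label: triangular(score, *bounds) for label, bounds in FUZZY_SETS.items()
    }
    return max(memberships, key=memberships.get)


def make_predicate(aspect, word, modifier=None):
    """Build a predicate for an opinion word of any AVAN type."""
    _word_type, base_score = TOY_LEXICON[word]
    strength = INTENSIFIERS.get(modifier, 1.0)
    # Scale the base score by the strength, then clamp it to [-1, 1].
    score = max(-1.0, min(1.0, base_score * strength))
    return OpinionPredicate(aspect, word, strength, score, fuzzy_category(score))


def opinion_accounting(predicates):
    """Summarize opinion scores per aspect (here: the mean score)."""
    grouped = defaultdict(list)
    for predicate in predicates:
        grouped[predicate.aspect].append(predicate.score)
    return {aspect: sum(scores) / len(scores) for aspect, scores in grouped.items()}


if __name__ == "__main__":
    opinions = [
        make_predicate("screen", "good", "very"),
        make_predicate("battery", "problem"),
        make_predicate("battery", "good"),
    ]
    for opinion in opinions:
        print(opinion)
    print(opinion_accounting(opinions))
```

Grouping by aspect is only one choice of "group level"; the same accounting function could group by product or reviewer instead. The resulting categories and scores would then serve as input features for a classifier, a step this sketch leaves out.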