NgramPOS: a bigram-based linguistic and statistical feature process model for unstructured text classification

Research in the financial domain has shown that sentiment aspects of stock news have a profound impact on trading volume, volatility, stock prices and firm earnings. In-depth analysis of stock news is now sourced from financial reviews by various social networking and marketing sites to help improve deci...

Bibliographic Details
Main Authors: Yazdani, Sepideh Foroozan, Tan, Zhiyuan, Kakavand, Mohsen, Mustapha, Aida
Format: Article
Language: en
Published: Springer 2018
Subjects:
Online Access: http://eprints.uthm.edu.my/5136/1/AJ%202018%20%28843%29%20NgramPOS%20a%20bigram-based%20linguistic%20and%20statistical%20feature%20process%20model%20for%20unstructured%20text%20classification.pdf
http://eprints.uthm.edu.my/5136/
author Yazdani, Sepideh Foroozan
Tan, Zhiyuan
Kakavand, Mohsen
Mustapha, Aida
building UTHM Library
collection Institutional Repository
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
continent Asia
country Malaysia
description Research in the financial domain has shown that sentiment aspects of stock news have a profound impact on trading volume, volatility, stock prices and firm earnings. In-depth analysis of stock news is now sourced from financial reviews by various social networking and marketing sites to help improve decision making. Nonetheless, such reviews are in the form of unstructured text, which requires natural language processing (NLP) in order to extract the sentiments. Accordingly, in this study we investigate the use of NLP tasks in an effort to improve the performance of sentiment classification in evaluating the information content of financial news as an instrument in investment decision support systems. At present, feature extraction is mainly based on the occurrence frequency of words. Therefore, low-frequency linguistic features that could be critical in sentiment classification are typically ignored. In this research, we attempt to improve current sentiment analysis approaches for financial news classification by focusing on low-frequency but informative linguistic expressions. Our proposed combination of low- and high-frequency linguistic expressions contributes a novel set of features for sentiment classification. The experimental results show that an optimal Ngram feature selection (a combination of optimal unigram and bigram features) enhances sentiment classification accuracy compared with other types of feature sets.
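The description above contrasts frequency-based feature extraction with the use of low-frequency unigram and bigram features. The following is a minimal sketch of that general idea only, not the authors' NgramPOS pipeline: the abstract does not state the selection criterion, so the frequency threshold, the whitespace tokenizer, the helper names and the sample headlines are all hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(docs):
    """Count unigrams and bigrams over lowercased, whitespace-tokenized documents."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        counts.update(ngrams(tokens, 1))  # unigram features
        counts.update(ngrams(tokens, 2))  # bigram features
    return counts


def split_by_frequency(counts, threshold):
    """Partition features into high-frequency (>= threshold) and low-frequency sets."""
    high = {feature for feature, count in counts.items() if count >= threshold}
    low = set(counts) - high
    return high, low


# Hypothetical sample headlines, for illustration only.
docs = ["shares fell sharply", "shares rose", "profit warning issued"]
counts = ngram_counts(docs)
high, low = split_by_frequency(counts, threshold=2)
```

A purely frequency-based extractor would keep only `high`. The point made in the description is that expressions in `low` (here, for instance, the bigram `("profit", "warning")`) may still carry sentiment, so a combined feature set would draw from both partitions.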
format Article
id my.uthm.eprints-5136
institution Universiti Tun Hussein Onn Malaysia
language en
publishDate 2018
publisher Springer
record_format eprints
spelling my.uthm.eprints-5136 2022-01-06T02:29:16Z http://eprints.uthm.edu.my/5136/
Springer 2018 Article PeerReviewed text en
Yazdani, Sepideh Foroozan and Tan, Zhiyuan and Kakavand, Mohsen and Mustapha, Aida (2018) NgramPOS a bigram-based linguistic and statistical feature process model for unstructured text classification. WIRELESS NETWORKS. pp. 1-11. ISSN 1022-0038
title NgramPOS: a bigram-based linguistic and statistical feature process model for unstructured text classification
topic QA76 Computer software
TA Engineering (General). Civil engineering (General)
TA329-348 Engineering mathematics. Engineering analysis
url_provider http://eprints.uthm.edu.my/