Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification

Feature selection is of paramount concern in the document classification process because it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the “bag of words” (BOW) of the documents with term weighting. Documents represented through this model have...

Bibliographic Details
Main Authors: Aurangzeb, Khan; Baharum, Baharudin; Khairullah, Khan
Format: Conference or Workshop Item
Published: 2010
Subjects: T Technology (General)
Online Access: http://eprints.utp.edu.my/6431/1/Efficient_Feature_Selection_and_Domain_Relevance_Term_Weighting_Method_for.pdf
http://eprints.utp.edu.my/6431/
id my.utp.eprints.6431
record_format eprints
spelling my.utp.eprints.6431 2017-01-19T08:24:34Z PeerReviewed application/pdf Aurangzeb, Khan and Baharum, Baharudin and Khairullah, Khan (2010) Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification. In: 2010 Second International Conference on Computer Engineering and Applications. http://eprints.utp.edu.my/6431/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
topic T Technology (General)
description Feature selection is of paramount concern in the document classification process because it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the “bag of words” (BOW) of the documents with term weighting. Documents represented through this model have some limitations: the model ignores term dependencies and the structure and ordering of the terms in documents. To overcome this problem, a semantic-based feature vector is proposed, which uses an ontology to extract the concept of a term together with its co-occurring and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with the BOW feature selection method for text classification.
format Conference or Workshop Item
author Aurangzeb, Khan
Baharum, Baharudin
Khairullah, Khan
title Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification
publishDate 2010
url http://eprints.utp.edu.my/6431/1/Efficient_Feature_Selection_and_Domain_Relevance_Term_Weighting_Method_for.pdf
http://eprints.utp.edu.my/6431/