Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification
Feature selection is of paramount concern in the document classification process, as it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the “Bag of Words” (BOW) of the documents with term weighting. Documents represented through this model have...
Main Authors: | Aurangzeb , khan; Baharum, Baharudin; Khairullah, khan |
---|---|
Format: | Conference or Workshop Item |
Published: | 2010 |
Subjects: | T Technology (General) |
Online Access: | http://eprints.utp.edu.my/6431/1/Efficient_Feature_Selection_and_Domain_Relevance_Term_Weighting_Method_for.pdf http://eprints.utp.edu.my/6431/ |
id |
my.utp.eprints.6431 |
---|---|
record_format |
eprints |
spelling |
my.utp.eprints.6431 2017-01-19T08:24:34Z Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification Aurangzeb , khan Baharum, Baharudin Khairullah, khan T Technology (General) Feature selection is of paramount concern in the document classification process, as it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the “Bag of Words” (BOW) of the documents with term weighting. Documents represented through this model have some limitations: the model ignores term dependencies, structure, and the ordering of terms in documents. To overcome this problem, a semantic-based feature vector is proposed, which is used to extract the concept of a term, along with co-occurring and associated terms, using an ontology. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with BOW feature selection for text classification. 2010 Conference or Workshop Item PeerReviewed application/pdf http://eprints.utp.edu.my/6431/1/Efficient_Feature_Selection_and_Domain_Relevance_Term_Weighting_Method_for.pdf Aurangzeb , khan and Baharum, Baharudin and Khairullah, khan (2010) Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification. In: 2010 Second International Conference on Computer Engineering and Applications. http://eprints.utp.edu.my/6431/ |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Institutional Repository |
url_provider |
http://eprints.utp.edu.my/ |
topic |
T Technology (General) |
description |
Feature selection is of paramount concern in the
document classification process, as it improves the efficiency
and accuracy of the text classifier. The Vector Space Model is used to
represent the “Bag of Words” (BOW) of the documents with
term weighting. Documents represented through
this model have some limitations: the model ignores term
dependencies, structure, and the ordering of terms in
documents. To overcome this problem, a semantic-based feature
vector is proposed, which is used to extract the concept of a term,
along with co-occurring and associated terms, using an ontology. The
proposed method is applied to a small document dataset, and the results
show that it outperforms term frequency/
inverse document frequency (TF-IDF) with BOW feature
selection for text classification. |
format |
Conference or Workshop Item |
author |
Aurangzeb , khan Baharum, Baharudin Khairullah, khan |
author_sort |
Aurangzeb , khan |
title |
Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification |
publishDate |
2010 |
url |
http://eprints.utp.edu.my/6431/1/Efficient_Feature_Selection_and_Domain_Relevance_Term_Weighting_Method_for.pdf http://eprints.utp.edu.my/6431/ |
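
The description above compares the proposed method against TF-IDF weighting over a Bag-of-Words representation. As a rough illustration of that baseline only, the following is a minimal sketch; it does not reproduce the paper's semantic, ontology-based feature vector. The function name `tfidf_vectors`, the sample documents, and the specific formula variant (term frequency normalised by document length, IDF as log(N/df)) are assumptions chosen for illustration and are not taken from the record.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenised documents.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    # Bag of Words: the vocabulary is the sorted set of all distinct terms.
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)

    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc)
        # An empty document maps to an all-zero vector.
        vectors.append(
            [(counts[term] / length) * idf[term] if length else 0.0 for term in vocab]
        )
    return vocab, vectors


# Hypothetical toy corpus, already tokenised.
docs = [
    "oil price rises".split(),
    "oil exports fall".split(),
    "team wins match".split(),
]
vocab, vectors = tfidf_vectors(docs)
```

Each vector has one weight per vocabulary term, so word order and term dependencies are discarded, which is the limitation the description attributes to the Vector Space Model. Terms that occur in fewer documents receive a higher weight than terms shared across documents.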