Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification

Bibliographic Details
Main Authors: Aurangzeb Khan, Baharum Baharudin, Khairullah Khan
Format: Conference or Workshop Item
Published: 2010
Online Access: http://eprints.utp.edu.my/6431/1/Efficient_Feature_Selection_and_Domain_Relevance_Term_Weighting_Method_for.pdf
http://eprints.utp.edu.my/6431/
Description
Summary: Feature selection is of paramount importance in document classification because it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent documents as a “Bag of Words” (BOW) with term weighting. Representing documents through this model has some limitations: it ignores term dependencies and the structure and ordering of terms in documents. To overcome this problem, a semantic-based feature vector is proposed, which uses an ontology to extract the concept of a term along with its co-occurring and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with BOW feature selection for text classification.
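
For orientation, the TF-IDF/BOW baseline named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation and not the proposed ontology-based method: the function name `tfidf_vectors`, the three toy documents, and the weighting variant (raw term frequency times log(N/df)) are all assumptions made for the example, since the record does not state the exact formula used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents.

    Hypothetical helper for illustration; uses tf * log(N / df),
    one common TF-IDF variant among several.
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)  # bag of words: order and structure are discarded
        vectors.append([counts[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


# Toy corpus (made up for this sketch).
docs = [
    "the bank approved the loan".split(),
    "the river bank flooded".split(),
    "the loan rate increased".split(),
]
vocab, vectors = tfidf_vectors(docs)
```

In this toy corpus the term "bank" receives the same weight in the first two documents even though it refers to a financial institution in one and a riverside in the other, which is the kind of limitation of the BOW representation that the abstract describes.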