Neural networks for web news classification based on augmented PCA

In this paper, we propose a news web page classification method (WPCM). The WPCM uses a neural network with inputs obtained by both the principal components and class profile-based features (CPBF). Each news web page is represented by the term-weighting scheme. As the number of unique words in the co...

Bibliographic Details
Main Authors: Selamat, Ali, Omatu, Sigeru
Format: Conference or Workshop Item
Language: en
Published: 2003
Subjects: QA76 Computer software
Online Access: http://eprints.utm.my/3123/1/IJCNN2003.pdf
http://eprints.utm.my/3123/
http://ieeexplore.ieee.org/iel5/8672/27485/01223679.pdf
_version_ 1845470834089525248
author Selamat, Ali
Omatu, Sigeru
building UTM Library
collection Institutional Repository
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
continent Asia
country Malaysia
description In this paper, we propose a news web page classification method (WPCM). The WPCM uses a neural network whose inputs combine principal components with class profile-based features (CPBF). Each news web page is represented using a term-weighting scheme. Because the number of unique words in the collection is large, principal component analysis (PCA) is used to select the most relevant features for classification. The output of the PCA is then augmented with feature vectors from the class profile, which contains the most regular words in each class, before being fed to the neural network. We manually selected the most regular words in each class and weighted them using an entropy weighting scheme. A fixed number of regular words from each class is used as feature vectors, together with the reduced principal components from the PCA. These feature vectors are then used as the input to the neural network for classification. The experimental evaluation demonstrates that the WPCM provides acceptable classification accuracy on the sports news datasets.
format Conference or Workshop Item
id my.utm.eprints-3123
institution Universiti Teknologi Malaysia
language en
publishDate 2003
record_format eprints
spelling my.utm.eprints-3123 2011-05-10T05:28:03Z http://eprints.utm.my/3123/ Neural networks for web news classification based on augmented PCA. Selamat, Ali; Omatu, Sigeru. QA76 Computer software. 2003. Conference or Workshop Item. PeerReviewed. application/pdf. en. http://eprints.utm.my/3123/1/IJCNN2003.pdf Selamat, Ali and Omatu, Sigeru (2003) Neural networks for web news classification based on augmented PCA. In: International Joint Conference on Neural Networks (IJCNN 2003), July 20-24, 2003, Portland, Oregon, USA. http://ieeexplore.ieee.org/iel5/8672/27485/01223679.pdf
title Neural networks for web news classification based on augmented PCA
topic QA76 Computer software
url http://eprints.utm.my/3123/1/IJCNN2003.pdf
http://eprints.utm.my/3123/
http://ieeexplore.ieee.org/iel5/8672/27485/01223679.pdf
url_provider http://eprints.utm.my/
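
The pipeline outlined in the description field (term weighting, PCA reduction, augmentation with entropy-weighted class-profile words, neural network classification) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy term-frequency matrix, the chosen profile columns, the standard log-entropy weighting formula, and the one-hidden-layer network with its hyperparameters are all assumptions made for the example, since the record does not specify them.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def log_entropy_weights(X):
    """Assumed log-entropy scheme: log(1 + tf) times a per-term global weight.
    Terms spread evenly over all documents get a global weight near 0."""
    X = np.asarray(X, dtype=float)
    gf = X.sum(axis=0)                                   # global term frequency
    p = np.divide(X, gf, out=np.zeros_like(X), where=gf > 0)
    plogp = p * np.log(np.where(p > 0, p, 1.0))          # 0 * log(1) = 0 where p == 0
    g = 1.0 + plogp.sum(axis=0) / np.log(X.shape[0])
    return np.log1p(X) * g

def augmented_features(X, profile_idx, k):
    """Concatenate k principal components with entropy-weighted profile terms."""
    return np.hstack([pca_reduce(X, k), log_entropy_weights(X)[:, profile_idx]])

def train_mlp(F, y, n_classes, hidden=8, lr=0.5, epochs=500, seed=0):
    """Tiny one-hidden-layer softmax network trained by batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.1, (F.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, n_classes)); b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                             # one-hot targets
    for _ in range(epochs):
        H = np.tanh(F @ W1 + b1)
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        dZ = (P - Y) / len(F)                            # softmax cross-entropy gradient
        dH = (dZ @ W2.T) * (1 - H ** 2)
        W2 -= lr * H.T @ dZ; b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * F.T @ dH; b1 -= lr * dH.sum(axis=0)
    return lambda G: np.argmax(np.tanh(G @ W1 + b1) @ W2 + b2, axis=1)

# Hypothetical toy data: 4 documents x 5 terms, two classes.
X = np.array([[3, 0, 1, 0, 1],
              [2, 0, 0, 1, 1],
              [0, 3, 0, 1, 1],
              [0, 2, 1, 0, 1]], dtype=float)
y = np.array([0, 0, 1, 1])
profile_idx = [0, 1]      # stand-in for the manually selected "regular" words

F = augmented_features(X, profile_idx, k=2)
predict = train_mlp(F, y, n_classes=2)
print(F.shape, predict(F))
```

In this sketch the last term occurs equally in every document, so its entropy weight collapses to zero, which is the behaviour one would want from a weighting that favours class-specific words.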