Semi-supervised learning for feature selection and classification of data / Ganesh Krishnasamy
Feature selection and classification are widely utilized for data analysis. Recently, considerable advancement has been achieved in semi-supervised multi-task feature selection algorithms, which exploit the information shared among multiple related tasks. However, these semi-supervised...

Saved in:

Main Author: | Ganesh, Krishnasamy
---|---|
Format: | Thesis (PhD, University of Malaya)
Published: | 2019
Subjects: | TK Electrical engineering. Electronics Nuclear engineering
Online Access: | http://studentsrepo.um.edu.my/10006/1/Ganesh_Krishmasamy.pdf http://studentsrepo.um.edu.my/10006/2/Ganesh_Krishnasamy_%2D_Thesis.pdf http://studentsrepo.um.edu.my/10006/
id | my.um.stud.10006
---|---|
record_format | eprints
institution | Universiti Malaya
building | UM Library
collection | Institutional Repository
continent | Asia
country | Malaysia
content_provider | Universiti Malaya
content_source | UM Student Repository
url_provider | http://studentsrepo.um.edu.my/
topic | TK Electrical engineering. Electronics Nuclear engineering
description |
Feature selection and classification are widely utilized for data analysis. Recently, considerable advancement has been achieved in semi-supervised multi-task feature selection algorithms, which exploit the information shared among multiple related tasks. However, these semi-supervised multi-task feature selection algorithms cannot naturally handle multi-view data, since they are designed for single-view data. Existing studies have demonstrated that mining the information enclosed in multiple views can drastically enhance the performance of feature selection. As for classification, researchers have applied semi-supervised learning to the extreme learning machine (ELM), exploiting both labeled and unlabeled data to boost learning performance. They have incorporated Laplacian regularization to capture the geometry of the underlying manifold. However, Laplacian regularization lacks extrapolating power and biases the solution towards a constant function. These drawbacks degrade the performance of Laplacian-regularized semi-supervised ELMs when only a few labeled samples are available.
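For reference, the two manifold penalties being contrasted here can be written as follows. This is a sketch in standard notation, not copied from the thesis; F stacks the predictions on all labeled and unlabeled points, and W, L, and B are assumed symbols.

```latex
% Graph-Laplacian manifold regularizer over predictions f_i on n points,
% with similarity weights w_{ij} and Laplacian L = D - W:
\frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\,\lVert f_i - f_j\rVert^2 \;=\; \mathrm{Tr}\!\left(F^\top L F\right)
% This penalty is minimized by (near-)constant functions, which is the bias criticized above.
% Hessian regularization replaces L with a Hessian-energy matrix B estimated from local
% tangent-space fits, so functions that vary linearly along geodesics incur no cost:
\mathrm{Tr}\!\left(F^\top B F\right)
```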
In the first part of the study, a novel mathematical framework is introduced for multi-view Laplacian semi-supervised feature selection that mines the correlations among multiple tasks. The proposed algorithm exploits complementary information from the different feature views within each task while exploring the knowledge shared between multiple related tasks in a joint framework, even when labeled training data are sparse. Because the resulting objective function is non-smooth and difficult to solve directly, an efficient iterative algorithm is developed to optimize it.
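The abstract does not spell out the objective, but frameworks of this kind typically combine a least-squares loss with an l2,1-norm row-sparsity penalty, and that non-smooth term is what calls for an iterative reweighted solver. The sketch below shows only this core ingredient; the multi-view Laplacian terms and the cross-task coupling of the actual framework are omitted, and all names and parameters are illustrative assumptions.

```python
import numpy as np

def l21_feature_weights(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Iteratively reweighted solver for  min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    X : (n_samples, n_features) feature matrix
    Y : (n_samples, n_classes)  target/indicator matrix
    Returns W whose row norms rank feature importance (rows driven to zero
    correspond to discarded features).
    """
    d = X.shape[1]
    XtX, XtY = X.T @ X, X.T @ Y
    diag = np.ones(d)                       # reweighting terms 1 / (2 * ||w_i||)
    for _ in range(n_iter):
        W = np.linalg.solve(XtX + lam * np.diag(diag), XtY)
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        diag = 1.0 / (2.0 * row_norms)      # re-estimate the weights from the current rows
    return W
```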
The proposed algorithm is compared with state-of-the-art feature selection algorithms on three datasets: a consumer video dataset, a 3D motion recognition dataset, and a handwritten digit recognition dataset. In these experiments, all training and testing data are represented as feature vectors. The proposed algorithm learns sparse coefficients by exploiting the relationships among the different multi-view features and leveraging the knowledge from multiple related tasks. The sparse coefficients are then applied to the feature vectors of both the training and testing data to select the most representative features, and the selected features are fed into a linear support vector machine (SVM) for classification. The experimental results show that the proposed feature selection framework outperforms the other state-of-the-art feature selection algorithms.
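A minimal sketch of the evaluation pipeline described above, assuming the learned coefficient matrix W ranks features by row norm; the number of retained features and the variable names are placeholders, not values from the thesis.

```python
import numpy as np
from sklearn.svm import LinearSVC

def select_and_classify(W, X_train, y_train, X_test, n_keep=200):
    """Rank features by the row norms of the learned coefficient matrix W,
    keep the top `n_keep`, and classify with a linear SVM, mirroring the
    protocol described above. `n_keep` is a placeholder value."""
    scores = np.sqrt((W ** 2).sum(axis=1))      # one importance score per original feature
    keep = np.argsort(scores)[::-1][:n_keep]    # indices of the most representative features
    clf = LinearSVC().fit(X_train[:, keep], y_train)
    return clf.predict(X_test[:, keep]), keep
```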
In the second part of the study, a novel classification algorithm called Hessian semi-supervised ELM (HSS-ELM) is proposed to enhance the semi-supervised learning of ELM. Unlike Laplacian regularization, Hessian regularization favours functions whose values vary linearly along geodesic distances and preserves the local manifold structure well, which gives it good extrapolating power. Furthermore, HSS-ELM maintains almost all the advantages of the traditional ELM, such as its high training efficiency and straightforward implementation for multiclass classification problems. The proposed algorithm is tested on the publicly available G50C, COIL20(B), COIL20, USPST(B), and USPST datasets. The experimental results demonstrate that it is competitive with state-of-the-art semi-supervised learning algorithms in terms of accuracy, while requiring remarkably less training time than semi-supervised SVMs and regularized least-squares algorithms.
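A minimal sketch of a manifold-regularized semi-supervised ELM of the kind discussed above, assuming the common closed-form solution beta = (I + H'CH + lam*H'MH)^-1 H'CY. HSS-ELM would use a Hessian-energy matrix for M; a k-NN graph Laplacian is substituted here purely to keep the example short, and every parameter value is an assumption rather than a thesis setting.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def manifold_ss_elm(X, y, labeled_mask, n_hidden=500, C=1.0, lam=0.01, k=7, seed=0):
    """Train a manifold-regularized semi-supervised ELM and return a predict function.
    y: integer labels; only rows where labeled_mask is True contribute to the loss.
    Note: HSS-ELM would use a Hessian-energy matrix below; a k-NN graph Laplacian
    is used here only as a simpler stand-in."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    classes = np.unique(y[labeled_mask])

    # Random hidden layer with sigmoid activation, as in a standard ELM.
    Win = rng.uniform(-1, 1, size=(d, n_hidden))
    b = rng.uniform(-1, 1, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ Win + b)))

    # One-hot targets for labeled rows; zeros elsewhere.
    Y = np.zeros((n, classes.size))
    for j, c in enumerate(classes):
        Y[(y == c) & labeled_mask, j] = 1.0

    # Label-fitting penalty applied only to labeled samples.
    Cmat = np.diag(np.where(labeled_mask, C, 0.0))

    # Manifold regularizer built from a symmetrized k-NN graph (stand-in for Hessian energy).
    W = kneighbors_graph(X, k, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T).toarray()
    L = np.diag(W.sum(axis=1)) - W

    # Closed-form output weights.
    A = np.eye(n_hidden) + H.T @ Cmat @ H + lam * H.T @ L @ H
    beta = np.linalg.solve(A, H.T @ Cmat @ Y)

    def predict(X_new):
        H_new = 1.0 / (1.0 + np.exp(-(X_new @ Win + b)))
        return classes[np.argmax(H_new @ beta, axis=1)]

    return predict
```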
format | Thesis
author | Ganesh, Krishnasamy
title | Semi-supervised learning for feature selection and classification of data / Ganesh Krishnasamy
publishDate | 2019
url | http://studentsrepo.um.edu.my/10006/1/Ganesh_Krishmasamy.pdf http://studentsrepo.um.edu.my/10006/2/Ganesh_Krishnasamy_%2D_Thesis.pdf http://studentsrepo.um.edu.my/10006/