A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian
Stacked ensemble formulates an ensemble using a meta-learner to combine (stack) the predictions of multiple base classifiers. It suffers from the problem of suboptimal performance in imbalanced classification. Several underlying difficulty factors are reported to be responsible for performance de...
Main Author: | Seng, Zian |
---|---|
Format: | Thesis |
Published: | 2021 |
Subjects: | QA75 Electronic computers. Computer science; T Technology (General) |
Online Access: | http://studentsrepo.um.edu.my/14776/1/Seng_Zian.pdf http://studentsrepo.um.edu.my/14776/2/Seng_Zian.pdf http://studentsrepo.um.edu.my/14776/ |
id |
my.um.stud.14776 |
---|---|
record_format |
eprints |
spelling |
my.um.stud.14776 2024-02-17T17:45:18Z A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian Seng, Zian QA75 Electronic computers. Computer science T Technology (General) Stacked ensemble formulates an ensemble using a meta-learner to combine (stack) the predictions of multiple base classifiers. It suffers from the problem of suboptimal performance in imbalanced classification. Several underlying difficulty factors are reported to be responsible for performance degradation in imbalanced classification. This research aims to improve the classification performance of the stacked ensemble on imbalanced datasets by investigating the stacked ensemble’s meta-learner and the underlying difficulty factors (i.e., class imbalance, class overlapping, and class noise). Since the stacked ensemble’s imbalanced classification performance depends on the configuration of its meta-learner, an experiment (i.e., Experiment 1) was conducted to identify the best performing type of meta-learner. The results of Experiment 1 showed that the weighted combination-based meta-learner outperformed other types of meta-learners. Also, based on Experiment 1’s result, the ‘AUC-maximising meta-learner’ is one of the best performing weighted combination-based meta-learners. Inspired by the superior performance of the AUC-maximising meta-learner (in Experiment 1) and the importance of H-measure (in the literature), a new weighted combination-based meta-learner that maximises the H-measure (i.e., H-measure maximising meta-learner) was further proposed. Experiment 2 was conducted to evaluate the proposed H-measure maximising meta-learner. Then, it was benchmarked with the top 3 meta-learners in Experiment 1 and superior classification performance of the proposed meta-learner was observed.
Then, this research further investigated the stacked ensemble’s degradation problem from the perspective of underlying difficulty factors in imbalanced datasets. A stacked ensemble coined as Neighbourhood Undersampling Stacked Ensemble (NUS-SE) was proposed. The NUS-SE consists of two proposed components, i.e., the US-SE framework and the Neighbourhood Undersampling. Experiment 3 was performed to evaluate the performance of the proposed NUS-SE. Since NUS-SE is integrable with any meta-learner, the top 3 meta-learners in Experiment 1 and the proposed H-measure maximising meta-learner were used as the meta-learners of NUS-SE in Experiment 3. Based on Experiment 3’s results, the NUS-SE with H-measure maximising meta-learner (NUS-SE-H) outperformed all the original unmodified stacked ensembles with different meta-learners and the proposed NUS-SE with other top-performing meta-learners (i.e., NUS-SE-AUC, NUS-SE-CCLL, NUS-SE-NNLS). 2021-08 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/14776/1/Seng_Zian.pdf application/pdf http://studentsrepo.um.edu.my/14776/2/Seng_Zian.pdf Seng, Zian (2021) A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian. PhD thesis, Universiti Malaya. http://studentsrepo.um.edu.my/14776/ |
institution |
Universiti Malaya |
building |
UM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaya |
content_source |
UM Student Repository |
url_provider |
http://studentsrepo.um.edu.my/ |
topic |
QA75 Electronic computers. Computer science T Technology (General) |
description |
Stacked ensemble formulates an ensemble using a meta-learner to combine (stack)
the predictions of multiple base classifiers. It suffers from the problem of suboptimal
performance in imbalanced classification. Several underlying difficulty factors are reported
to be responsible for performance degradation in imbalanced classification. This research
aims to improve the classification performance of the stacked ensemble on imbalanced
datasets by investigating the stacked ensemble’s meta-learner and the underlying difficulty
factors (i.e., class imbalance, class overlapping, and class noise). Since the stacked
ensemble’s imbalanced classification performance depends on the configuration of its
meta-learner, an experiment (i.e., Experiment 1) was conducted to identify the best
performing type of meta-learner. The results of Experiment 1 showed that the weighted
combination-based meta-learner outperformed other types of meta-learners. Also, based
on Experiment 1’s result, the ‘AUC-maximising meta-learner’ is one of the best performing
weighted combination-based meta-learners. Inspired by the superior performance of the
AUC-maximising meta-learner (in Experiment 1) and the importance of H-measure (in the
literature), a new weighted combination-based meta-learner that maximises the H-measure
(i.e., H-measure maximising meta-learner) was further proposed. Experiment 2 was
conducted to evaluate the proposed H-measure maximising meta-learner. Then, it was
benchmarked with the top 3 meta-learners in Experiment 1 and superior classification
performance of the proposed meta-learner was observed. Then, this research further
investigated the stacked ensemble’s degradation problem from the perspective of
underlying difficulty factors in imbalanced datasets. A stacked ensemble coined as Neighbourhood
Undersampling Stacked Ensemble (NUS-SE) was proposed. The NUS-SE consists of two
proposed components, i.e., the US-SE framework and the Neighbourhood Undersampling.
Experiment 3 was performed to evaluate the performance of the proposed NUS-SE. Since
NUS-SE is integrable with any meta-learner, the top 3 meta-learners in Experiment 1
and the proposed H-measure maximising meta-learner were used as the meta-learners
of NUS-SE in Experiment 3. Based on Experiment 3’s results, the NUS-SE with H-measure
maximising meta-learner (NUS-SE-H) outperformed all the original unmodified
stacked ensembles with different meta-learners and the proposed NUS-SE with other
top-performing meta-learners (i.e., NUS-SE-AUC, NUS-SE-CCLL, NUS-SE-NNLS).
|
format |
Thesis |
author |
Seng, Zian |
title |
A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian
|
publishDate |
2021 |
url |
http://studentsrepo.um.edu.my/14776/1/Seng_Zian.pdf http://studentsrepo.um.edu.my/14776/2/Seng_Zian.pdf http://studentsrepo.um.edu.my/14776/ |
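
The description in this record refers to a weighted combination-based meta-learner that stacks the predictions of several base classifiers, and names an AUC-maximising variant. The record does not say how the thesis fits the weights, so the following is only a minimal sketch under stated assumptions: the weights are non-negative and sum to 1, they are found by a coarse grid search, and the toy scores in `base_preds` are invented for illustration. The function names (`auc`, `combine`, `fit_weights`) are hypothetical and do not come from the thesis. Replacing `auc` with another scoring function, for example an H-measure implementation, would target that metric instead.

```python
from itertools import product


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def combine(base_preds, weights):
    """Weighted sum of the base classifiers' scores for each instance."""
    return [sum(w * p for w, p in zip(weights, row)) for row in base_preds]


def fit_weights(base_preds, labels, steps=10):
    """Grid search over non-negative weights summing to 1 (an assumption)."""
    k = len(base_preds[0])
    best_w, best_score = None, -1.0
    for grid in product(range(steps + 1), repeat=k):
        if sum(grid) != steps:
            continue  # keep only points on the probability simplex
        w = [g / steps for g in grid]
        score = auc(combine(base_preds, w), labels)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Invented toy data: 2 minority (label 1) and 6 majority (label 0) instances.
# Each row holds the scores of three hypothetical base classifiers.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
base_preds = [
    [0.9, 0.3, 0.5],
    [0.3, 0.9, 0.5],
    [0.4, 0.2, 0.5],
    [0.2, 0.4, 0.5],
    [0.1, 0.1, 0.5],
    [0.1, 0.2, 0.5],
    [0.2, 0.1, 0.5],
    [0.1, 0.1, 0.5],
]

weights, best_auc = fit_weights(base_preds, labels)
print("weights:", weights, "AUC:", best_auc)
```

In practice the rows of `base_preds` would be out-of-fold predictions, so that the meta-learner is not fitted on scores the base classifiers produced for their own training data. A grid search is exponential in the number of base classifiers and is used here only to keep the sketch short.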