Feature Selection and Ensemble Meta Classifier for Multiclass Imbalance Data Learning

Bibliographic Details
Main Authors: Sainin, Mohd Shamrie, Alfred, Rayner, Alias, Suraya, Lammasha, Mohamed A.M.
Format: Conference or Workshop Item
Language: English
Published: 2018
Subjects:
Online Access: http://repo.uum.edu.my/25208/1/KMICE%202018%20134%20139.pdf
http://repo.uum.edu.my/25208/
http://www.kmice.cms.net.my/ProcKMICe/KMICe2018/toc.html
Description
Summary: This paper investigates how combining feature selection with ensemble classifiers affects prediction performance in multiclass imbalanced data learning. The research uses shape data from Malaysian medicinal leaf images and three other large benchmark data sets. Six ensemble methods from the Weka machine learning tool were selected to perform the classification task: AdaBoostM1, Bagging, Decorate, END, MultiBoostAB, and Rotation Forest. Five base classifiers (Naïve Bayes, SMO, J48, Random Forest, and Random Tree) were used to examine the performance of the ensemble methods. Two feature selection approaches were implemented: filter-based (CfsSubsetEval, ConsistencySubsetEval, and FilteredSubsetEval) and wrapper-based (WrapperSubsetEval). The experimental results show that feature selection does not improve accuracy much, but with fewer attributes the classifiers achieve similar or slightly better accuracy in less processing time. For knowledge management, the findings provide important insight into which algorithm is suitable for decision making when dealing with high-dimensional, large data.
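
The general idea in the summary (reduce the attribute set with a filter, then train an ensemble on the reduced data) can be illustrated with a minimal sketch. This is not the paper's Weka setup. It assumes synthetic imbalanced three-class data, a simple between-class/within-class variance score as a stand-in for a filter-based evaluator, and bagging over a nearest-centroid base classifier, which is not one of the base classifiers named above. All function names are hypothetical, and the score is computed on the training data, so it is optimistic.

```python
import random
from collections import Counter
from statistics import mean


def make_data(rng, sizes=(60, 25, 15), n_noise=4):
    """Synthetic imbalanced 3-class data: features 0-1 informative, rest noise."""
    X, y = [], []
    for label, n in enumerate(sizes):
        for _ in range(n):
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


def filter_score(column, y):
    """Between-class over within-class variation for one feature (F-like)."""
    overall = mean(column)
    groups = {}
    for value, label in zip(column, y):
        groups.setdefault(label, []).append(value)
    between, within = 0.0, 0.0
    for values in groups.values():
        m = mean(values)
        between += len(values) * (m - overall) ** 2
        within += sum((v - m) ** 2 for v in values)
    return between / within if within > 0 else 0.0


def select_features(X, y, k):
    """Filter-style selection: keep the k highest-scoring feature indices."""
    scores = [filter_score([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def fit_centroids(X, y):
    """Base classifier (assumed for illustration): one mean vector per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)] for label, rows in groups.items()}


def predict_centroid(centroids, row):
    """Predict the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )


def bagging_fit(X, y, rng, n_models=15):
    """Train one base classifier per bootstrap resample of the training set."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Majority vote over the ensemble members."""
    votes = Counter(predict_centroid(m, row) for m in models)
    return votes.most_common(1)[0][0]


def macro_recall(y_true, y_pred):
    """Average per-class recall, so minority classes count as much as the majority."""
    hits, totals = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return mean(hits[c] / totals[c] for c in totals)


rng = random.Random(0)
X, y = make_data(rng)
selected = select_features(X, y, k=2)
X_reduced = [[row[j] for j in selected] for row in X]
models = bagging_fit(X_reduced, y, rng)
predictions = [bagging_predict(models, row) for row in X_reduced]
score = macro_recall(y, predictions)
print("selected features:", selected)
print("macro recall (training data):", round(score, 3))
```

Macro-averaged recall is used here because plain accuracy can look good on imbalanced data even when minority classes are misclassified. A faithful replication would instead use the Weka evaluators and ensemble methods listed in the summary, with a held-out or cross-validated estimate.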