A novel approach to data mining using simplified swarm optimization
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2011 |
Subjects: | |
Online Access: | http://eprints.uthm.edu.my/3067/1/24p%20NOORHANIZA%20WAHID.pdf http://eprints.uthm.edu.my/3067/ |
Summary: | Data mining has become an increasingly important approach to dealing with the rapid
growth of data collected and stored in databases. In data mining, data classification
and feature selection are considered the two main factors that drive
decision making. However, the traditional data classification and feature
selection techniques used in data management are no longer adequate for such massive
data. This deficiency has prompted the need for a new intelligent data mining
technique, based on stochastic population-based optimization, that can discover
useful information from data.
In this thesis, a novel Simplified Swarm Optimization (SSO) algorithm is proposed as
a rule-based classifier and for feature selection. SSO is a simplified form of Particle Swarm
Optimization (PSO) that has a self-organising ability suited to highly distributed
control problems, and is flexible, robust and cost-effective for solving problems in complex
computing environments. The proposed SSO classifier has been implemented to
classify audio data. To the author’s knowledge, this is the first time that SSO and PSO
have been applied to audio classification.
Furthermore, two local search strategies, Exchange Local Search (ELS) and
Weighted Local Search (WLS), are proposed to improve SSO performance.
SSO-ELS has been applied to classify 13 benchmark datasets obtained from
the UCI repository, while SSO-WLS has been applied in an
Anomaly-based Network Intrusion Detection System (A-NIDS). For A-NIDS, a novel
hybrid SSO-based Rough Set (SSORS) method for feature selection is also proposed.
The empirical analysis showed promising results, with a high classification accuracy
rate achieved by all proposed techniques on the audio, UCI and KDDCup 99
datasets. The proposed SSO rule-based classifier with local search
strategies therefore offers a new paradigm for solving complex data
mining problems that other benchmark classifiers may not be able to solve. |
---|