Clustering for binary data sets by using genetic algorithm-incremental K-means

This research was initially driven by the lack of clustering algorithms that specifically focus on binary data. To overcome this gap in knowledge, a promising technique for analysing this type of data became the main subject in this research, namely Genetic Algorithms (GA). For the purpose of this r...

Bibliographic Details
Main Authors: Saharan, S., Baragona, R., Nor, M. E., Salleh, R. M., Asrah, N. M.
Format: Article
Language: en
Published: IOP Publishing 2018
Subjects: QA273-280 Probabilities. Mathematical statistics
Online Access: http://eprints.uthm.edu.my/5691/1/AJ%202018%20%28308%29.pdf
http://eprints.uthm.edu.my/5691/
_version_ 1833417915414609920
author Saharan, S.
Baragona, R.
Nor, M. E.
Salleh, R. M.
Asrah, N. M.
author_facet Saharan, S.
Baragona, R.
Nor, M. E.
Salleh, R. M.
Asrah, N. M.
author_sort Saharan, S.
building UTHM Library
collection Institutional Repository
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
continent Asia
country Malaysia
description This research was initially driven by the lack of clustering algorithms that specifically focus on binary data. To overcome this gap in knowledge, a promising technique for analysing this type of data became the main subject in this research, namely Genetic Algorithms (GA). For the purpose of this research, GA was combined with the Incremental K-means (IKM) algorithm to cluster binary data streams. In GAIKM, the objective function is based on a few sufficient statistics that can be calculated easily and quickly on binary numbers. Using IKM gives an advantage in terms of fast convergence. The results show that GAIKM is an efficient and effective new clustering algorithm compared with other clustering algorithms and with IKM itself. In conclusion, GAIKM outperformed other clustering algorithms such as GCUK, IKM, Scalable K-means (SKM) and K-means clustering, and paves the way for future research involving missing data and outliers.
format Article
id my.uthm.eprints-5691
institution Universiti Tun Hussein Onn Malaysia
language en
publishDate 2018
publisher IOP Publishing
record_format eprints
spelling my.uthm.eprints-5691 2022-01-20T06:36:56Z http://eprints.uthm.edu.my/5691/ Clustering for binary data sets by using genetic algorithm-incremental K-means Saharan, S. Baragona, R. Nor, M. E. Salleh, R. M. Asrah, N. M. QA273-280 Probabilities. Mathematical statistics This research was initially driven by the lack of clustering algorithms that specifically focus on binary data. To overcome this gap in knowledge, a promising technique for analysing this type of data became the main subject in this research, namely Genetic Algorithms (GA). For the purpose of this research, GA was combined with the Incremental K-means (IKM) algorithm to cluster binary data streams. In GAIKM, the objective function is based on a few sufficient statistics that can be calculated easily and quickly on binary numbers. Using IKM gives an advantage in terms of fast convergence. The results show that GAIKM is an efficient and effective new clustering algorithm compared with other clustering algorithms and with IKM itself. In conclusion, GAIKM outperformed other clustering algorithms such as GCUK, IKM, Scalable K-means (SKM) and K-means clustering, and paves the way for future research involving missing data and outliers. IOP Publishing 2018 Article PeerReviewed text en http://eprints.uthm.edu.my/5691/1/AJ%202018%20%28308%29.pdf Saharan, S. and Baragona, R. and Nor, M. E. and Salleh, R. M. and Asrah, N. M. (2018) Clustering for binary data sets by using genetic algorithm-incremental K-means. Journal of Physics: Conference Series, 995. pp. 1-5. ISSN 1742-6588
spellingShingle QA273-280 Probabilities. Mathematical statistics
Saharan, S.
Baragona, R.
Nor, M. E.
Salleh, R. M.
Asrah, N. M.
Clustering for binary data sets by using genetic algorithm-incremental K-means
title Clustering for binary data sets by using genetic algorithm-incremental K-means
title_full Clustering for binary data sets by using genetic algorithm-incremental K-means
title_fullStr Clustering for binary data sets by using genetic algorithm-incremental K-means
title_full_unstemmed Clustering for binary data sets by using genetic algorithm-incremental K-means
title_short Clustering for binary data sets by using genetic algorithm-incremental K-means
title_sort clustering for binary data sets by using genetic algorithm-incremental k-means
topic QA273-280 Probabilities. Mathematical statistics
url http://eprints.uthm.edu.my/5691/1/AJ%202018%20%28308%29.pdf
http://eprints.uthm.edu.my/5691/
url_provider http://eprints.uthm.edu.my/