Improved normalization and standardization techniques for higher purity in K-means clustering

Clustering is one of the primary data mining tools; it helps researchers understand the natural grouping of attributes in datasets. It is an unsupervised classification method that aims to partition data so that objects in the same cluster are similar, and objects be...

Bibliographic Details
Main Authors: Dalatu, Paul Inuwa; Fitrianto, Anwar; Mustapha, Aida
Format: Article
Language: English
Published: Pushpa Publishing House 2016
Online Access: http://psasir.upm.edu.my/id/eprint/54519/1/Improved%20normalization%20and%20standardization%20techniques%20for%20higher%20purity%20in%20K-means%20clustering.pdf
http://psasir.upm.edu.my/id/eprint/54519/
http://www.pphmj.com/abstract/10134.htm
id my.upm.eprints.54519
record_format eprints
spelling my.upm.eprints.54519 2018-03-27T01:36:34Z http://psasir.upm.edu.my/id/eprint/54519/ Improved normalization and standardization techniques for higher purity in K-means clustering. Dalatu, Paul Inuwa; Fitrianto, Anwar; Mustapha, Aida. Pushpa Publishing House, 2016-09. Article, PeerReviewed, text, en. http://psasir.upm.edu.my/id/eprint/54519/1/Improved%20normalization%20and%20standardization%20techniques%20for%20higher%20purity%20in%20K-means%20clustering.pdf Citation: Dalatu, Paul Inuwa and Fitrianto, Anwar and Mustapha, Aida (2016) Improved normalization and standardization techniques for higher purity in K-means clustering. Far East Journal of Mathematical Sciences, 100 (6). pp. 859-871. ISSN 0972-0871. http://www.pphmj.com/abstract/10134.htm DOI: 10.17654/MS100060859
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Clustering is one of the primary data mining tools; it helps researchers understand the natural grouping of attributes in datasets. It is an unsupervised classification method that aims to partition data so that objects in the same cluster are similar, while objects belonging to different clusters differ significantly with respect to their attributes. The K-means algorithm is a well-known, fast technique among non-hierarchical clustering algorithms. Owing to its simplicity, K-means has been used in many fields. This paper proposes improved normalization and standardization techniques for higher purity in K-means clustering. In experiments on benchmark datasets from the UCI Machine Learning Repository, all of the proposed techniques performed much better than conventional K-means and the three classic transformations, as shown by the purity and Rand index accuracy results.
format Article
author Dalatu, Paul Inuwa
Fitrianto, Anwar
Mustapha, Aida
title Improved normalization and standardization techniques for higher purity in K-means clustering
publisher Pushpa Publishing House
publishDate 2016
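
The description in this record refers to normalization, standardization, purity and the Rand index without defining them. As a rough illustration only, the sketch below shows two commonly used transformations (min-max normalization and z-score standardization, picked here as assumptions, since the record does not say which three classic transformations the paper compares against or how its improved variants are defined) together with the standard textbook definitions of purity and the Rand index. All function names are hypothetical and nothing here is taken from the paper itself.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def min_max_normalize(column):
    """Rescale one attribute to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def z_score_standardize(column):
    """Center one attribute on its mean and divide by its population std."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def purity(true_labels, cluster_ids):
    """Fraction of objects that fall in the majority class of their cluster."""
    members = {}
    for label, cluster in zip(true_labels, cluster_ids):
        members.setdefault(cluster, []).append(label)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_ids):
    """Fraction of object pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agreements = sum(
        (true_labels[i] == true_labels[j]) == (cluster_ids[i] == cluster_ids[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    # Toy labels, invented for illustration; not data from the paper.
    true = ["a", "a", "a", "b", "b", "b"]
    found = [0, 0, 1, 1, 1, 1]
    print(min_max_normalize([2, 4, 6]))
    print(purity(true, found), rand_index(true, found))
```

On the toy labels above, one object of class "a" lands in the cluster dominated by "b", so purity is 5/6 and the Rand index is 10/15 = 2/3. In practice the transformations would be applied per attribute before running K-means, and the two scores would be computed from the resulting cluster assignments against the known class labels.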