A hybrid deep learning-based unsupervised anomaly detection in high dimensional data

Anomaly detection in high dimensional data is a critical research issue with serious implications for real-world problems. Many issues in this field remain unsolved, and several modern anomaly detection methods struggle to maintain adequate accuracy due to the highly descriptive nature of big data....

Bibliographic Details
Main Authors: Muneer, A., Taib, S.M., Fati, S.M., Balogun, A.O., Aziz, I.A.
Format: Article
Published: Tech Science Press 2022
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85117009228&doi=10.32604%2fcmc.2022.020732&partnerID=40&md5=0a89f42575a6dd3c06e57dbb05782837
http://eprints.utp.edu.my/28890/
id my.utp.eprints.28890
record_format eprints
spelling my.utp.eprints.28890 2022-03-17T02:22:32Z Muneer, A. and Taib, S.M. and Fati, S.M. and Balogun, A.O. and Aziz, I.A. (2022) A hybrid deep learning-based unsupervised anomaly detection in high dimensional data. Computers, Materials and Continua, 70 (3). pp. 6073-6088. Tech Science Press. Article. NonPeerReviewed. http://eprints.utp.edu.my/28890/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Anomaly detection in high dimensional data is a critical research issue with serious implications for real-world problems. Many issues in this field remain unsolved, and several modern anomaly detection methods struggle to maintain adequate accuracy due to the highly descriptive nature of big data. This phenomenon is referred to as the "curse of dimensionality", which affects traditional techniques in terms of both accuracy and performance. This research therefore proposes a hybrid model based on a Deep Autoencoder Neural Network (DANN) with five layers, trained to reduce the difference between the input and the output. The proposed model was applied to a real-world gas turbine (GT) dataset that contains 87620 columns and 56 rows. During the experiment, two issues were investigated and solved to enhance the results. The first is the dataset class imbalance, which was solved using the SMOTE technique. The second is poor performance, which can be addressed by choosing a suitable optimization algorithm. Several optimization algorithms were investigated and tested, including stochastic gradient descent (SGD), RMSprop, Adam and Adamax. Of these, the Adamax optimization algorithm showed the best results when employed to train the DANN model. The experimental results show that the proposed model can detect anomalies by efficiently reducing the high dimensionality of the dataset, with an accuracy of 99.40%, an F1-score of 0.9649, an Area Under the Curve (AUC) rate of 0.9649, and a minimal loss function during the hybrid model training. © 2022 Tech Science Press. All rights reserved.
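The description names Adamax as the optimizer that trained the DANN best. As a rough illustration only, the sketch below applies the standard Adamax update rule (a moving average of gradients scaled by an exponentially weighted infinity norm) to a toy quadratic. It is not the authors' training code: the function names `adamax_minimize` and `quadratic_grad`, the `TARGET` values, and the hyperparameters are all assumptions chosen for the example.

```python
"""Toy illustration of the Adamax update rule (not the paper's training code)."""


def adamax_minimize(grad_fn, theta0, lr=0.002, beta1=0.9, beta2=0.999,
                    eps=1e-8, steps=1000):
    """Minimize a function with Adamax, given a callable returning its gradient."""
    theta = list(theta0)
    m = [0.0] * len(theta)  # exponential moving average of the gradients
    u = [0.0] * len(theta)  # exponentially weighted infinity norm of the gradients
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        step = lr / (1.0 - beta1 ** t)  # bias correction for the first moment
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1.0 - beta1) * gi
            u[i] = max(beta2 * u[i], abs(gi))
            theta[i] -= step * m[i] / (u[i] + eps)
    return theta


# Hypothetical objective: squared distance to TARGET (stands in for a model loss).
TARGET = [3.0, -2.0]


def quadratic_grad(theta):
    """Gradient of sum((theta_i - TARGET_i)^2)."""
    return [2.0 * (x - c) for x, c in zip(theta, TARGET)]


if __name__ == "__main__":
    print(adamax_minimize(quadratic_grad, [0.0, 0.0], lr=0.05, steps=2000))
```

In an actual autoencoder, `grad_fn` would be the gradient of the reconstruction loss with respect to the network weights. Deep learning libraries typically provide Adamax as a built-in optimizer, so a hand-written loop like this is useful mainly for seeing what the update does.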
format Article
author Muneer, A.
Taib, S.M.
Fati, S.M.
Balogun, A.O.
Aziz, I.A.
title A hybrid deep learning-based unsupervised anomaly detection in high dimensional data
publisher Tech Science Press
publishDate 2022