Machine Learning-Based Ensemble Classifiers for Anomaly Handling in Smart Home Energy Consumption Data

Addressing data anomalies (e.g., garbage data, outliers, redundant data, and missing data) plays a vital role in performing accurate analytics (billing, forecasting, load profiling, etc.) on smart homes' energy consumption data. From the literature, it has been identified that the data imputation...

Bibliographic Details
Main Authors: Kasaraneni, P.P., Venkata Pavan Kumar, Y., Moganti, G.L.K., Kannan, R.
Format: Article
Published: 2022
Online Access: http://scholars.utp.edu.my/id/eprint/34027/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85143847225&doi=10.3390%2fs22239323&partnerID=40&md5=d5ddaea488606fc2c5cf16c497ffac7d
id oai:scholars.utp.edu.my:34027
record_format eprints
spelling oai:scholars.utp.edu.my:34027 2022-12-28T07:53:44Z http://scholars.utp.edu.my/id/eprint/34027/ Machine Learning-Based Ensemble Classifiers for Anomaly Handling in Smart Home Energy Consumption Data Kasaraneni, P.P. Venkata Pavan Kumar, Y. Moganti, G.L.K. Kannan, R. 2022 Article NonPeerReviewed Kasaraneni, P.P. and Venkata Pavan Kumar, Y. and Moganti, G.L.K. and Kannan, R. (2022) Machine Learning-Based Ensemble Classifiers for Anomaly Handling in Smart Home Energy Consumption Data. Sensors, 22 (23). https://www.scopus.com/inward/record.uri?eid=2-s2.0-85143847225&doi=10.3390%2fs22239323&partnerID=40&md5=d5ddaea488606fc2c5cf16c497ffac7d 10.3390/s22239323
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Addressing data anomalies (e.g., garbage data, outliers, redundant data, and missing data) plays a vital role in performing accurate analytics (billing, forecasting, load profiling, etc.) on smart homes' energy consumption data. From the literature, it has been identified that data imputation with machine learning (ML)-based single-classifier approaches is used to address data quality issues. However, these approaches are not effective in addressing the hidden issues of smart home energy consumption data because a variety of anomalies is present. Hence, this paper proposes ML-based ensemble classifiers using random forest (RF), support vector machine (SVM), decision tree (DT), naive Bayes, K-nearest neighbor, and neural networks to handle all the possible anomalies in smart home energy consumption data. The proposed approach first identifies all anomalies and removes them, and then imputes the removed/missing information. The entire implementation consists of four parts. Part 1 presents anomaly detection and removal, part 2 presents data imputation, part 3 presents single-classifier approaches, and part 4 presents ensemble classifier approaches. To assess the classifiers' performance, various metrics, namely accuracy, precision, recall/sensitivity, specificity, and F1 score, are computed. From these metrics, it is identified that the ensemble classifier "RF+SVM+DT" has shown superior performance over the conventional single classifiers as well as the other ensemble classifiers for anomaly handling. © 2022 by the authors.
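The evaluation idea in the description above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the ensemble members are combined, so hard (majority) voting is an assumption here, and the label lists named rf_pred, svm_pred, and dt_pred are hypothetical stand-ins for the outputs of three already-trained classifiers (1 = anomalous reading, 0 = normal reading). The five metrics follow their standard confusion-matrix definitions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several classifiers' label lists by hard (majority) voting.

    Assumption: the source does not state the combination rule. With an odd
    number of voters and binary labels, no ties can occur.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall/sensitivity, specificity, and F1 score."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Report 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical toy labels, invented for illustration only (not paper data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
rf_pred = [1, 0, 1, 0, 0, 0, 1, 0]
svm_pred = [1, 0, 0, 1, 0, 1, 1, 0]
dt_pred = [0, 0, 1, 1, 0, 0, 1, 1]

ensemble_pred = majority_vote([rf_pred, svm_pred, dt_pred])

for name, pred in [("RF", rf_pred), ("SVM", svm_pred), ("DT", dt_pred),
                   ("RF+SVM+DT", ensemble_pred)]:
    print(name, binary_metrics(y_true, pred))
```

In this toy data each single classifier errs on different samples, so the vote corrects every individual mistake. Real consumption data will not behave this cleanly, and the example says nothing about the results reported in the article.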
format Article
author Kasaraneni, P.P.
Venkata Pavan Kumar, Y.
Moganti, G.L.K.
Kannan, R.