Classification prediction of PM10 concentration using a tree-based machine learning approach

Bibliographic Details
Main Authors: Wan Nur Shaziayani, Ul-Saufie, Ahmad Zia, Mutalib, Sofianita, Mohamad Noor, Norazian, Zainordin, Nazatul Syadia
Format: Article
Published: MDPI 2022
Online Access: http://psasir.upm.edu.my/id/eprint/100704/
https://www.mdpi.com/2073-4433/13/4/538
id my.upm.eprints.100704
record_format eprints
Wan Nur Shaziayani, Ul-Saufie, Ahmad Zia, Mutalib, Sofianita, Mohamad Noor, Norazian and Zainordin, Nazatul Syadia (2022) Classification prediction of PM10 concentration using a tree-based machine learning approach. Atmosphere, 13 (4). art. no. 538. pp. 1-11. ISSN 2073-4433. Publisher: MDPI. Published: 2022-03-29. Type: Article, peer reviewed. URL: https://www.mdpi.com/2073-4433/13/4/538 DOI: 10.3390/atmos13040538
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
description PM10 prediction has received considerable attention because of the harmful effects of PM10 on human health. Machine learning approaches have the potential to predict and classify future PM10 concentrations accurately. In this study, three machine learning algorithms, namely decision tree (DT), boosted regression tree (BRT), and random forest (RF), were applied to predict PM10 in Kota Bharu, Kelantan. The three methods were compared to find the best one for predicting the next day’s PM10 concentration, using daily maximum data from January 2002 to December 2017; 80% of the data were used for training and 20% for validation. Performance was measured by accuracy, sensitivity, specificity, and precision for RF, BRT, and DT. The results indicate that all three models were developed effectively and are applicable to the prediction of other atmospheric environmental data. The best model for classifying the next day’s PM10 concentration was the random forest classifier, with an accuracy of 98.37, sensitivity of 97.19, specificity of 99.55, and precision of 99.54. The boosted regression tree gave results close to those of the RF model, with an accuracy of 98.12, sensitivity of 97.51, specificity of 98.72, and precision of 98.71. The best model can assist local governments in providing early warnings to people at risk of acute and chronic health consequences from air pollution.
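
The four performance measures named in the description can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' code: the `ConfusionCounts` container, the function name, and the example counts are hypothetical and chosen only for illustration. The sketch returns fractions between 0 and 1, whereas the figures quoted in the description appear to be on a 0 to 100 scale.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container, not from the source)."""
    tp: int  # true positives: positive class predicted, positive class observed
    fp: int  # false positives: positive class predicted, negative class observed
    tn: int  # true negatives: negative class predicted, negative class observed
    fn: int  # false negatives: negative class predicted, positive class observed


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return the four standard measures as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. both classes occur
    and at least one positive prediction was made.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,       # share of all predictions that are correct
        "sensitivity": c.tp / (c.tp + c.fn),     # share of observed positives that are detected
        "specificity": c.tn / (c.tn + c.fp),     # share of observed negatives that are detected
        "precision": c.tp / (c.tp + c.fp),       # share of positive predictions that are correct
    }


# Made-up counts for illustration only; these are not results from the study.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
metrics = classification_metrics(example)
```

Reporting all four measures together is useful because accuracy alone can look high when one class dominates, while sensitivity and specificity show how each class is handled separately.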