Imputation methods on daily PM10 data (2010-15)

Monitoring air pollution, PM10 in particular, is important, yet data from continuous ambient air quality stations (CAAQS) often contain missing values caused by machine failure, routine maintenance and human error. In view of this, a study of PM10 imputa...

Bibliographic Details
Main Authors: Abd Rani, Nurul Latiffah; Azid, Azman; Yunus, Kamaruzzaman
Format: Article
Language: English
Published: Innovative Scientific Information & Services Network 2019
Subjects: QD Chemistry
Online Access: http://irep.iium.edu.my/76209/1/Prof%20K-2.pdf
http://irep.iium.edu.my/76209/
https://www.isisn.org/BR16(SI-1)2019/306-310-16(SI)2019BR19-SI-05.pdf
id my.iium.irep.76209
record_format dspace
spelling my.iium.irep.76209 2020-01-09T04:58:49Z http://irep.iium.edu.my/76209/ Innovative Scientific Information & Services Network 2019-11 Article PeerReviewed application/pdf en http://irep.iium.edu.my/76209/1/Prof%20K-2.pdf Abd Rani, Nurul Latiffah and Azid, Azman and Yunus, Kamaruzzaman (2019) Imputation methods on daily PM10 data (2010-15). Bioscience Research, 16 (S1). pp. 306-310. ISSN 1811-9506 E-ISSN 2218-3973 https://www.isisn.org/BR16(SI-1)2019/306-310-16(SI)2019BR19-SI-05.pdf
institution Universiti Islam Antarabangsa Malaysia
building IIUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider International Islamic University Malaysia
content_source IIUM Repository (IREP)
url_provider http://irep.iium.edu.my/
language English
topic QD Chemistry
description Monitoring air pollution, PM10 in particular, is important, yet data from continuous ambient air quality stations (CAAQS) often contain missing values caused by machine failure, routine maintenance and human error. This study therefore compared three PM10 imputation methods: mean substitution, nearest neighbour and an expectation maximization based algorithm (EMB). The coefficient of determination (R2) and the root mean square error (RMSE) were used to assess the goodness of fit of each method. With 5%, 10%, 15%, 25% and 40% of the data missing, nearest neighbour imputation gave R2 values of 0.9318, 0.8126, 0.6546, 0.5458 and 0.3946 and RMSE values of 7.47, 12.27, 16.68, 19.13 and 21.76, respectively. At the same proportions, mean imputation gave R2 values of 0.9274, 0.8117, 0.6484, 0.5400 and 0.3910 and RMSE values of 7.47, 12.36, 16.90, 19.13 and 22.07, respectively. EMB imputation gave R2 values of 0.9084, 0.8468, 0.7530, 0.5791 and 0.5004 and RMSE values of 8.58, 11.18, 14.20, 18.53 and 20.48, respectively. For every imputation method, R2 decreased and RMSE increased as the percentage of simulated missing data increased.
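The evaluation described above (remove a known share of values, impute them, then score the result with R2 and RMSE) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the PM10 series is synthetic, values are removed completely at random, "nearest neighbour" is taken to mean the nearest observed day in time, R2 is computed as the squared Pearson correlation over the full series, and EMB is not implemented. All function names are hypothetical.

```python
import numpy as np


def rmse(observed, imputed):
    """Root mean square error between the complete and the imputed series."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.sqrt(np.mean((observed - imputed) ** 2)))


def r_squared(observed, imputed):
    """Squared Pearson correlation (an assumed definition of R2)."""
    r = np.corrcoef(observed, imputed)[0, 1]
    return float(r ** 2)


def mask_at_random(n, proportion, rng):
    """Boolean mask marking round(n * proportion) positions as missing."""
    n_missing = int(round(n * proportion))
    idx = rng.choice(n, size=n_missing, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask


def impute_mean(series, mask):
    """Replace masked values with the mean of the remaining values."""
    filled = series.copy()
    filled[mask] = series[~mask].mean()
    return filled


def impute_nearest(series, mask):
    """Replace each masked value with the nearest observed value in time
    (ties go to the earlier day). An assumed reading of 'nearest neighbour'."""
    filled = series.copy()
    known = np.flatnonzero(~mask)
    for i in np.flatnonzero(mask):
        nearest = known[np.argmin(np.abs(known - i))]
        filled[i] = series[nearest]
    return filled


# Synthetic stand-in for daily PM10 (NOT the study's data): a seasonal
# cycle plus noise over 2191 days, the number of days in 2010-2015.
rng = np.random.default_rng(0)
days = np.arange(2191)
pm10 = 50 + 15 * np.sin(2 * np.pi * days / 365.25) + rng.normal(0, 8, days.size)

results = {}
for proportion in (0.05, 0.10, 0.15, 0.25, 0.40):
    mask = mask_at_random(days.size, proportion, rng)
    for name, method in (("mean", impute_mean), ("nearest", impute_nearest)):
        filled = method(pm10, mask)
        # Metrics are computed over the whole series (an assumption).
        results[(name, proportion)] = (r_squared(pm10, filled), rmse(pm10, filled))

for (name, proportion), (r2, err) in results.items():
    print(f"{name:8s} {proportion:4.0%}  R2={r2:.4f}  RMSE={err:.2f}")
```

Scoring over the whole series rather than only the imputed positions is a deliberate choice here: mean substitution fills every gap with the same constant, so a correlation restricted to the imputed values would be undefined.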
format Article
author Abd Rani, Nurul Latiffah
Azid, Azman
Yunus, Kamaruzzaman
title Imputation methods on daily PM10 data (2010-15)
publisher Innovative Scientific Information & Services Network
publishDate 2019
url http://irep.iium.edu.my/76209/1/Prof%20K-2.pdf
http://irep.iium.edu.my/76209/
https://www.isisn.org/BR16(SI-1)2019/306-310-16(SI)2019BR19-SI-05.pdf