Development of missing data prediction model for carbon monoxide
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Penerbit UTM Press, 2019 |
Online Access: | http://irep.iium.edu.my/70673/1/70673_Development%20of%20missing%20data%20prediction.pdf http://irep.iium.edu.my/70673/2/70673_Development%20of%20missing%20data%20prediction_WOS.pdf http://irep.iium.edu.my/70673/ https://mjfas.utm.my/index.php/mjfas/article/view/969/pdf |
Summary: | Carbon monoxide (CO) is one of the most important air pollutants because it is used in calculating the Air Pollutant Index (API).
It is therefore essential that the CO record used in the analysis has no missing data. Missing data can
arise for a number of reasons, such as the inability of an instrument to record certain parameters.
A CO prediction model is therefore needed to fill these gaps.
A dataset of meteorological and air pollutant values was obtained
from the Air Quality Division, Department of Environment Malaysia (DOE). A total of 113,112 data records
were used to develop the model, using sensitivity analysis (SA) with an artificial neural network (ANN).
SA showed that particulate matter (PM₁₀) and ozone (O₃) were the most significant input variables for
the CO missing data prediction model. The optimum ANN model used three hidden nodes
and gave an R² of 0.5311. The two models, artificial neural network-carbon
monoxide-all parameters (ANN-CO-AP) and artificial neural network-carbon monoxide-leave out
(ANN-CO-LO), gave high R² values (0.7639 and 0.5311, respectively) and low root mean square error (RMSE) values (0.2482 and
0.3506, respectively). These values indicate that a model could use only the most significant
input variables to represent CO, rather than all input variables. |
---|
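
The summary compares the two models by their coefficient of determination (R²) and RMSE. As a minimal sketch of how these two scores are commonly computed, the Python below assumes that R² is defined as 1 − SS_res/SS_tot; the article may instead use the squared correlation coefficient. The function names and the CO values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up CO concentrations for illustration only (not data from the study).
co_observed = [0.5, 0.8, 1.2, 0.9, 0.6]
co_predicted = [0.6, 0.7, 1.0, 1.0, 0.7]

print(f"RMSE = {rmse(co_observed, co_predicted):.4f}")
print(f"R2   = {r_squared(co_observed, co_predicted):.4f}")
```

Under these definitions, a higher R² and a lower RMSE both indicate predictions closer to the observed values, which is the sense in which the summary reads its reported scores.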