Comparison of missing rainfall data treatment analysis at Kenyir Lake
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2021 |
Subjects: | |
Online Access: | http://umpir.ump.edu.my/id/eprint/31460/1/iscee2020.pdf http://umpir.ump.edu.my/id/eprint/31460/ https://iopscience.iop.org/article |
Summary: | Rainfall is among the most frequently used data in weather-related studies. Rainfall records sometimes contain missing values that must be treated before the data can be considered useful, complete and reliable. Previous studies have suggested many methods for treating missing data, and the best method for estimating missing rainfall data may vary between regions depending on the rainfall pattern and spatial distribution. This paper therefore discusses and compares three methods of missing data treatment: Expectation Maximization (EM), Inverse Distance Weighted (IDW) and Multiple Imputation (MI). The analysis identifies IDW as the best method based on root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (r) and percentage of error (% of error). IDW has the lowest RMSE, MAE and % of error values and the highest r value of the three methods. MI recorded the highest RMSE, MAE and % of error values and the lowest r value, indicating that it is the least accurate of the three methods for missing data treatment. With all methods implemented, IDW proved the best way to treat missing data, because its analysis shows the monthly rainfall distribution for the 4 treatment stations in line with that of the 3 missing data stations, more so than with the EM and MI methods. |
---|
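
As an illustration of the IDW estimate and three of the evaluation metrics named in the summary, here is a minimal Python sketch. It is not the authors' implementation: the function names, the distance exponent `power=2.0`, and the `(distance, rainfall)` input layout are assumptions made for this example. The percentage of error is left out because the summary does not define it.

```python
import math


def idw_estimate(neighbors, power=2.0):
    """Estimate a missing rainfall value from neighbouring stations.

    neighbors: list of (distance, rainfall) pairs (hypothetical layout).
    power: distance exponent; 2.0 is a common choice, assumed here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for distance, value in neighbors:
        if distance == 0:
            # A co-located station determines the estimate directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error between observed and estimated values."""
    n = len(observed)
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / n


def pearson_r(observed, estimated):
    """Pearson correlation coefficient r between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)
```

Under these assumptions, a lower RMSE and MAE together with an r closer to 1 indicate a closer match between estimated and observed rainfall, which is the sense in which the summary ranks the three methods.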