Comparison of missing rainfall data treatment analysis at Kenyir Lake

Bibliographic Details
Main Authors: Azreen Harina, Azman, Nurul Nadrah Aqilah, Tukimat, M A, Male
Format: Article
Language: English
Published: IOP Publishing 2021
Online Access: http://umpir.ump.edu.my/id/eprint/31460/1/iscee2020.pdf
http://umpir.ump.edu.my/id/eprint/31460/
https://iopscience.iop.org/article
Description
Summary: Rainfall is among the most frequently used data in weather-related studies. Rainfall records sometimes contain missing values, which must be treated so that the data are useful, complete and reliable. Previous studies have suggested many methods for treating missing data. The best method for estimating missing rainfall data may vary between regions, depending on the rainfall pattern and spatial distribution. This paper therefore discusses and compares three methods of missing data treatment: Expectation Maximization (EM), Inverse Distance Weighted (IDW) and Multiple Imputation (MI). Based on the root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (r) and percentage of error (% of error), the best method is IDW. The IDW method has the lowest RMSE, MAE and % of error values, and its r value is the highest compared with the EM and MI methods. The MI method recorded the highest RMSE, MAE and % of error values with the lowest r value, which indicates that MI is the least accurate of the three methods for missing data treatment. After all methods were implemented, IDW proved to be the best way to treat the missing data, because the analysis shows that the monthly rainfall distribution for the 4 treatment stations is in line with that of the 3 missing data stations, more closely than with the EM and MI methods.
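
As an illustration of the IDW approach and the evaluation measures named in the summary, the following is a minimal Python sketch, not the authors' implementation. The function names, the power parameter of 2 and the sample numbers are all hypothetical. The record does not define "% of error", so the formula used here (total absolute error as a percentage of total observed rainfall) is an assumption.

```python
import math


def idw_estimate(neighbour_values, distances, power=2.0):
    """Estimate a missing value as an inverse-distance-weighted mean.

    neighbour_values: rainfall at neighbouring stations for the same period.
    distances: distance from each neighbour to the target station (> 0).
    power: distance exponent; 2 is a common choice, assumed here.
    """
    if len(neighbour_values) != len(distances) or not distances:
        raise ValueError("need one positive distance per neighbour value")
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be positive")
    weights = [1.0 / (d ** power) for d in distances]
    weighted_sum = sum(w * v for w, v in zip(weights, neighbour_values))
    return weighted_sum / sum(weights)


def rmse(observed, estimated):
    """Root mean square error between observed and estimated series."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error between observed and estimated series."""
    n = len(observed)
    return sum(abs(e - o) for o, e in zip(observed, estimated)) / n


def pearson_r(observed, estimated):
    """Pearson correlation coefficient between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)


def percent_error(observed, estimated):
    """Assumed definition: total absolute error as % of total observed."""
    total_abs_err = sum(abs(e - o) for o, e in zip(observed, estimated))
    return 100.0 * total_abs_err / sum(observed)


if __name__ == "__main__":
    # Hypothetical monthly totals (mm) at two neighbours, 1 km and 2 km away.
    print(idw_estimate([100.0, 200.0], [1.0, 2.0]))  # 120.0

    # Hypothetical observed vs. estimated values for scoring a method.
    obs = [10.0, 20.0, 30.0]
    est = [12.0, 18.0, 33.0]
    print(rmse(obs, est), mae(obs, est), pearson_r(obs, est), percent_error(obs, est))
```

Under these measures, a better treatment method gives lower RMSE, MAE and % of error and an r value closer to 1, which is the basis on which the summary ranks IDW above EM and MI.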