A comparative analysis of missing data imputation techniques on sedimentation data
Sediment data pertains to various hydrological variables with complex sediment hydrodynamics such as sedimentation rates which are often incompletely presented. Thus, the availability of sedimentation data is of utmost necessity for data accessibility. A comparative analysis on the missing fine sedi...
Main Authors: | Loh, Wing Son; Lloyd, Ling; Chin, Ren Jie; Lai, Sai Hin; Loo, Kar Kuan; Seah, Choon Sen |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier Ltd. 2024 |
Subjects: | TA Engineering (General). Civil engineering (General) |
Online Access: | http://ir.unimas.my/id/eprint/44864/2/A%20comparative%20analysi.pdf http://ir.unimas.my/id/eprint/44864/ https://www.sciencedirect.com/science/article/pii/S2090447924000923#:~:text=A%20comparative%20analysis%20on%20the,imputation%20(SI)%20and%20multiple%20imputation https://doi.org/10.1016/j.asej.2024.102717 |
id | my.unimas.ir.44864 |
---|---|
record_format | eprints |
spelling | my.unimas.ir.44864 2024-05-27T03:31:07Z http://ir.unimas.my/id/eprint/44864/ A comparative analysis of missing data imputation techniques on sedimentation data Loh, Wing Son Lloyd, Ling Chin, Ren Jie Lai, Sai Hin Loo, Kar Kuan Seah, Choon Sen TA Engineering (General). Civil engineering (General) Sediment data pertains to various hydrological variables with complex sediment hydrodynamics such as sedimentation rates which are often incompletely presented. Thus, the availability of sedimentation data is of utmost necessity for data accessibility. A comparative analysis on the missing fine sediment data imputation performance was made based on four different techniques, namely the k-Nearest Neighbourhood (k-NN), Support Vector Regression (SVR), Multiple Regression (MR), and Artificial Neural Network (ANN), under the single imputation (SI) and multiple imputation (MI) regimes. Across different missing data proportions (10%-50%), the ANN demonstrated optimal results with consistent performance metrics recorded over both SI and MI regimes. For the highest missing data proportion (50%), the ANN presented the best imputation performance with a reported root mean squared error (RMSE) 0.000882, mean absolute error (MAE) 0.000595, coefficient of determination (R²) 71%, and Kling-Gupta Efficiency (KGE) 72%. The imputation performance ranking is as follows: ANN, SVR, MR, and k-NN. Elsevier Ltd. 2024 Article PeerReviewed text en http://ir.unimas.my/id/eprint/44864/2/A%20comparative%20analysi.pdf Loh, Wing Son and Lloyd, Ling and Chin, Ren Jie and Lai, Sai Hin and Loo, Kar Kuan and Seah, Choon Sen (2024) A comparative analysis of missing data imputation techniques on sedimentation data. Ain Shams Engineering Journal, 15 (6). pp. 1-20. ISSN 2090-4495 https://www.sciencedirect.com/science/article/pii/S2090447924000923#:~:text=A%20comparative%20analysis%20on%20the,imputation%20(SI)%20and%20multiple%20imputation https://doi.org/10.1016/j.asej.2024.102717 |
institution | Universiti Malaysia Sarawak |
building | Centre for Academic Information Services (CAIS) |
collection | Institutional Repository |
continent | Asia |
country | Malaysia |
content_provider | Universiti Malaysia Sarawak |
content_source | UNIMAS Institutional Repository |
url_provider | http://ir.unimas.my/ |
language | English |
topic | TA Engineering (General). Civil engineering (General) |
spellingShingle | TA Engineering (General). Civil engineering (General) Loh, Wing Son Lloyd, Ling Chin, Ren Jie Lai, Sai Hin Loo, Kar Kuan Seah, Choon Sen A comparative analysis of missing data imputation techniques on sedimentation data |
description | Sediment data pertains to various hydrological variables with complex sediment hydrodynamics such as sedimentation rates which are often incompletely presented. Thus, the availability of sedimentation data is of utmost necessity for data accessibility. A comparative analysis on the missing fine sediment data imputation performance was made based on four different techniques, namely the k-Nearest Neighbourhood (k-NN), Support Vector Regression (SVR), Multiple Regression (MR), and Artificial Neural Network (ANN), under the single imputation (SI) and multiple imputation (MI) regimes. Across different missing data proportions (10%-50%), the ANN demonstrated optimal results with consistent performance metrics recorded over both SI and MI regimes. For the highest missing data proportion (50%), the ANN presented the best imputation performance with a reported root mean squared error (RMSE) 0.000882, mean absolute error (MAE) 0.000595, coefficient of determination (R²) 71%, and Kling-Gupta Efficiency (KGE) 72%. The imputation performance ranking is as follows: ANN, SVR, MR, and k-NN. |
format | Article |
author | Loh, Wing Son Lloyd, Ling Chin, Ren Jie Lai, Sai Hin Loo, Kar Kuan Seah, Choon Sen |
author_facet | Loh, Wing Son Lloyd, Ling Chin, Ren Jie Lai, Sai Hin Loo, Kar Kuan Seah, Choon Sen |
author_sort | Loh, Wing Son |
title | A comparative analysis of missing data imputation techniques on sedimentation data |
title_short | A comparative analysis of missing data imputation techniques on sedimentation data |
title_full | A comparative analysis of missing data imputation techniques on sedimentation data |
title_fullStr | A comparative analysis of missing data imputation techniques on sedimentation data |
title_full_unstemmed | A comparative analysis of missing data imputation techniques on sedimentation data |
title_sort | comparative analysis of missing data imputation techniques on sedimentation data |
publisher | Elsevier Ltd. |
publishDate | 2024 |
url | http://ir.unimas.my/id/eprint/44864/2/A%20comparative%20analysi.pdf http://ir.unimas.my/id/eprint/44864/ https://www.sciencedirect.com/science/article/pii/S2090447924000923#:~:text=A%20comparative%20analysis%20on%20the,imputation%20(SI)%20and%20multiple%20imputation https://doi.org/10.1016/j.asej.2024.102717 |
_version_ | 1800728212535246848 |
score | 13.211869 |
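
The abstract in this record reports imputation quality as RMSE, MAE, R² and KGE. For readers unfamiliar with these metrics, the following is a minimal sketch of how they are commonly computed from observed and imputed values. It is not taken from the article: the function names are hypothetical, R² is assumed to be the form 1 - SS_res/SS_tot (the article may instead use the squared correlation), and KGE is assumed to be the standard 2009 formulation based on correlation, variability ratio and bias ratio.

```python
import math
from statistics import fmean, pstdev


def rmse(obs, sim):
    """Root mean squared error between observed and imputed values."""
    return math.sqrt(fmean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def mae(obs, sim):
    """Mean absolute error between observed and imputed values."""
    return fmean([abs(o - s) for o, s in zip(obs, sim)])


def r2(obs, sim):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = fmean(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed 2009 form).

    Combines the Pearson correlation r, the variability ratio
    alpha = sd(sim) / sd(obs) and the bias ratio
    beta = mean(sim) / mean(obs); a perfect match gives 1.
    """
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    sd_obs, sd_sim = pstdev(obs), pstdev(sim)
    cov = fmean([(o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)])
    r = cov / (sd_obs * sd_sim)
    alpha = sd_sim / sd_obs
    beta = mean_sim / mean_obs
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

All four functions take two equal-length sequences of numbers: the values that were held out as "missing" and the values an imputation method produced for them. Lower RMSE and MAE, and R² and KGE closer to 1, indicate better imputation.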