Empirical performance evaluation of imputation techniques using medical dataset

This paper evaluates the error measures of missing value imputations in medical research. Several imputation techniques have been designed and implemented; however, the degree of deviation of the imputed values from the original values has not been given adequate attention. Predic...

Bibliographic Details
Main Authors: Alade, O. A., Sallehuddin, R., Selamat, A.
Format: Conference or Workshop Item
Language: English
Published: 2019
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/89908/1/OyekaleAbelAlade2019_EmpiricalPerformanceEvaluation.pdf
http://eprints.utm.my/id/eprint/89908/
https://dx.doi.org/10.1088/1757-899X/551/1/012055
id my.utm.89908
record_format eprints
spelling my.utm.89908 2021-03-04T02:45:01Z http://eprints.utm.my/id/eprint/89908/ Empirical performance evaluation of imputation techniques using medical dataset Alade, O. A. Sallehuddin, R. Selamat, A. QA75 Electronic computers. Computer science 2019 Conference or Workshop Item PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/89908/1/OyekaleAbelAlade2019_EmpiricalPerformanceEvaluation.pdf Alade, O. A. and Sallehuddin, R. and Selamat, A. (2019) Empirical performance evaluation of imputation techniques using medical dataset. In: International Conference on Green Engineering Technology and Applied Computing 2019, IConGETech 2019 and International Conference on Applied Computing 2019, ICAC 2019, 4-5 Feb 2019, Eastin Hotel Makkasan, Bangkok, Thailand. https://dx.doi.org/10.1088/1757-899X/551/1/012055
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description This paper evaluates the error measures of missing value imputations in medical research. Several imputation techniques have been designed and implemented; however, the degree of deviation of the imputed values from the original values has not been given adequate attention. Predictive Mean Matching Imputation (PMMI) and K-Nearest Neighbour Imputation (KNNI) were applied to impute a fertility dataset under three missingness mechanisms: Missing At Random (MAR), Missing Completely At Random (MCAR) and Missing Not At Random (MNAR). The results were evaluated by mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE). PMMI performed better than KNNI in all the results. For example, the MSE of PMMI against KNNI was 0.0260 against 2.8555 for 1-10% MAR (a 99.09% reduction in error), 0.1108 against 3.0120 for 30-40% MCAR (a 96.32% reduction), and 0.0642 against 3.7187 for 40-50% MNAR (a 98.27% reduction). MCAR was the most consistent missingness mechanism for the evaluations. Density distributions of the imputed dataset were compared with those of the original dataset, and the distribution plots of the imputed data followed the curve of the original dataset.
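The three error measures and the quoted reductions can be illustrated with a minimal Python sketch. The function names `error_measures` and `error_reduction_pct` are hypothetical and are not taken from the paper. The reduction formula, one minus the PMMI/KNNI ratio, is an inference that happens to reproduce the percentages in the abstract; the record does not state how the authors computed it.

```python
import math


def error_measures(original, imputed):
    """Return (MSE, RMSE, MAE) for paired original and imputed values.

    Assumption: only the cells that were masked and then imputed are passed in.
    """
    if len(original) != len(imputed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [o - i for o, i in zip(original, imputed)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return mse, math.sqrt(mse), mae


def error_reduction_pct(mse_pmmi, mse_knni):
    """Percentage by which the PMMI error falls below the KNNI error (assumed formula)."""
    return (1.0 - mse_pmmi / mse_knni) * 100.0


# MSE pairs (PMMI, KNNI) exactly as quoted in the abstract.
reported = {
    "1-10% MAR": (0.0260, 2.8555),
    "30-40% MCAR": (0.1108, 3.0120),
    "40-50% MNAR": (0.0642, 3.7187),
}
reductions = {
    label: round(error_reduction_pct(pmmi, knni), 2)
    for label, (pmmi, knni) in reported.items()
}

# Toy values for illustration only; they are not data from the study.
toy_mse, toy_rmse, toy_mae = error_measures([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])

for label, pct in reductions.items():
    print(f"{label}: {pct}% lower MSE with PMMI")
```

Passing only the masked cells keeps the observed, untouched values from diluting the error, which is the usual way to score an imputation against a known ground truth.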