Empirical performance evaluation of imputation techniques using medical dataset
This paper evaluates the error measures of missing value imputations in medical research. Several imputation techniques have been designed and implemented; however, the degree of deviation of the imputed values from the original values has not been given adequate attention. Predic...
Main Authors: | Alade, O. A., Sallehuddin, R., Selamat, A. |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2019 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/89908/1/OyekaleAbelAlade2019_EmpiricalPerformanceEvaluation.pdf http://eprints.utm.my/id/eprint/89908/ https://dx.doi.org/10.1088/1757-899X/551/1/012055 |
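The error measures named in this record's description (MSE, RMSE, MAE) and the "reduced error rate" percentages quoted there can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the function names are illustrative, and the reduction is assumed to be computed as 1 - PMMI/KNNI, which matches the three percentages reported in the description.

```python
import math

def mse(original, imputed):
    """Mean square error between original and imputed values."""
    return sum((o - i) ** 2 for o, i in zip(original, imputed)) / len(original)

def rmse(original, imputed):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(original, imputed))

def mae(original, imputed):
    """Mean absolute error between original and imputed values."""
    return sum(abs(o - i) for o, i in zip(original, imputed)) / len(original)

def error_reduction_pct(better, worse):
    """Percentage by which `better` undercuts `worse` (assumed formula)."""
    return (1 - better / worse) * 100

# MSE pairs (PMMI, KNNI) as quoted in the record's description.
reported_mse = {
    "1-10% MAR": (0.0260, 2.8555),
    "30-40% MCAR": (0.1108, 3.0120),
    "40-50% MNAR": (0.0642, 3.7187),
}

reductions = {
    setting: round(error_reduction_pct(pmmi, knni), 2)
    for setting, (pmmi, knni) in reported_mse.items()
}

for setting, pct in reductions.items():
    print(f"{setting}: {pct}% lower MSE for PMMI")
```

Under that assumed formula the three settings give 99.09%, 96.32% and 98.27%, in agreement with the figures in the description.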
id |
my.utm.89908 |
---|---|
record_format |
eprints |
spelling |
my.utm.89908 2021-03-04T02:45:01Z http://eprints.utm.my/id/eprint/89908/ Empirical performance evaluation of imputation techniques using medical dataset Alade, O. A. Sallehuddin, R. Selamat, A. QA75 Electronic computers. Computer science 2019 Conference or Workshop Item PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/89908/1/OyekaleAbelAlade2019_EmpiricalPerformanceEvaluation.pdf Alade, O. A. and Sallehuddin, R. and Selamat, A. (2019) Empirical performance evaluation of imputation techniques using medical dataset. In: International Conference on Green Engineering Technology and Applied Computing 2019, IConGETech2019 and International Conference on Applied Computing 2019, ICAC 2019, 4-5 Feb 2019, Eastin Hotel Makkasan, Bangkok, Thailand. https://dx.doi.org/10.1088/1757-899X/551/1/012055 |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
description |
This paper evaluates the error measures of missing value imputations in medical research. Several imputation techniques have been designed and implemented; however, the degree of deviation of the imputed values from the original values has not been given adequate attention. Predictive Mean Matching Imputation (PMMI) and K-Nearest Neighbour Imputation (KNNI) were applied to impute a fertility dataset under three mechanisms of missing values: Missing At Random (MAR), Missing Completely At Random (MCAR) and Missing Not At Random (MNAR). The results were evaluated by mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE). PMMI performed better than KNNI in all the results. For MSE, for example, the PMMI/KNNI values were 0.0260/2.8555 for 1-10% MAR (99.09% reduced error rate), 0.1108/3.0120 for 30-40% MCAR (96.32% reduced error rate) and 0.0642/3.7187 for 40-50% MNAR (98.27% reduced error rate). MCAR was the most consistent missingness mechanism for the evaluations. Density distributions of the imputed dataset were compared with those of the original dataset; the distribution plots of the imputed missing data followed the curve of the original dataset. |
format |
Conference or Workshop Item |
author |
Alade, O. A. Sallehuddin, R. Selamat, A. |
title |
Empirical performance evaluation of imputation techniques using medical dataset |
publishDate |
2019 |
url |
http://eprints.utm.my/id/eprint/89908/1/OyekaleAbelAlade2019_EmpiricalPerformanceEvaluation.pdf http://eprints.utm.my/id/eprint/89908/ https://dx.doi.org/10.1088/1757-899X/551/1/012055 |
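The description distinguishes three missingness mechanisms (MCAR, MAR, MNAR). The sketch below illustrates how they differ when missing values are injected into a complete column before imputation. It is an illustration only, not the procedure used in the paper: the column names `age` and `hours_sitting`, the median cutoff, and the doubling of the drop probability above the cutoff are all assumptions made for this example.

```python
import random

def mcar_mask(n, rate, rng):
    """MCAR: every entry is dropped with the same probability."""
    return [rng.random() < rate for _ in range(n)]

def mar_mask(covariate, rate, rng):
    """MAR: the drop probability depends only on another, fully observed column."""
    cutoff = sorted(covariate)[len(covariate) // 2]
    p_high = min(1.0, 2 * rate)  # assumed rule: only rows above the cutoff can be dropped
    return [c > cutoff and rng.random() < p_high for c in covariate]

def mnar_mask(values, rate, rng):
    """MNAR: the drop probability depends on the (unobserved) value itself."""
    cutoff = sorted(values)[len(values) // 2]
    p_high = min(1.0, 2 * rate)
    return [v > cutoff and rng.random() < p_high for v in values]

def apply_mask(values, mask):
    """Replace masked entries with None to mark them as missing."""
    return [None if m else v for v, m in zip(values, mask)]

rng = random.Random(0)
age = [rng.uniform(18, 36) for _ in range(100)]            # hypothetical observed covariate
hours_sitting = [rng.uniform(1, 16) for _ in range(100)]   # hypothetical column to be masked

with_mcar = apply_mask(hours_sitting, mcar_mask(len(hours_sitting), 0.3, rng))
with_mar = apply_mask(hours_sitting, mar_mask(age, 0.3, rng))
with_mnar = apply_mask(hours_sitting, mnar_mask(hours_sitting, 0.3, rng))
```

Keeping the original column alongside each masked copy is what makes error measures such as MSE, RMSE and MAE computable afterwards, since the imputed entries can be compared with the values that were deliberately removed.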