Heart disease prediction using ensemble of k-nearest neighbour, random forest and logistic regression method

Coronary heart disease is ranked as the leading cause of death in Malaysia. According to data published by the WHO in 2018, deaths caused by this disease reached 34,766, which is 24.69% of total deaths and places Malaysia 64th in the world. Medical...

Bibliographic Details
Main Authors: Mohd Syafiq Asyraf, Suhaimi; Nor Azuana, Ramli; Noryanti, Muhammad
Format: Article
Language: English
Published: AIP Publishing 2024
Subjects: QA75 Electronic computers. Computer science; RC Internal medicine
Online Access: http://umpir.ump.edu.my/id/eprint/41052/1/syafiqicoaims.pdf
http://umpir.ump.edu.my/id/eprint/41052/7/Heart%20disease%20prediction%20using%20ensemble%20of%20k-nearest_ABST.pdf
http://umpir.ump.edu.my/id/eprint/41052/
https://doi.org/10.1063/5.0192203
id my.ump.umpir.41052
record_format eprints
spelling my.ump.umpir.41052 2024-04-24T04:20:56Z http://umpir.ump.edu.my/id/eprint/41052/
Mohd Syafiq Asyraf, Suhaimi and Nor Azuana, Ramli and Noryanti, Muhammad (2024) Heart disease prediction using ensemble of k-nearest neighbour, random forest and logistic regression method. AIP Conference Proceedings, 2895 (1). pp. 1-10. ISSN 0094-243X. (Published) https://doi.org/10.1063/5.0192203
institution Universiti Malaysia Pahang Al-Sultan Abdullah
building UMPSA Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang Al-Sultan Abdullah
content_source UMPSA Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA75 Electronic computers. Computer science
RC Internal medicine
description Coronary heart disease is ranked as the leading cause of death in Malaysia. According to data published by the WHO in 2018, deaths caused by this disease reached 34,766, which is 24.69% of total deaths and places Malaysia 64th in the world. Medical researchers around the world believe that multiple factors contribute to this disease, including health problems, unhealthy personal habits, genetics, and family history. Predicting heart disease is not an easy task, since it requires a broad range of expertise from many disciplines. Recently, machine learning has been applied as one of the methods to predict heart disease. To test the accuracy of different machine learning methods, this study uses data extracted from the machine learning repository. The proposed predictive model was developed using an ensemble method. The ensemble technique used was stacking, where logistic regression served as the meta-level classifier while Random Forest and k-nearest neighbour were applied as the base-level classifiers. The results show that the proposed method outperforms the single methods, with an accuracy of 82.42%. Although the accuracy and RMSE of the ensemble method are similar to those of Random Forest, the proposed method is still the best method since it achieves an area under the ROC curve of 0.903 and an F1 score of 0.843. In future work, the proposed predictive model will be applied to smartwatch datasets.
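The stacking arrangement described in the abstract (Random Forest and k-nearest neighbour as base learners, logistic regression as the meta-level classifier) can be sketched with scikit-learn's StackingClassifier. This is a minimal illustration and not the authors' implementation: the synthetic dataset, the hyperparameters, the feature scaling in front of k-nearest neighbour, the train/test split, and the helper names build_stacking_model, evaluate, and run_demo are all assumptions made for the example. RMSE is computed here on the predicted class labels, which may differ from the definition used in the paper.

```python
# Illustrative sketch only: synthetic data stands in for the study's dataset,
# and every hyperparameter below is an assumption, not a value from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, mean_squared_error,
                             roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_stacking_model(random_state=0):
    """Stack Random Forest and k-NN under a logistic regression meta-learner."""
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=100,
                                      random_state=random_state)),
        # k-NN is distance based, so features are scaled first (an assumption).
        ("knn", make_pipeline(StandardScaler(),
                              KNeighborsClassifier(n_neighbors=5))),
    ]
    # cv=5: the meta-learner is trained on out-of-fold base predictions.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


def evaluate(model, X_test, y_test):
    """Return the four metrics named in the abstract for a fitted model."""
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
        # RMSE on 0/1 labels; the paper's exact definition is not given here.
        "rmse": float(np.sqrt(mean_squared_error(y_test, y_pred))),
    }


def run_demo(random_state=0):
    """Fit the stack on synthetic binary data and score a held-out split."""
    X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                               random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state)
    model = build_stacking_model(random_state).fit(X_train, y_train)
    return evaluate(model, X_test, y_test)


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

The scores printed for the synthetic data say nothing about the figures reported in the abstract; the sketch only shows how the base and meta levels fit together and how the four metrics can be computed side by side.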
format Article
author Mohd Syafiq Asyraf, Suhaimi
Nor Azuana, Ramli
Noryanti, Muhammad
title Heart disease prediction using ensemble of k-nearest neighbour, random forest and logistic regression method
publisher AIP Publishing
publishDate 2024
url http://umpir.ump.edu.my/id/eprint/41052/1/syafiqicoaims.pdf
http://umpir.ump.edu.my/id/eprint/41052/7/Heart%20disease%20prediction%20using%20ensemble%20of%20k-nearest_ABST.pdf
http://umpir.ump.edu.my/id/eprint/41052/
https://doi.org/10.1063/5.0192203