Heart disease prediction using ensemble of k-nearest neighbour, random forest and logistic regression method

Bibliographic Details
Main Authors: Mohd Syafiq Asyraf, Suhaimi; Nor Azuana, Ramli; Noryanti, Muhammad
Format: Article
Language: English
Published: AIP Publishing 2024
Online Access: http://umpir.ump.edu.my/id/eprint/41052/1/syafiqicoaims.pdf
http://umpir.ump.edu.my/id/eprint/41052/7/Heart%20disease%20prediction%20using%20ensemble%20of%20k-nearest_ABST.pdf
http://umpir.ump.edu.my/id/eprint/41052/
https://doi.org/10.1063/5.0192203
Description
Summary: Coronary heart disease is ranked as the leading cause of death in Malaysia. According to data published by the WHO in 2018, deaths from this disease reached 34,766, or 24.69% of total deaths, placing Malaysia 64th in the world. Medical researchers around the world believe that multiple factors contribute to this disease, including health problems, unhealthy personal habits, genetics, and family history. Predicting heart disease is not an easy task, since it requires a broad range of expertise from many disciplines. Recently, machine learning has been applied as one method of predicting heart disease. To test the accuracy of different machine learning methods, this study uses data extracted from the machine learning repository. The proposed predictive model was developed using an ensemble method. The ensemble technique used was stacking, in which logistic regression served as the meta-level classifier while Random Forest and k-nearest neighbour were applied as the base-level classifiers. The results show that the proposed method outperforms the single methods, with an accuracy of 82.42%. Although the accuracy and RMSE of the ensemble method are similar to those of Random Forest, the proposed method is still the best since it achieves an area under the ROC curve of 0.903 and an F1 score of 0.843. In future work, the proposed predictive model will be applied to smartwatch datasets.
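
The stacking arrangement described in the summary could be sketched roughly as follows, using scikit-learn. This is only an illustrative sketch, not the authors' implementation: the record does not name the dataset, software, or hyperparameters, so the synthetic data from make_classification, the 13-feature shape, the 70/30 split, the feature scaling before k-nearest neighbour, and all parameter values below are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary-classification data stands in for the
# heart disease dataset, which the record does not identify.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Base-level classifiers: k-nearest neighbour and Random Forest.
# Scaling before KNN and all hyperparameters are illustrative choices.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Meta-level classifier: logistic regression, trained on cross-validated
# predictions of the base learners (cv=5 is an assumption).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

# Evaluate with three of the metrics named in the summary.
y_pred = stack.predict(X_test)
y_prob = stack.predict_proba(X_test)[:, 1]
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(metrics)
```

The numbers printed by this sketch come from synthetic data and are not comparable to the figures reported in the summary.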