Performance comparison of feature selection methods for prediction in medical data

Bibliographic Details
Main Authors: Mohd Khalid, Nur Hidayah; Ismail, Amelia Ritahani; Abdul Aziz, Normaziah; Amir Hussin, Amir 'Aatieff
Format: Conference or Workshop Item
Language: English
Published: Springer Nature 2023
Online Access: http://irep.iium.edu.my/105807/6/105807_Performance%20comparison%20of%20feature%20selection.PDF
http://irep.iium.edu.my/105807/7/105807_Performance%20comparison%20of%20feature%20selection_Scopus.pdf
http://irep.iium.edu.my/105807/
https://link.springer.com/chapter/10.1007/978-981-99-0405-1_7
https://doi.org/10.1007/978-981-99-0405-1_7
Description
Summary: Along with technological advancement, the application of machine learning algorithms in industry, notably in the medical field, has grown and progressed quickly. Medical databases commonly contain a great deal of information about patients' medical histories and conditions; in addition, it is challenging to identify and extract the information that will be relevant and meaningful for machine learning modelling. Moreover, the efficacy of a predictive machine learning algorithm can be enhanced by using only useful and pertinent information. Hence, feature selection is proposed to determine the significant features, and it should be fully utilized when building machine learning models. This study analyzes filter, wrapper, and embedded feature selection methods for medical data with two predictive machine learning algorithms, Random Forest and CatBoost. The experiment evaluates the performance of the models with and without feature selection. According to the results, CatBoost with recursive feature elimination (RFE) shows the best performance in comparison to Random Forest with other feature selection methods.