Performance comparison of feature selection methods for prediction in medical data

Bibliographic Details
Main Authors: Mohd Khalid, Nur Hidayah; Ismail, Amelia Ritahani; Abdul Aziz, Normaziah; Amir Hussin, Amir 'Aatieff
Format: Conference or Workshop Item
Language: English
Published: Springer Nature 2023
Online Access: http://irep.iium.edu.my/105807/6/105807_Performance%20comparison%20of%20feature%20selection.PDF
http://irep.iium.edu.my/105807/7/105807_Performance%20comparison%20of%20feature%20selection_Scopus.pdf
http://irep.iium.edu.my/105807/
https://link.springer.com/chapter/10.1007/978-981-99-0405-1_7
https://doi.org/10.1007/978-981-99-0405-1_7
Other Bibliographic Details
Summary: Along with technological advancement, the application of machine learning algorithms in industry, notably in the medical field, has grown and progressed quickly. Medical databases commonly contain a large amount of information about patients' medical histories and conditions, which makes it challenging to identify and extract the information that is relevant and meaningful for machine learning modelling. Moreover, the efficacy of a predictive machine learning algorithm can be enhanced by using only useful and pertinent information. Hence, feature selection is proposed to determine the significant features, and it should be fully utilized when building machine learning models. This study analyzes filter, wrapper, and embedded feature selection methods for medical data with two predictive machine learning algorithms, Random Forest and CatBoost. The experiment evaluates the performance of the machine learning models with and without feature selection. According to the results, CatBoost with RFE shows the best performance in comparison to Random Forest with other feature selection methods.