A case study of microarray breast cancer classification using machine learning algorithms with grid search cross validation

Bibliographic Details
Main Authors: Mohd Ali, Nursabillilah, Besar, Rosli, Ab Aziz, Nor Azlina
Format: Article
Language: English
Published: Institute of Advanced Engineering and Science 2023
Online Access: http://eprints.utem.edu.my/id/eprint/26481/2/4838-13012-1-PB.PDF
http://eprints.utem.edu.my/id/eprint/26481/
https://beei.org/index.php/EEI/article/view/4838/3270
Summary: Breast cancer is one of the leading causes of death and the most frequently diagnosed cancer among women. Almost half a million women die from the disease each year. Machine learning is a subfield of artificial intelligence (AI) and computer science that uses data and algorithms to mimic how humans learn, gradually improving in accuracy. In this work, simple machine learning methods are used to classify breast cancer microarray data as normal or relapse. The data come from two Gene Expression Omnibus (GEO) datasets, GSE45255 and GSE15852, which are combined into a single dataset. The study involves three machine learning algorithms: random forest (RF), extra trees (ET), and support vector machine (SVM). Grid search cross validation (CV) is applied to tune the hyperparameters of each algorithm. The results show that the tuned SVM performs best among the tested algorithms, with an accuracy of 97.78%. For future work, the authors recommend adding a feature selection method to identify the optimal features and improve classification accuracy.
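
The tuning procedure described in the summary can be sketched with scikit-learn's `GridSearchCV`. This is only an illustration under stated assumptions, not the authors' code: the data are a synthetic stand-in generated with `make_classification` (the record does not include the combined GSE45255 and GSE15852 matrix), and the hyperparameter grids, the 5-fold setting, the 80/20 split, and the feature scaling before the SVM are all hypothetical choices that the summary does not specify. The accuracies it prints will therefore not match the reported 97.78%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the combined microarray dataset
# (rows = samples, columns = gene expression features, y = normal/relapse).
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# ASSUMPTION: illustrative grids; the grids used in the paper are not given here.
tree_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
candidates = {
    "RF": (RandomForestClassifier(random_state=0), tree_grid),
    "ET": (ExtraTreesClassifier(random_state=0), tree_grid),
    # Scaling is placed inside the pipeline so it is refit on each CV fold.
    "SVM": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Exhaustively evaluate every grid combination with 5-fold CV,
    # then refit the best combination on the full training split.
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

for name, r in results.items():
    print(f"{name}: CV={r['cv_accuracy']:.3f} "
          f"test={r['test_accuracy']:.3f} params={r['best_params']}")
```

Keeping a held-out test split separate from the cross-validation folds means the reported test accuracy is not computed on data that influenced the choice of hyperparameters.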