A case study of microarray breast cancer classification using machine learning algorithms with grid search cross validation

Breast cancer is one of the leading causes of death and the most frequently diagnosed cancer among women. Annually, almost half a million women die from breast cancer. Machine learning is a subfield of artificial intelligence (AI) and computer science that uses data and...

Bibliographic Details
Main Authors: Mohd Ali, Nursabillilah; Besar, Rosli; Ab Aziz, Nor Azlina
Format: Article
Language: English
Published: Institute of Advanced Engineering and Science 2023
Online Access: http://eprints.utem.edu.my/id/eprint/26481/2/4838-13012-1-PB.PDF
http://eprints.utem.edu.my/id/eprint/26481/
https://beei.org/index.php/EEI/article/view/4838/3270
id my.utem.eprints.26481
record_format eprints
spelling my.utem.eprints.26481 2023-02-28T08:03:58Z Mohd Ali, Nursabillilah and Besar, Rosli and Ab Aziz, Nor Azlina (2023) A case study of microarray breast cancer classification using machine learning algorithms with grid search cross validation. Bulletin of Electrical Engineering and Informatics, 12 (2). pp. 1047-1054. ISSN 2302-9285. Institute of Advanced Engineering and Science, 2023-04. Article, PeerReviewed. DOI: 10.11591/eei.v12i2.4838
institution Universiti Teknikal Malaysia Melaka
building UTEM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknikal Malaysia Melaka
content_source UTEM Institutional Repository
url_provider http://eprints.utem.edu.my/
language English
description Breast cancer is one of the leading causes of death and the most frequently diagnosed cancer among women. Annually, almost half a million women die from breast cancer. Machine learning is a subfield of artificial intelligence (AI) and computer science that uses data and algorithms to mimic how humans learn, gradually improving its accuracy. In this work, simple machine learning methods are used to classify breast cancer microarray data as normal or relapse. The data come from the Gene Expression Omnibus (GEO) website, namely datasets GSE45255 and GSE15852. These two datasets are combined to form a single dataset. The study involved three machine learning algorithms: random forest (RF), extra tree (ET), and support vector machine (SVM). Grid search cross validation (CV) is applied for hyperparameter tuning of the algorithms. The results show that the tuned SVM is the best among the tested algorithms, with an accuracy of 97.78%. For future work, it is recommended to include a feature selection method to obtain the optimal features and better classification accuracy.
format Article
author Mohd Ali, Nursabillilah
Besar, Rosli
Ab Aziz, Nor Azlina
title A case study of microarray breast cancer classification using machine learning algorithms with grid search cross validation
publisher Institute of Advanced Engineering and Science
publishDate 2023
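
The description mentions grid search cross validation for hyperparameter tuning. The mechanics can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic two-dimensional points rather than the GEO microarray datasets, a k-nearest-neighbours vote stands in for the RF, ET, and SVM classifiers, and every function name and parameter (`k_fold_indices`, `knn_predict`, `cv_accuracy`, `grid_search_cv`, `n_neighbors`) is hypothetical.

```python
import itertools
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors):
    """Stand-in classifier (assumption): majority vote of the nearest samples."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:n_neighbors]]
    # sorted() makes tie-breaking deterministic (lowest label wins a tie).
    return max(sorted(set(votes)), key=votes.count)


def cv_accuracy(X, y, folds, params):
    """Mean validation accuracy, holding out each fold exactly once."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], **params) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def grid_search_cv(X, y, param_grid, k=5, seed=0):
    """Score every parameter combination by k-fold CV and keep the best."""
    folds = k_fold_indices(len(X), k, seed)
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(X, y, folds, params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic two-class data (NOT the GSE45255/GSE15852 microarray data).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(30)]
y = [0] * 30 + [1] * 30

best_params, best_score = grid_search_cv(X, y, {"n_neighbors": [1, 3, 5]}, k=5)
print(best_params, round(best_score, 4))
```

The same folds are reused for every parameter combination so that the scores are comparable. In practice a library routine such as scikit-learn's `GridSearchCV` would normally replace the hand-written loop, and accuracy should be reported on data held out from the tuning step.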