An integrated framework based deep learning for cancer classification using microarray datasets.

Bibliographic Details
Main Authors: Alrefai, Nashat; Ibrahim, Othman; Shehzad, Hafiz Muhammad Faisal; Altigani, Abdelrahman; Abu-ulbeh, Waheeb; Alzaqebah, Malek; Alsmadi, Mutasem K.
Format: Article
Published: Springer Science and Business Media Deutschland GmbH 2023
Online Access: http://eprints.utm.my/106234/
http://dx.doi.org/10.1007/s12652-022-04482-9
Description
Summary: Cancer is one of the leading causes of mortality worldwide. Early detection and prognosis of cancer types are highly significant for patients' health. In recent research, deep neural networks have been trained on gene expression microarray data to classify cancer. Microarray technology lets biologists monitor thousands of genes in one experiment. Microarray datasets are considered high-dimensional data, as they are cluttered with irrelevant, redundant, and noisy genes that contribute insignificantly to classification. The most informative genes contributing to cancer classification have been identified using computational intelligence algorithms. In this paper, we propose an integrated framework for cancer classification. This framework is divided into three tasks. First, particle swarm optimization with ensemble learning (PSO-ensemble) reduces the high dimensionality of the microarray dataset. Second, the adaptive self-training method (ASTM) is used to address the small-sample-size problem. Finally, a convolutional neural network (CNN) is employed for classification. The CNN can discover complex non-linear relationships between features and select the most informative ones. Transfer learning is used sequentially with the CNN to integrate the classification procedure because it can reduce training time and computational complexity. Six microarray datasets are used: liver, breast, colon, prostate, central nervous system (CNS), and lung. The proposed CNN architecture with transfer learning achieved 100% classification accuracy on the colon, prostate, CNS, and lung microarray datasets, and 97.62% and 95.45% accuracy on the liver and breast cancer datasets, respectively. Experiments show that our proposed method delivers the highest classification accuracy and reduces training time with the smallest gene subset.
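
The first task in the summary, reducing dimensionality with particle swarm optimization, can be illustrated with a minimal binary PSO sketch. This is not the authors' PSO-ensemble: the ensemble is replaced here by a single nearest-centroid classifier scored with leave-one-out accuracy, the data are synthetic, and every name and parameter value (make_toy_data, loo_accuracy, fitness, binary_pso, size_penalty, w, c1, c2) is a hypothetical choice made for illustration only.

```python
import math
import random

def make_toy_data(rng, n_samples=20, n_genes=30, n_informative=3):
    """Synthetic stand-in for a microarray matrix: only the first genes carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label  # shift informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen genes."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = {}
        for label in (0, 1):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
            dists[label] = sum((X[i][g] - c) ** 2 for g, c in zip(genes, centroid))
        correct += int(min(dists, key=dists.get) == y[i])
    return correct / len(X)

def fitness(X, y, mask, size_penalty=0.1):
    """Reward accuracy, penalize large gene subsets (penalty weight is an assumption)."""
    genes = [g for g, bit in enumerate(mask) if bit]
    return loo_accuracy(X, y, genes) - size_penalty * len(genes) / len(mask)

def binary_pso(X, y, rng, n_particles=12, n_iters=20, w=0.7, c1=1.5, c2=1.5):
    """Binary PSO: each particle is a True/False mask over genes."""
    n_genes = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(n_genes)] for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    best = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]
    for _ in range(n_iters):
        for k in range(n_particles):
            for g in range(n_genes):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update toward personal and global bests.
                vel[k][g] = (w * vel[k][g]
                             + c1 * r1 * (pbest[k][g] - pos[k][g])
                             + c2 * r2 * (gbest[g] - pos[k][g]))
                vel[k][g] = max(-6.0, min(6.0, vel[k][g]))  # clamp velocity
                # Sigmoid of the velocity gives the probability that the bit is set.
                pos[k][g] = rng.random() < 1.0 / (1.0 + math.exp(-vel[k][g]))
            f = fitness(X, y, pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit

rng = random.Random(0)
X, y = make_toy_data(rng)
mask, score = binary_pso(X, y, rng)
selected = [g for g, bit in enumerate(mask) if bit]
print(f"selected genes: {selected}, fitness: {score:.3f}")
```

The size penalty in the fitness function is one simple way to favor small gene subsets, in the spirit of the summary's "smallest gene subset"; the paper may weigh accuracy and subset size differently.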