A comparative study of nature-inspired metaheuristic algorithms using a three-phase hybrid approach for gene selection and classification in high-dimensional cancer datasets

Bibliographic Details
Main Authors: Hameed, Shilan S., Hassan, Wan Haslina, Abdul Latiff, Liza, Muhammad, Fahmi F.
Format: Article
Published: Springer Science and Business Media Deutschland GmbH 2021
Online Access: http://eprints.utm.my/id/eprint/94946/
http://dx.doi.org/10.1007/s00500-021-05726-0
Description
Summary: Identification of informative genes is essential for disease and cancer studies. Metaheuristic algorithms have been widely used for this purpose. However, their performance on various high-dimensional datasets from genomic studies has not been fully addressed. This work presents a comprehensive comparative analysis of three well-known nature-inspired metaheuristic algorithms, namely binary particle swarm optimization (BPSO), genetic algorithm (GA) and cuckoo search algorithm (CS), when they are used for gene selection and classification in twelve high-dimensional cancer datasets. The methodology used a three-phase hybrid approach: a pre-processing filtration using the Pearson product-moment correlation coefficient (PPMCC), followed by the metaheuristic algorithm and then the classification algorithm. For comparison, five different classification algorithms were used in each phase of the analysis. Applying the PPMCC filter reduced the computational complexity of the overall analysis. The comparative study showed that BPSO outperformed GA and CS in terms of accuracy. However, CS selected fewer genes and was less computationally complex than GA and BPSO.
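The three-phase pipeline described in the summary (filter, metaheuristic search, classification) can be illustrated with a minimal sketch. This is not the authors' implementation. It shows only the BPSO variant, uses a nearest-centroid classifier as a stand-in because the summary does not name the five classifiers, and runs on synthetic data. All function names, the swarm parameters (inertia 0.7, acceleration 1.5, velocity clamp 4.0) and the per-gene penalty in the fitness function are assumptions made for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def ppmcc_filter(X, y, k):
    """Phase 1: keep the k genes with the largest |r| against the class label."""
    n_genes = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:k]

def centroid_accuracy(X, y, genes):
    """Phase 3 stand-in: leave-one-out accuracy of a nearest-centroid classifier."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        cents = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if rows:
                cents[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
        pred = min(cents, key=lambda c: sum((X[i][g] - m) ** 2
                                            for g, m in zip(genes, cents[c])))
        correct += pred == y[i]
    return correct / len(X)

def bpso_select(X, y, candidates, n_particles=8, n_iter=15, seed=0):
    """Phase 2: binary PSO over the filtered genes (parameters are assumptions)."""
    rng = random.Random(seed)
    d = len(candidates)

    def fitness(bits):
        genes = [g for g, b in zip(candidates, bits) if b]
        # Accuracy minus a small (assumed) penalty for each selected gene.
        return centroid_accuracy(X, y, genes) - 0.01 * len(genes) / d

    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(p) for p in pos]
    best = max(range(n_particles), key=lambda i: pfit[i])
    gbest, gfit = pbest[best][:], pfit[best]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))
                # Sigmoid transfer function maps velocity to a bit probability.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pfit[i]:
                pbest[i], pfit[i] = pos[i][:], f
                if f > gfit:
                    gbest, gfit = pos[i][:], f
    return [g for g, b in zip(candidates, gbest) if b]

def make_toy_data(n_samples=30, n_genes=50, seed=1):
    """Synthetic two-class data; only the first three genes carry signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[rng.gauss(0, 1) + (3.0 * label if j < 3 else 0.0)
          for j in range(n_genes)] for label in y]
    return X, y

X, y = make_toy_data()
candidates = ppmcc_filter(X, y, k=10)        # phase 1: filter
selected = bpso_select(X, y, candidates)     # phase 2: metaheuristic search
accuracy = centroid_accuracy(X, y, selected) # phase 3: classification
print(len(candidates), len(selected), accuracy)
```

Filtering first means the swarm searches over 10 bits per particle rather than 50, which is the kind of reduction in computational cost that the summary attributes to the PPMCC step. Replacing `bpso_select` with a GA or CS routine that has the same signature would give the other two arms of the comparison.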