A three-stage method to select informative genes for cancer classification

Bibliographic Details
Main Authors: Mohamad, Mohd. Saberi; Omatu, Sigeru; Yoshioka, Michifumi; Deris, Safaai
Format: Article
Published: 2010
Online Access: http://eprints.utm.my/id/eprint/22832/
http://www.academia.edu/1320488/A_THREE-STAGE_METHOD_TO_SELECT_INFORMATIVE_GENES_FOR_CANCER_CLASSIFICATION
Description
Summary: Microarray technology allows biologists to measure the expression levels of thousands of genes in a single experiment. A pressing issue in using microarray data is selecting, from those thousands of genes, a small subset that contributes to a disease. This selection is difficult because the data contain many irrelevant and noisy genes and because the number of samples is small compared with the number of genes (high-dimensional data). In this study, we propose a three-stage gene selection method that selects a small subset of informative genes most relevant to cancer classification. The three stages are: 1) pre-selecting genes with a filter method to produce a subset of genes; 2) optimising the gene subset with a multi-objective hybrid method to yield near-optimal gene subsets; 3) analysing the frequency with which each gene appears in the different near-optimal gene subsets to produce a small subset of informative genes. The experimental results show that the proposed method is capable of selecting a small subset that achieves better classification accuracy than related previous works and than five other methods evaluated in this work. A list of the informative genes in the final gene subsets is also presented for biological use.
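
The three stages described in the summary can be illustrated with a minimal sketch. The summary does not name the filter, the hybrid optimiser, or the classifier, so everything concrete below is an assumption made for illustration: a signal-to-noise score stands in for the filter, a stochastic hill climb over a weighted accuracy-and-size score stands in for the multi-objective hybrid method, and a nearest-centroid classifier with leave-one-out validation stands in for the evaluation. All function names, parameters, and the toy data are hypothetical, and labels are assumed to be binary (0 and 1).

```python
import random
from collections import Counter
from statistics import mean, pstdev


def make_toy_data(seed=0, n_per_class=10, n_noise=8):
    """Synthetic data (hypothetical): genes 0 and 1 separate the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(5.0 * label, 0.1) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


def filter_genes(X, y, k):
    """Stage 1 (assumed filter): keep the k genes with the highest signal-to-noise score."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = (pstdev(a) + pstdev(b)) or 1e-12  # avoid division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier restricted to `genes`."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            if rows:
                centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
        predicted = min(
            centroids,
            key=lambda c: sum((X[i][g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def search_subset(X, y, candidates, rng, iters=50, weight=0.9):
    """Stage 2 (stand-in optimiser): hill climb that trades accuracy against subset size."""
    def fitness(subset):
        accuracy = loo_accuracy(X, y, sorted(subset))
        return weight * accuracy + (1 - weight) * (1 - len(subset) / len(candidates))

    current = set(rng.sample(candidates, max(1, len(candidates) // 2)))
    best = fitness(current)
    for _ in range(iters):
        trial = current ^ {rng.choice(candidates)}  # flip one gene in or out
        if not trial:
            continue
        score = fitness(trial)
        if score >= best:
            current, best = trial, score
    return current


def select_informative_genes(X, y, k=6, runs=5, final_size=2, seed=0):
    """Stage 3: count how often each gene appears across runs and keep the most frequent."""
    candidates = filter_genes(X, y, k)
    counts = Counter()
    for r in range(runs):
        counts.update(search_subset(X, y, candidates, random.Random(seed + r)))
    return [gene for gene, _ in counts.most_common(final_size)]
```

Calling `select_informative_genes(*make_toy_data())` returns the indices of the genes that appear most often across the repeated searches. Collapsing accuracy and subset size into one weighted score is only one simple way to handle two objectives; the method in the article may treat them differently.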