An improved directed random walk framework for cancer classification using gene expression data

Bibliographic Details
Main Author: Seah, Choon Sen
Format: Thesis
Language: en
Published: 2020
Subjects: QA71-90 Instruments and machines
Online Access: http://eprints.uthm.edu.my/943/1/24p%20SEAH%20CHOON%20SEN.pdf
http://eprints.uthm.edu.my/943/3/SEAH%20CHOON%20SEN%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/943/2/SEAH%20CHOON%20SEN%20WATERMARK.pdf
http://eprints.uthm.edu.my/943/
_version_ 1833416689068277760
author Seah, Choon Sen
building UTHM Library
collection Institutional Repository
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
continent Asia
country Malaysia
description Early cancer diagnosis remains challenging because it requires expertise from several fields. Deoxyribonucleic acid (DNA) microarray analysis is one of the modern cancer diagnosis techniques that scientists use to measure changes in gene expression levels. From a computing perspective, algorithms have been developed to ease the diagnosis process, but they are not yet reliable. Numerous cancer studies have combined different machine learning techniques to improve the accuracy of cancer classification. This study aims to improve the accuracy of cancer classification by introducing an improved directed random walk (DRW) framework. The improved framework is proposed to identify risk pathways while correctly predicting significant genes. It is named significant directed walk (SDW) because of its ability to identify genes that are significant for cancer. Six gene expression datasets are used to evaluate the effectiveness of the sub-algorithms, directed graph and classifier in SDW in terms of cancer prediction and cancer classification. The sub-algorithms of SDW are divided into a data pre-processing phase, specific tuning parameter selection, weight as an additional variable, and exclusion of unwanted adjacency matrices. In addition, four directed graphs are incorporated into SDW to study their usability, and the best of the four is chosen as part of the SDW structure. This directed graph combines the KEGG pathway and the PPI network and is named the walker network. The experimental results show that the combination of SDW with the walker network and linear regression performs best. SDW achieves an average accuracy of 95.03% across all cancer datasets, which is 8.97% higher than conventional DRW. This study provides a foundation for further research on early cancer diagnosis with machine learning techniques. These findings are expected to improve early diagnosis methods for cancer classification.
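To make the directed random walk idea in the description concrete, the following is a minimal sketch of a generic random walk with restart on a directed gene graph. It is not the SDW method from the thesis. The function name `directed_random_walk`, the toy edge list, the initial weights, and the restart probability of 0.7 are all illustrative assumptions that do not come from this record.

```python
from typing import Dict, List, Tuple


def directed_random_walk(
    edges: List[Tuple[str, str]],
    initial: Dict[str, float],
    restart: float = 0.7,  # assumed value, not taken from the record
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Generic random walk with restart on a directed graph.

    Iterates w <- (1 - restart) * M^T w + restart * w0, where M is the
    row-normalized adjacency matrix and w0 the normalized initial weights.
    """
    nodes = sorted({n for edge in edges for n in edge} | set(initial))
    out_edges: Dict[str, List[str]] = {n: [] for n in nodes}
    for src, dst in edges:
        out_edges[src].append(dst)

    total = sum(initial.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("initial weights must sum to a positive value")
    w0 = {n: initial.get(n, 0.0) / total for n in nodes}

    w = dict(w0)
    for _ in range(max_iter):
        # Restart term: return to the initial distribution.
        nxt = {n: restart * w0[n] for n in nodes}
        # Walk term: split each node's weight evenly over its out-edges.
        for src in nodes:
            targets = out_edges[src]
            if not targets:
                continue  # weight at nodes without out-edges is not propagated
            share = (1.0 - restart) * w[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        delta = sum(abs(nxt[n] - w[n]) for n in nodes)
        w = nxt
        if delta < tol:
            break
    return w


# Hypothetical three-gene graph: A regulates B and C, B regulates C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C")]
scores = directed_random_walk(toy_edges, {"A": 1.0})
print(scores)
```

In this toy graph, gene C receives weight both directly from A and indirectly through B, so it ends with a higher score than B. Weighting genes by their position in the directed graph is the general intuition behind random-walk gene scoring.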
format Thesis
id my.uthm.eprints-943
institution Universiti Tun Hussein Onn Malaysia
language en
publishDate 2020
record_format eprints
spelling my.uthm.eprints-943 2021-09-09T06:17:34Z http://eprints.uthm.edu.my/943/ An improved directed random walk framework for cancer classification using gene expression data Seah, Choon Sen QA71-90 Instruments and machines 2020-08 Thesis NonPeerReviewed text en http://eprints.uthm.edu.my/943/1/24p%20SEAH%20CHOON%20SEN.pdf text en http://eprints.uthm.edu.my/943/3/SEAH%20CHOON%20SEN%20COPYRIGHT%20DECLARATION.pdf text en http://eprints.uthm.edu.my/943/2/SEAH%20CHOON%20SEN%20WATERMARK.pdf Seah, Choon Sen (2020) An improved directed random walk framework for cancer classification using gene expression data. Doctoral thesis, Universiti Tun Hussein Onn Malaysia.
title An improved directed random walk framework for cancer classification using gene expression data
topic QA71-90 Instruments and machines
url http://eprints.uthm.edu.my/943/1/24p%20SEAH%20CHOON%20SEN.pdf
http://eprints.uthm.edu.my/943/3/SEAH%20CHOON%20SEN%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/943/2/SEAH%20CHOON%20SEN%20WATERMARK.pdf
http://eprints.uthm.edu.my/943/
url_provider http://eprints.uthm.edu.my/