Understanding the occurrence of metastatic breast cancer through clinical, phenotype and genotype data, and the employment of machine learning / Nadia Jalaludin
Approximately 2.3 million women were diagnosed with breast cancer (BC) in 2020 and nearly 30% of women diagnosed with early-stage breast cancer will later develop metastatic disease. Despite the development and discovery of drugs and pharmacotherapy for breast cancer, the 5-year survival rate for pe...
Main Author: | Jalaludin, Nadia |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2023 |
Subjects: | Cancer |
Online Access: | https://ir.uitm.edu.my/id/eprint/84271/1/84271.pdf https://ir.uitm.edu.my/id/eprint/84271/ |
id |
my.uitm.ir.84271 |
---|---|
record_format |
eprints |
spelling |
my.uitm.ir.84271 2024-07-25T11:18:49Z https://ir.uitm.edu.my/id/eprint/84271/ Understanding the occurrence of metastatic breast cancer through clinical, phenotype and genotype data, and the employment of machine learning / Nadia Jalaludin Jalaludin, Nadia Cancer 2023 Thesis NonPeerReviewed text en https://ir.uitm.edu.my/id/eprint/84271/1/84271.pdf Understanding the occurrence of metastatic breast cancer through clinical, phenotype and genotype data, and the employment of machine learning / Nadia Jalaludin. (2023) PhD thesis, Universiti Teknologi MARA (UiTM). <http://terminalib.uitm.edu.my/84271.pdf> |
institution |
Universiti Teknologi Mara |
building |
Tun Abdul Razak Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Mara |
content_source |
UiTM Institutional Repository |
url_provider |
http://ir.uitm.edu.my/ |
language |
English |
topic |
Cancer |
description |
Approximately 2.3 million women were diagnosed with breast cancer (BC) in 2020, and nearly 30% of women diagnosed with early-stage breast cancer will later develop metastatic disease. Despite the discovery and development of drugs and pharmacotherapy for breast cancer, the 5-year survival rate for people with metastatic breast cancer (MBC) remains low. The objectives of this study were therefore to: (a) mine and integrate clinical, phenotype and genotype data that contribute to the occurrence of MBC; (b) build a prediction model that estimates the likelihood of progression to the metastatic state of breast cancer from the factors identified in (a); and (c) validate the findings from (a) and (b) through a systematic review of randomised controlled trials of MBC. For objective (a), genotype and clinical data were mined from databases such as cBioPortal and the Genomic Data Commons (GDC) portal and analysed using principal component analysis (PCA; after feature selection) and multiple correspondence analysis (MCA) in R. The data were then subjected to pathway mapping, Gene Ontology (GO) mapping and protein-protein interaction (PPI) analysis to investigate their connection to the metastatic phenotype. Odds ratios of mutated genes, disease similarity analysis and hierarchical clustering were also carried out before all results were consolidated. For objective (b), prediction models were generated from the outcome of (a) using the Random Forest (RF) algorithm and validated by 5-fold cross-validation. The sensitivity and specificity of each model were also calculated. For objective (c), six keywords ("metastatic breast cancer", "chemotherapy", "hormonal therapy", "targeted therapy", "gene" and "progression-free survival") were searched in PubMed, Scopus, Web of Science, Cochrane and ScienceDirect to further validate the previous findings.
Taken together, the findings suggest that mRNA and genetic profiling can differentiate between breast cancer and metastatic breast cancer patients, and that more attention should be paid to the YAP1 and SP7 genes. |
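The abstract mentions odds ratios for mutated genes and the sensitivity and specificity of each prediction model, but gives no formulas. The following is a minimal Python sketch of how such quantities are commonly computed. It is not the thesis's own code (the thesis reports using R), and the function names, the 0/1 label encoding and the zero-cell correction are assumptions made only for illustration.

```python
from typing import Sequence, Tuple


def odds_ratio(mut_met: int, wt_met: int, mut_non: int, wt_non: int) -> float:
    """Odds ratio from a 2x2 table of gene status vs. metastatic status.

    mut_met / wt_met: mutated / wild-type counts among metastatic cases.
    mut_non / wt_non: mutated / wild-type counts among non-metastatic cases.
    (Hypothetical argument names; counts come from the caller.)
    """
    cells = [float(mut_met), float(wt_met), float(mut_non), float(wt_non)]
    # Assumption: apply the Haldane-Anscombe 0.5 correction only when a cell
    # is zero, so the ratio stays finite.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity and specificity for binary labels (1 = metastatic, 0 = not)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    # Sensitivity: share of true metastatic cases that were predicted metastatic.
    # Specificity: share of true non-metastatic cases predicted non-metastatic.
    return tp / (tp + fn), tn / (tn + fp)
```

In a 5-fold cross-validation setting, `sensitivity_specificity` would typically be applied to the held-out predictions of each fold and the results averaged. Any counts or labels passed to these functions are the caller's own data; none are taken from the thesis.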
format |
Thesis |
author |
Jalaludin, Nadia |
title |
Understanding the occurrence of metastatic breast cancer through clinical, phenotype and genotype data, and the employment of machine learning / Nadia Jalaludin |
publishDate |
2023 |
url |
https://ir.uitm.edu.my/id/eprint/84271/1/84271.pdf https://ir.uitm.edu.my/id/eprint/84271/ |