Interpretable deep learning radiomics with synthetic data augmentation for breast cancer diagnosis / Pang Ting

Bibliographic Details
Main Author: Pang, Ting
Format: Thesis
Published: 2022
Subjects:
Online Access: http://studentsrepo.um.edu.my/14399/1/Pang_Ting.pdf
http://studentsrepo.um.edu.my/14399/2/Pang_Ting.pdf
http://studentsrepo.um.edu.my/14399/
Description
Summary: Breast cancer is one of the most frequent cancers among women. The capability of Deep Learning Radiomics (DLR) to extract high-level medical imaging features has promoted the use of Computer-aided Diagnosis (CAD) for breast cancer. However, current DLR for breast cancer diagnosis faces several problems, such as (i) limited datasets, (ii) overly simple diagnostic classification (i.e. binary classification), and (iii) deep learning architectures that are difficult to interpret. To address these issues, the main objective of this research is to establish an interpretable DLR framework with synthetic data augmentation for better clinical application of CAD in breast cancer. First, this thesis introduces a data augmentation architecture using the Generative Adversarial Network (GAN) to address the issue of limited datasets: the GAN model synthesizes data in a semi-supervised manner, thereby alleviating the laborious task of manually labelling medical images. Second, a deep learning network based on a multi-category classifier is proposed to classify breast cancer according to the Breast Imaging Reporting and Data System (BI-RADS). This network employs a Convolutional Neural Network (CNN) with transfer learning, trained with the generated synthetic data, to extract visual features as radiomics for multi-category classification. Finally, this thesis presents a novel reporting model with text attention (RRTA-Net) to achieve explainable DLR in breast cancer diagnosis: it maps the visual features extracted by the CNN to the textual features extracted by a Recurrent Neural Network (RNN) to generate a diagnostic report. Extensive experiments were conducted on breast ultrasound masses and mammographic calcifications, two of the most important imaging characteristics of breast cancer. First, the results show that the data augmentation model can generate high-quality and interpretable breast ultrasound mass and mammographic calcification images; experienced radiologists from the Universiti Malaya Medical Center (UMMC) judged about 80% of the generated data to be correct. Second, the proposed classification model classified each BI-RADS category with above 95% probability by capturing the standardized descriptors of breast ultrasound masses (shape and margin) and mammographic calcifications (morphology and distribution), thereby improving diagnostic performance over hand-crafted radiomics. In addition, the transfer learning and synthetic data augmentation strategies improve breast cancer classification performance compared with other state-of-the-art deep learning approaches. Finally, the report generation model, with 87.3% average precision, has shown the capability to reduce the labor of writing diagnostic reports and to semantically improve the interpretability of DLR, while also improving the readability of the generated breast cancer reports.
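
The record does not include implementation details, so the following is a minimal, illustrative sketch (in PyTorch) of one common way to realize the GAN-based semi-supervised data augmentation described above: a DCGAN-style generator paired with a discriminator that predicts K real classes plus one "synthetic" class, so that labelled, unlabelled, and generated images can all contribute to training. The 64x64 grayscale resolution, layer sizes, and the (K+1)-class formulation are assumptions for illustration, not details taken from the thesis.

import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent vector to a 64x64 grayscale image (illustrative sizes, not from the thesis)."""
    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),  # 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),         # 8x8
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),           # 16x16
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),            # 32x32
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),                                     # 64x64
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

class Discriminator(nn.Module):
    """Outputs K+1 logits: K real classes (e.g. BI-RADS categories, assumed) plus one 'synthetic' class."""
    def __init__(self, num_real_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2),    # 32x32
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),   # 16x16
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),  # 8x8
            nn.Conv2d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2), # 4x4
            nn.Flatten(),
        )
        self.classifier = nn.Linear(256 * 4 * 4, num_real_classes + 1)

    def forward(self, x):
        return self.classifier(self.features(x))

if __name__ == "__main__":
    g, d = Generator(), Discriminator()
    fake = g(torch.randn(8, 100))   # 8 synthetic 64x64 images
    logits = d(fake)                # (8, K+1) class scores
    print(fake.shape, logits.shape)

In such a semi-supervised setup, labelled real images are typically trained against their true class, generated images against the extra (K+1)-th class, and unlabelled real images against "any real class"; the thesis may use a different formulation.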
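
Likewise, a hedged sketch of CNN transfer learning for multi-category BI-RADS classification: an ImageNet-pretrained backbone whose final layer is replaced by a BI-RADS classifier head and fine-tuned on the real plus synthetic breast images. The choice of ResNet-50, the assumed four BI-RADS categories, and the freezing strategy are illustrative assumptions; this record does not state which backbone or categories the thesis uses.

import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

def build_birads_classifier(num_birads_categories=4, freeze_backbone=True):
    """Illustrative transfer-learning setup: pretrained ResNet-50 with a new BI-RADS head."""
    model = resnet50(weights=ResNet50_Weights.DEFAULT)  # ImageNet-pretrained visual features
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False                     # fine-tune only the new head at first
    model.fc = nn.Linear(model.fc.in_features, num_birads_categories)
    return model

if __name__ == "__main__":
    model = build_birads_classifier()
    x = torch.randn(2, 3, 224, 224)          # grayscale images replicated to 3 channels
    probs = torch.softmax(model(x), dim=1)   # per-image BI-RADS category probabilities
    print(probs.shape)                       # torch.Size([2, 4])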
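
Finally, an illustrative sketch of how visual CNN features can be mapped to textual features in an RNN decoder with attention to generate report sentences, in the spirit of the RRTA-Net description above. The code below is a generic attend-and-tell style decoder; the actual RRTA-Net attention mechanism, layer sizes, and vocabulary handling are not given in this record, so every name and dimension here is an assumption.

import torch
import torch.nn as nn

class AttentionReportDecoder(nn.Module):
    """Generates report tokens while attending over spatial CNN features (illustrative only)."""
    def __init__(self, vocab_size, feat_dim=2048, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.attn = nn.Linear(feat_dim + hidden_dim, 1)   # additive-style attention score
        self.lstm = nn.LSTMCell(embed_dim + feat_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feats, tokens):
        # feats: (B, L, feat_dim) spatial CNN features; tokens: (B, T) report word ids
        B, L, _ = feats.shape
        h = feats.new_zeros(B, self.lstm.hidden_size)
        c = feats.new_zeros(B, self.lstm.hidden_size)
        logits = []
        for t in range(tokens.size(1)):
            # attention weights over the L spatial locations, conditioned on the decoder state
            scores = self.attn(torch.cat([feats, h.unsqueeze(1).expand(B, L, -1)], dim=2))
            alpha = torch.softmax(scores, dim=1)           # (B, L, 1)
            context = (alpha * feats).sum(dim=1)           # (B, feat_dim) attended visual feature
            h, c = self.lstm(torch.cat([self.embed(tokens[:, t]), context], dim=1), (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                  # (B, T, vocab_size)

if __name__ == "__main__":
    decoder = AttentionReportDecoder(vocab_size=1000)
    feats = torch.randn(2, 49, 2048)            # e.g. a 7x7 CNN feature map, flattened
    tokens = torch.randint(0, 1000, (2, 12))    # teacher-forced report tokens
    print(decoder(feats, tokens).shape)         # torch.Size([2, 12, 1000])

The attention weights over image regions are what make such a decoder partially interpretable: for each generated word they indicate which part of the lesion the model attended to, which is consistent with the explainability goal stated in the abstract.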