Retinal Layer and Fluid Segmentation with Transformer Based Architecture

Retinal layer and fluid segmentation is a critical task in assisting doctors to diagnose retinal diseases. Manual segmentation by experts provides the highest accuracy, but it is time-consuming and inconsistent across experts. Deep learning algorithms (e.g., Convolutional Neural Net...

Bibliographic Details
Main Authors: Wang, Y.K.; Kai Wen, K.; Lu, C.-K.; Lin, C.-H.
Format: Conference or Workshop Item
Published: 2023
Online Access: http://scholars.utp.edu.my/id/eprint/38035/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174903554&doi=10.1109%2fICCE-Taiwan58799.2023.10226946&partnerID=40&md5=d29ca45159bc5c695c523d0eaf6b05b0
id oai:scholars.utp.edu.my:38035
record_format eprints
doi 10.1109/ICCE-Taiwan58799.2023.10226946
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Retinal layer and fluid segmentation is a critical task in assisting doctors to diagnose retinal diseases. Manual segmentation by experts provides the highest accuracy, but it is time-consuming and inconsistent across experts. Deep learning algorithms (e.g., Convolutional Neural Networks (CNNs)) have provided a faster way to perform segmentation through a computer-aided diagnosis system. Nevertheless, CNNs have limitations, such as a limited receptive field and loss of detail. In this project, we propose a transformer-based architecture to segment the retinal layers and fluid from retinal images. The architecture is based on the Vision Transformer (ViT) and modified to improve performance. The model was trained on one set of retinal images and evaluated on a separate test set. The transformer-based architecture demonstrated a 0.01 improvement in average Dice coefficient over the U-Net architecture for fluid and layer segmentation. It is also better suited for deployment in commercial portable Optical Coherence Tomography (OCT) devices because of its significantly faster inference: its inference speed is up to 4 times higher than that of the CNN-family models. This makes it well suited to resource-constrained environments. © 2023 IEEE.
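The abstract reports its accuracy result as an average Dice coefficient, but this record does not say how the authors computed it. As a minimal sketch, assuming the standard definition Dice = 2|A ∩ B| / (|A| + |B|) applied per class and then averaged without weighting, the metric might look like the following. The function names, the toy label values, and the convention for a class that is absent from both maps are all illustrative assumptions, not details from the paper.

```python
def dice_coefficient(pred, target, label):
    """Dice = 2|A ∩ B| / (|A| + |B|) for one class over flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_mask = [p == label for p in pred]
    target_mask = [t == label for t in target]
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    total = sum(pred_mask) + sum(target_mask)
    # Assumed convention: a class absent from both maps counts as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Unweighted average of the per-class Dice scores."""
    return sum(dice_coefficient(pred, target, c) for c in labels) / len(labels)


# Toy flattened label maps (hypothetical): 0 = background, 1 = a layer, 2 = fluid.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
per_class = {c: dice_coefficient(pred, target, c) for c in (0, 1, 2)}
average = mean_dice(pred, target, (0, 1, 2))
```

On a scale from 0 to 1 like this one, a 0.01 difference in the average corresponds to one percentage point of overlap, which is the size of the gain the abstract reports over U-Net.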