Retinal Layer and Fluid Segmentation with Transformer Based Architecture
Retinal layer and fluid segmentation is a critical task in assisting doctors to diagnose retinal diseases. Manual segmentation by experts provides the highest accuracy, but it is time-consuming and inconsistent across different experts. Deep learning algorithms (e.g. Convolutional Neural Net...
Main Authors: | Wang, Y.K., Kai Wen, K., Lu, C.-K., Lin, C.-H. |
---|---|
Format: | Conference or Workshop Item |
Published: | 2023 |
Online Access: | http://scholars.utp.edu.my/id/eprint/38035/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174903554&doi=10.1109%2fICCE-Taiwan58799.2023.10226946&partnerID=40&md5=d29ca45159bc5c695c523d0eaf6b05b0 |
id |
oai:scholars.utp.edu.my:38035 |
---|---|
record_format |
eprints |
spelling |
oai:scholars.utp.edu.my:38035 2023-12-11T02:54:09Z http://scholars.utp.edu.my/id/eprint/38035/ Retinal Layer and Fluid Segmentation with Transformer Based Architecture Wang, Y.K.; Kai Wen, K.; Lu, C.-K.; Lin, C.-H. 2023 Conference or Workshop Item NonPeerReviewed Wang, Y.K. and Kai Wen, K. and Lu, C.-K. and Lin, C.-H. (2023) Retinal Layer and Fluid Segmentation with Transformer Based Architecture. In: UNSPECIFIED.
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174903554&doi=10.1109%2fICCE-Taiwan58799.2023.10226946&partnerID=40&md5=d29ca45159bc5c695c523d0eaf6b05b0 10.1109/ICCE-Taiwan58799.2023.10226946 |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Institutional Repository |
url_provider |
http://eprints.utp.edu.my/ |
description |
Retinal layer and fluid segmentation is a critical task in assisting doctors to diagnose retinal diseases. Manual segmentation by experts provides the highest accuracy, but it is time-consuming and inconsistent across different experts. Deep learning algorithms (e.g. Convolutional Neural Networks (CNNs)) have provided a faster way to perform segmentation through a computer-aided diagnosis system. Nevertheless, CNNs have limitations, such as a limited receptive field and loss of detail. In this project, we propose a transformer-based architecture to segment the retinal layers and fluid from retinal images. The architecture is based on the Vision Transformer (ViT) and modified to improve performance. The transformer was trained on a set of training retinal images and evaluated on a separate set of testing retinal images. The transformer-based architecture demonstrated a 0.01 improvement in average Dice coefficient compared to the U-Net architecture for fluid and layer segmentation. The transformer-based architecture is better suited for deployment in commercial portable Optical Coherence Tomography (OCT) devices due to its significantly faster inference speed: the inference speed of the proposed model is up to 4 times higher than that of the CNN family models. This makes it well suited to environments where computational resources are limited. © 2023 IEEE. |
format |
Conference or Workshop Item |
author |
Wang, Y.K.; Kai Wen, K.; Lu, C.-K.; Lin, C.-H. |
author_sort |
Wang, Y.K. |
title |
Retinal Layer and Fluid Segmentation with Transformer Based Architecture |
title_sort |
retinal layer and fluid segmentation with transformer based architecture |
publishDate |
2023 |
url |
http://scholars.utp.edu.my/id/eprint/38035/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174903554&doi=10.1109%2fICCE-Taiwan58799.2023.10226946&partnerID=40&md5=d29ca45159bc5c695c523d0eaf6b05b0 |
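
The description above reports results as an average Dice coefficient over the layer and fluid classes. The record does not say how the authors computed it, so the following is only a minimal sketch of the standard definition, Dice = 2|P ∩ T| / (|P| + |T|), averaged over classes. The function names `dice_per_class` and `mean_dice`, the flattened integer label maps, and the choice to skip classes absent from both prediction and ground truth are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def dice_per_class(pred: Sequence[int], target: Sequence[int],
                   num_classes: int) -> Dict[int, float]:
    """Per-class Dice score for two flattened label maps (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")
    scores: Dict[int, float] = {}
    for c in range(num_classes):
        pred_count = sum(1 for v in pred if v == c)      # |P| for class c
        target_count = sum(1 for v in target if v == c)  # |T| for class c
        overlap = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        if pred_count + target_count == 0:
            continue  # assumption: a class absent from both maps is skipped
        scores[c] = 2.0 * overlap / (pred_count + target_count)
    return scores


def mean_dice(pred: Sequence[int], target: Sequence[int],
              num_classes: int) -> float:
    """Unweighted mean of the per-class Dice scores (hypothetical helper)."""
    scores = dice_per_class(pred, target, num_classes)
    if not scores:
        raise ValueError("no class is present in either label map")
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy 6-pixel example with 3 classes (e.g. background, layer, fluid).
    predicted = [0, 0, 1, 1, 2, 2]
    reference = [0, 1, 1, 1, 2, 0]
    print(dice_per_class(predicted, reference, num_classes=3))
    print(round(mean_dice(predicted, reference, num_classes=3), 4))
```

On this scale, the reported 0.01 gain corresponds to a one-percentage-point change in the class-averaged overlap score; whether the paper averages per image, per class, or both is not stated in this record.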