Deep Learning for Optical Character Recognition of Arabic Text

Optical character recognition (OCR) of non-Latin scripts such as Arabic has been investigated for several decades. However, recent advances in deep learning have attracted many researchers to improve existing OCR solutions. The Arabic writing system has its own distinctive style. Words...

Bibliographic Details
Main Author: Rahmat, Mustaien Fathur Rahim
Format: Final Year Project
Language: English
Published: IRC 2020
Subjects: Q Science (General)
Online Access: http://utpedia.utp.edu.my/21779/1/23259_Mustaien%20Fathur%20Rahim%20Rahmat.pdf
http://utpedia.utp.edu.my/21779/
id my-utp-utpedia.21779
record_format eprints
spelling my-utp-utpedia.21779 2021-09-23T23:38:19Z http://utpedia.utp.edu.my/21779/ Deep Learning for Optical Character Recognition of Arabic Text Rahmat, Mustaien Fathur Rahim Q Science (General) IRC 2020-01 Final Year Project NonPeerReviewed application/pdf en http://utpedia.utp.edu.my/21779/1/23259_Mustaien%20Fathur%20Rahim%20Rahmat.pdf Rahmat, Mustaien Fathur Rahim (2020) Deep Learning for Optical Character Recognition of Arabic Text. IRC, Universiti Teknologi PETRONAS. (Submitted)
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Electronic and Digitized Intellectual Asset
url_provider http://utpedia.utp.edu.my/
language English
topic Q Science (General)
description Optical character recognition (OCR) of non-Latin scripts such as Arabic has been investigated for several decades. However, recent advances in deep learning have attracted many researchers to improve existing OCR solutions. The Arabic writing system has its own distinctive style. Words are written from right to left. Features such as ligatures, diacritics and vowel markings are commonly included in writing. Some characters may extend and overlap other characters. This may cause difficulties when training models with conventional training methods. Therefore, the techniques chosen should be able to recognize such features. Since Arabic calligraphy exists in many styles, this study focuses only on recognizing printed and handwritten khat naskh. Using neural networks with the help of large and reliable datasets, models can be trained to achieve high accuracy and precision in recognizing the text. In this study, a hybrid neural network model will be built that addresses feature extraction and classification in order to overcome the difficulties in OCR of Arabic text.
format Final Year Project
author Rahmat, Mustaien Fathur Rahim
title Deep Learning for Optical Character Recognition of Arabic Text
publisher IRC
publishDate 2020
url http://utpedia.utp.edu.my/21779/1/23259_Mustaien%20Fathur%20Rahim%20Rahmat.pdf
http://utpedia.utp.edu.my/21779/