Deep Learning for Optical Character Recognition of Arabic Text
Optical Character Recognition (OCR) of non-Latin scripts, such as Arabic, has been investigated for several decades. However, recent advances in deep learning have drawn many researchers to improve existing OCR solutions. The Arabic writing system has its own distinctive style. Words...
Main Author: | Rahmat, Mustaien Fathur Rahim |
---|---|
Format: | Final Year Project |
Language: | English |
Published: | IRC 2020 |
Subjects: | Q Science (General) |
Online Access: | http://utpedia.utp.edu.my/21779/1/23259_Mustaien%20Fathur%20Rahim%20Rahmat.pdf http://utpedia.utp.edu.my/21779/ |
id |
my-utp-utpedia.21779 |
---|---|
record_format |
eprints |
spelling |
my-utp-utpedia.21779 2021-09-23T23:38:19Z http://utpedia.utp.edu.my/21779/ Deep Learning for Optical Character Recognition of Arabic Text Rahmat, Mustaien Fathur Rahim Q Science (General) Optical Character Recognition (OCR) of non-Latin scripts, such as Arabic, has been investigated for several decades. However, recent advances in deep learning have drawn many researchers to improve existing OCR solutions. The Arabic writing system has its own distinctive style. Words are written from right to left. Features such as ligatures, diacritics and vowel markings are commonly included in writing. Some characters may extend over and overlap other characters. This may cause difficulties when training models with conventional training methods. Therefore, the techniques chosen should be able to recognize such features. Since Arabic calligraphy exists in many styles, this study focuses only on recognizing printed and written khat naskh. With neural networks and large, reliable datasets, models can be trained to achieve high accuracy and precision in recognizing the text. In this study, a hybrid neural network model will be built, focusing on feature extraction and classification, to address the difficulties in OCR of Arabic text. IRC 2020-01 Final Year Project NonPeerReviewed application/pdf en http://utpedia.utp.edu.my/21779/1/23259_Mustaien%20Fathur%20Rahim%20Rahmat.pdf Rahmat, Mustaien Fathur Rahim (2020) Deep Learning for Optical Character Recognition of Arabic Text. IRC, Universiti Teknologi PETRONAS. (Submitted) |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Electronic and Digitized Intellectual Asset |
url_provider |
http://utpedia.utp.edu.my/ |
language |
English |
topic |
Q Science (General) |
spellingShingle |
Q Science (General) Rahmat, Mustaien Fathur Rahim Deep Learning for Optical Character Recognition of Arabic Text |
description |
Optical Character Recognition (OCR) of non-Latin scripts, such as Arabic, has
been investigated for several decades. However, recent advances in deep learning
have drawn many researchers to improve existing OCR solutions. The Arabic
writing system has its own distinctive style. Words are written from right to
left. Features such as ligatures, diacritics and vowel markings are commonly included in
writing. Some characters may extend over and overlap other characters. This
may cause difficulties when training models with conventional training methods.
Therefore, the techniques chosen should be able to recognize such features. Since Arabic
calligraphy exists in many styles, this study focuses only on recognizing printed
and written khat naskh. With neural networks and large, reliable datasets,
models can be trained to achieve high accuracy and precision in
recognizing the text. In this study, a hybrid neural network model will be built,
focusing on feature extraction and classification, to address the difficulties in
OCR of Arabic text. |
format |
Final Year Project |
author |
Rahmat, Mustaien Fathur Rahim |
author_facet |
Rahmat, Mustaien Fathur Rahim |
author_sort |
Rahmat, Mustaien Fathur Rahim |
title |
Deep Learning for Optical Character Recognition of Arabic Text |
title_short |
Deep Learning for Optical Character Recognition of Arabic Text |
title_full |
Deep Learning for Optical Character Recognition of Arabic Text |
title_fullStr |
Deep Learning for Optical Character Recognition of Arabic Text |
title_full_unstemmed |
Deep Learning for Optical Character Recognition of Arabic Text |
title_sort |
deep learning for optical character recognition of arabic text |
publisher |
IRC |
publishDate |
2020 |
url |
http://utpedia.utp.edu.my/21779/1/23259_Mustaien%20Fathur%20Rahim%20Rahmat.pdf http://utpedia.utp.edu.my/21779/ |