Quranic diacritic and character segmentation and recognition using flood fill and k-nearest neighbors algorithm

Optical character recognition (OCR) is the detection, recognition and conversion of characters in an image into text. Arabic Optical Character Recognition (AOCR) is a distinct type of OCR that processes Arabic characters. OCR is increasingly used in many applications, w...

Bibliographic Details
Main Author: Alotaibi, Faiz E A L
Format: Thesis
Language: English
Published: 2019
Subjects: Optical character recognition devices - Software; Diacritics - Data processing
Online Access: http://psasir.upm.edu.my/id/eprint/90723/1/FSKTM%202019%2059%20IR.pdf
http://psasir.upm.edu.my/id/eprint/90723/
id my.upm.eprints.90723
record_format eprints
spelling my.upm.eprints.90723 2021-09-12T13:34:01Z http://psasir.upm.edu.my/id/eprint/90723/ 2019-09 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/90723/1/FSKTM%202019%2059%20IR.pdf Alotaibi, Faiz E A L (2019) Quranic diacritic and character segmentation and recognition using flood fill and k-nearest neighbors algorithm. Doctoral thesis, Universiti Putra Malaysia.
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
topic Optical character recognition devices - Software
Diacritics - Data processing
description Optical character recognition (OCR) is the detection, recognition and conversion of characters in an image into text. Arabic Optical Character Recognition (AOCR) is a distinct type of OCR that processes Arabic characters. OCR is increasingly used in applications where a process should run automatically, without human intervention. Quranic handwritten text contains two elements: diacritics and characters. However, current Arabic handwritten OCR systems produce low accuracy, and no research has focused on Quran image recognition. Current AOCR recognizes diacritics and characters inaccurately, and research effort in the area of AOCR is insufficient. Many studies have been carried out, but Quran handwriting has not been researched as thoroughly as Arabic, Latin or Chinese handwritten systems. This research addresses these problems by improving the recognition accuracy of AOCR through new segmentation and feature extraction methods and a suitable classification method. In this thesis, new techniques, methods and algorithms are proposed to check the similarity and originality of Quranic handwriting content. Diacritic detection with a region-based algorithm achieved 89% accuracy, improved to 95% with the flood fill segmentation method. 2DMED feature extraction accuracy for diacritics was 90%, improved to 96% by applying a CNN. Character recognition based on the projection method achieved 86% accuracy, improved to 92% with flood fill. 2DMED accuracy for characters was 88%, improved to 91% by applying a CNN. For classification, KNN was used on our dataset before and after an enhancement technique based on the essential vector. Diacritic accuracy was 96.4286% after enhancement, which is better than the 87.5020% obtained before it. Character accuracy was 92.3077% after enhancement, which is better than the 86.1429% obtained with the normal KNN algorithm.
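For readers unfamiliar with the two techniques named in the title, the following is a minimal sketch of flood fill segmentation followed by k-nearest neighbors classification. It is not the thesis's implementation: the function names, the toy size-based feature, the labels and the training vectors are all assumptions made for illustration, and the 2DMED features and the essential-vector enhancement described in the abstract are not reproduced here.

```python
from collections import Counter, deque
from math import dist


def flood_fill_components(image):
    """Group foreground pixels (value 1) into 4-connected regions by flood fill.

    Returns a list of components in scan order, each a list of (row, col).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting at an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def size_feature(pixels):
    """Hypothetical toy feature: (pixel count, box height, box width)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (len(pixels), max(ys) - min(ys) + 1, max(xs) - min(xs) + 1)


def knn_predict(train, query, k=3):
    """Plain KNN: majority label among the k training vectors nearest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Tiny binary image: one isolated dot above two larger strokes.
    image = [
        [0, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [1, 1, 1, 0, 1, 1],
        [1, 0, 1, 0, 1, 1],
        [1, 1, 1, 0, 1, 1],
    ]
    # Made-up training vectors; a real system would learn these from data.
    train = [
        ((1, 1, 1), "diacritic"),
        ((2, 1, 2), "diacritic"),
        ((7, 3, 3), "character"),
        ((9, 3, 3), "character"),
        ((6, 3, 2), "character"),
    ]
    for component in flood_fill_components(image):
        feature = size_feature(component)
        print(feature, knn_predict(train, feature))
```

The sketch uses 4-connectivity and an explicit queue rather than recursion so that large regions do not exhaust the call stack; whether the thesis uses 4- or 8-connectivity is not stated in this record.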
format Thesis
author Alotaibi, Faiz E A L
title Quranic diacritic and character segmentation and recognition using flood fill and k-nearest neighbors algorithm
publishDate 2019
url http://psasir.upm.edu.my/id/eprint/90723/1/FSKTM%202019%2059%20IR.pdf
http://psasir.upm.edu.my/id/eprint/90723/