Segmentation Of Two Touching Handwritten Arabic Characters Using Overlapping Set Theory And Gradient Orientation

Segmenting images of offline Arabic handwritten documents is an active research area, but matching human vision in dividing an image into regions remains difficult, especially for degraded historical handwritten documents. These valuable degraded documents therefore attract researchers fr...

Bibliographic Details
Main Author: Ullah, Inam
Format: Thesis
Language: English
Published: 2020
Subjects:
Online Access: http://eprints.utem.edu.my/id/eprint/25386/1/Segmentation%20Of%20Two%20Touching%20Handwritten%20Arabic%20Characters%20Using%20Overlapping%20Set%20Theory%20And%20Gradient%20Orientation.pdf
http://eprints.utem.edu.my/id/eprint/25386/2/Segmentation%20Of%20Two%20Touching%20Handwritten%20Arabic%20Characters%20Using%20Overlapping%20Set%20Theory%20And%20Gradient%20Orientation.pdf
http://eprints.utem.edu.my/id/eprint/25386/
https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=119742
id my.utem.eprints.25386
record_format eprints
institution Universiti Teknikal Malaysia Melaka
building UTEM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknikal Malaysia Melaka
content_source UTEM Institutional Repository
url_provider http://eprints.utem.edu.my/
language English
topic Q Science (General)
QA Mathematics
description Segmenting images of offline Arabic handwritten documents is an active research area, but matching human vision in dividing an image into regions remains difficult, especially for degraded historical handwritten documents. These valuable degraded documents attract researchers from around the world, yet segmenting Arabic text is hindered by overlapping and touching characters. Characters overlap or touch when the standard rules of writing are not followed: two or more characters share the same space, and the touching characters are treated as one sub-word. Many current techniques for segmenting touching handwritten characters rely on connected components. These methods are easy to implement and achieve high accuracy in some cases, but they fail in many others because a manually chosen decision value is needed to determine the correct segmentation path near the junction point, which produces an unstable character boundary. They are also unstable when both touching characters contain loops or circular paths. In that case the cut point is placed incorrectly, which can lead to an incorrect dividing path along the character boundary. Selecting the path near the junction point is one of the main challenges in segmenting connected components. These methods have further drawbacks: because of variation in writing, they are usually implemented for only one layout and font type. Apart from connected-component methods, template-based segmentation is another available approach, and several studies have built templates for touching characters. Its disadvantage is that many templates must be created to cover all possible touching types. Consequently, because of variation in writing, connected-component methods remain underexplored, especially for cursive handwriting such as Arabic and Jawi.
This work has three objectives: first, to identify the junction point of a touching image; second, to formulate the direction near the junction point; and third, to segment the touching characters. The research methodology consists of three proposed stages: junction point detection, direction formulation, and segmentation. In the junction point detection stage, overlapping set theory is used to identify the segmentation point of the two touching characters. In the direction formulation stage, a gradient technique is used to formulate the correct direction near the junction point. In the segmentation stage, a contour tracing technique is used to separate the two touching characters into isolated characters. The three proposed methods were tested on the IFN/ENIT, AHDB and IAM datasets. In the experiments, the success rate was 93.3% for junction point detection, 98% for the second proposed method, and 97.27% for the proposed segmentation method. In conclusion, the proposed segmentation method outperforms existing research in terms of accuracy. The proposed methods do not use any recognizer or template to control segmentation accuracy. Finally, the proposed segmentation method was also compared with state-of-the-art methods and achieved a better accuracy rate for both degraded and non-degraded document images; the accuracy of the overall process is about 97.45% for the AHDB dataset and 85.03% for the IAM dataset.
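The following is a minimal illustrative sketch of two ideas the description names, not the thesis's actual algorithm, whose details this record does not give. It assumes (hypothetically) that each stroke is already available as a set of (row, col) pixels, so that shared pixels can be found by set intersection, and it uses a standard Sobel operator as a stand-in for the unspecified gradient technique. The function names `overlap_points` and `gradient_orientation` are invented for this example.

```python
import math


def overlap_points(stroke_a, stroke_b):
    """Return the pixels shared by two strokes.

    Assumption: each stroke is a set of (row, col) tuples. The shared
    pixels are only candidate junction points in this toy setting.
    """
    return stroke_a & stroke_b


def gradient_orientation(img, r, c):
    """Sobel gradient orientation, in degrees, at interior pixel (r, c).

    img is a list of rows of numeric intensities. Rows grow downward,
    so 90 degrees means intensity increases toward the bottom.
    """
    # Horizontal derivative: right column minus left column.
    gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
    # Vertical derivative: bottom row minus top row.
    gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
    return math.degrees(math.atan2(gy, gx))
```

For a horizontal stroke and a vertical stroke that cross at one pixel, `overlap_points` returns just that pixel. In real touching characters the two pixel sets are not known in advance, which is the problem the thesis addresses.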
format Thesis
author Ullah, Inam
title Segmentation Of Two Touching Handwritten Arabic Characters Using Overlapping Set Theory And Gradient Orientation
publishDate 2020
spelling my.utem.eprints.25386 2021-11-17T08:46:55Z 2020 Thesis NonPeerReviewed text en Ullah, Inam (2020) Segmentation Of Two Touching Handwritten Arabic Characters Using Overlapping Set Theory And Gradient Orientation. Doctoral thesis, Universiti Teknikal Malaysia Melaka.