A new swarm-based framework for handwritten authorship identification in forensic document analysis

Bibliographic Details
Main Authors: Pratama, Satrya Fajri, Draman @ Muda, Azah Kamilah, Choo, Yun Huoy, Draman @ Muda, Noor Azilah
Format: Book Chapter
Language: en
Published: Springer International Publishing 2014
Subjects:
Online Access: http://eprints.utem.edu.my/id/eprint/20003/2/CIDF_Chp2.pdf
http://eprints.utem.edu.my/id/eprint/20003/
_version_ 1832717410669428736
author Pratama, Satrya Fajri
Draman @ Muda, Azah Kamilah
Choo, Yun Huoy
Draman @ Muda, Noor Azilah
author_facet Pratama, Satrya Fajri
Draman @ Muda, Azah Kamilah
Choo, Yun Huoy
Draman @ Muda, Noor Azilah
author_sort Pratama, Satrya Fajri
building UTEM Library
collection Institutional Repository
content_provider Universiti Teknikal Malaysia Melaka
content_source UTEM Institutional Repository
continent Asia
country Malaysia
description Feature selection has long been a focus of research because of the heavy use of high-dimensional data. Originally, the purpose of feature selection was to select the smallest subset of features whose class distribution is as close as possible to the original class distribution. In this chapter, however, feature selection is used to obtain the unique individual significant features, which have proven very important in handwriting analysis in the Writer Identification domain. Writer Identification is an area of pattern recognition that has attracted the attention of many researchers because of the extensive exchange of paper documents. Its principal applications are in forensics and biometrics, where writing style can be used as a biometric feature for authenticating the identity of a writer. Handwriting style is personal to each individual and is implicitly represented by unique individual significant features hidden in that individual’s handwriting. These unique features can be used to identify the author of a handwritten document. Feature selection, although an important machine learning task, is often disregarded in the Writer Identification domain, and only a handful of studies have implemented a feature selection phase. The key concern in Writer Identification is acquiring features that reflect the author of the handwriting. It is therefore an open question whether the extracted features are optimal or near-optimal for identifying the author. Consequently, extraction and selection of the unique individual significant features are very important both for identifying the writer and for improving classification accuracy. This relates to the invarianceness of authorship, where the invarianceness between features for intra-class (same writer) is lower than for inter-class (different writer).
Much research has been done to develop algorithms for extracting good features that reflect authorship with good performance. This chapter instead focuses on identifying the unique individual significant features of word shape by applying a feature selection method prior to the identification task. Feature selection is explored in order to find the most distinctive of these features, that is, the features unique to an individual’s writing. The chapter integrates the Swarm Optimized and Computationally Inexpensive Floating Selection (SOCIFS) feature selection technique into the proposed hybrid of the Writer Identification framework and a feature selection framework, namely Cheap Computational Cost Class Specific Swarm Sequential Selection (C4S4). Experiments were conducted on a dataset from the IAM Database to prove the validity and feasibility of the proposed framework by comparing it with the existing Writer Identification framework and with various feature selection techniques and frameworks, and they yield satisfactory results. The results show that the proposed framework produces the best result, with 99.35% classification accuracy. These promising outcomes open the way to future exploration in the Writer Identification domain specifically and in other domains generally.
format Book Chapter
id my.utem.eprints-20003
institution Universiti Teknikal Malaysia Melaka
language en
publishDate 2014
publisher Springer International Publishing
record_format eprints
spelling my.utem.eprints-20003 2023-08-17T11:39:02Z http://eprints.utem.edu.my/id/eprint/20003/ A new swarm-based framework for handwritten authorship identification in forensic document analysis Pratama, Satrya Fajri Draman @ Muda, Azah Kamilah Choo, Yun Huoy Draman @ Muda, Noor Azilah T Technology (General) TA Engineering (General). Civil engineering (General)
Springer International Publishing 2014 Book Chapter PeerReviewed text en http://eprints.utem.edu.my/id/eprint/20003/2/CIDF_Chp2.pdf Pratama, Satrya Fajri and Draman @ Muda, Azah Kamilah and Choo, Yun Huoy and Draman @ Muda, Noor Azilah (2014) A new swarm-based framework for handwritten authorship identification in forensic document analysis. In: Computational Intelligence In Digital Forensics : Forensic Investigation And Applications. Studies in Computational Intelligence, 555 . Springer International Publishing, Switzerland, pp. 385-411. ISBN 9783319058849
title A new swarm-based framework for handwritten authorship identification in forensic document analysis
topic T Technology (General)
TA Engineering (General). Civil engineering (General)
url http://eprints.utem.edu.my/id/eprint/20003/2/CIDF_Chp2.pdf
http://eprints.utem.edu.my/id/eprint/20003/
url_provider http://eprints.utem.edu.my/