Stylometric authorship balanced attribution prediction method
Stylometric authorship attribution is an important approach in text mining that has received growing attention because of the subtlety of the task. It analyzes texts such as novels and plays by well-known authors and tries to measure writing style by choosi...
Main Author: | Mustafa, Tareef Kamil |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2011 |
Online Access: | http://psasir.upm.edu.my/id/eprint/27377/1/FSKTM%202011%2016R.pdf http://psasir.upm.edu.my/id/eprint/27377/ |
id |
my.upm.eprints.27377 |
---|---|
record_format |
eprints |
spelling |
my.upm.eprints.27377 2014-02-27T00:53:54Z http://psasir.upm.edu.my/id/eprint/27377/ Stylometric authorship balanced attribution prediction method Mustafa, Tareef Kamil 2011-08 Thesis NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/27377/1/FSKTM%202011%2016R.pdf Mustafa, Tareef Kamil (2011) Stylometric authorship balanced attribution prediction method. PhD thesis, Universiti Putra Malaysia. English |
institution |
Universiti Putra Malaysia |
building |
UPM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Putra Malaysia |
content_source |
UPM Institutional Repository |
url_provider |
http://psasir.upm.edu.my/ |
language |
English |
description |
Stylometric authorship attribution is an important approach in text mining that has received growing attention because of the subtlety of the task. It analyzes texts such as novels and plays by well-known authors and tries to measure writing style by selecting attributes that belong uniquely to one author, on the assumption that each author has a distinctive way of writing that no other author shares. Two major problems hold back progress in this field: the accuracy of predictions and the dependence on human expert judgment. Existing techniques rely either on statistical attributes such as frequent words or on more sophisticated semantic techniques such as lexicons. Nonetheless, their results are still not sufficiently accurate. In this research, we propose a new stylometric method, Stylometric Authorship Balanced Attribution (SABA), that addresses these problems by delivering higher prediction accuracy while remaining independent of human judgment; that is, the method does not rely on domain experts. The new method is implemented by merging three methods: the computational approach, the Winnow algorithm, and the Burrows-delta method. It also uses a set of attributes that are more effective than those of the frequent-words method, namely the word pair and the word trio, both of which are multi-word attributes. These attributes provide more evidence of an author's writing style for authorship recognition and prediction, which results in higher stylometric prediction accuracy. The proposed SABA method is compared against three other methods: the computational approach, the Winnow algorithm, and the Burrows-delta method. The results show that the proposed method produces superior prediction accuracy and even gives a completely correct result in the final stage of the experiment. |
format |
Thesis |
author |
Mustafa, Tareef Kamil |
title |
Stylometric authorship balanced attribution prediction method |
publishDate |
2011 |
url |
http://psasir.upm.edu.my/id/eprint/27377/1/FSKTM%202011%2016R.pdf http://psasir.upm.edu.my/id/eprint/27377/ |
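The description names word pairs and word trios as attributes and lists the Burrows-delta method as one of the three merged components. As a rough illustration only, the sketch below applies a Burrows-style Delta (mean absolute difference of z-scored relative frequencies) to word n-grams. It is not the SABA method itself: the record does not say how the three components are combined, and the function names, the `known` input shape, the whitespace tokenization, and the `top` cutoff are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def ngrams(text, n):
    """Return the word n-grams (as tuples) of a lower-cased text."""
    words = text.lower().split()  # naive whitespace tokenization (assumption)
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def rel_freq(text, n):
    """Relative frequency of each word n-gram in one text."""
    grams = ngrams(text, n)
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()} if total else {}


def delta_scores(known, unknown, n=2, top=50):
    """Burrows-style Delta of `unknown` against each author in `known`.

    known: dict of author name -> sample text (hypothetical input shape).
    n=2 gives word pairs, n=3 gives word trios.
    Returns a dict of author -> Delta; lower means stylistically closer.
    """
    profiles = {a: rel_freq(t, n) for a, t in known.items()}

    # Feature set: the most frequent n-grams pooled over the known samples.
    pooled = Counter()
    for t in known.values():
        pooled.update(ngrams(t, n))
    features = [g for g, _ in pooled.most_common(top)]

    # Mean and standard deviation of each feature across the known authors.
    stats = {}
    for g in features:
        values = [p.get(g, 0.0) for p in profiles.values()]
        stats[g] = (mean(values), pstdev(values))

    target = rel_freq(unknown, n)
    scores = {}
    for author, profile in profiles.items():
        diffs = []
        for g in features:
            mu, sigma = stats[g]
            if sigma == 0:
                continue  # feature does not vary between authors
            z_author = (profile.get(g, 0.0) - mu) / sigma
            z_target = (target.get(g, 0.0) - mu) / sigma
            diffs.append(abs(z_author - z_target))
        scores[author] = mean(diffs) if diffs else float("inf")
    return scores
```

Calling `delta_scores(known, unknown, n=2)` ranks candidate authors by word-pair similarity, and `n=3` does the same for word trios; the author with the smallest score is the predicted one. Real use would need much longer samples and a proper tokenizer than this sketch assumes.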