Staff View: Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger

Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger

This paper is concerned with the application of technologies developed in other disciplines, in particular with the use of text processing techniques to investigate the problems of second language learner writing in English. The question addressed is whether learner texts produced by L1-Malay lea...

Bibliographic Details
Main Authors:	Roslina Abdul Aziz, Zuraidah Mohd Don
Format:	Article
Language:	English
Published:	Penerbit Universiti Kebangsaan Malaysia 2019
Online Access:	http://journalarticle.ukm.my/14094/1/30438-108026-1-PB.pdf http://journalarticle.ukm.my/14094/ http://ejournal.ukm.my/gema/issue/view/1212

id	my-ukm.journal.14094
record_format	eprints
spelling	my-ukm.journal.140942020-01-31T22:50:17Z http://journalarticle.ukm.my/14094/ Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger Roslina Abdul Aziz, Zuraidah Mohd Don, This paper is concerned with the application of technologies developed in other disciplines, in particular with the use of text processing techniques to investigate the problems of second language learner writing in English. The question addressed is whether learner texts produced by L1-Malay learners at the University of Malaya can usefully be processed using the Constituent Likelihood Automatic Word-tagging System (CLAWS), a part-of-speech (POS) tagger developed for and trained on texts written by native speakers of the language. The study adopts the procedure employed by van Rooy and Schäfer (2002). CLAWS was used to automatically POS tag a subset of the Malaysian Corpus of Learner English (MACLE), and the texts were then analyzed for tagging accuracy. CLAWS was found to perform less well on learner text than on native speaker texts, but still with an accuracy rate of over 90%. The sources of error are traced, and spelling errors are found to be the most common source. Closer inspection indicates that successful tagging is likely to lead to problems downstream in later processing, which suggests that to optimize performance, some modifications will be required in tagger design. Penerbit Universiti Kebangsaan Malaysia 2019-08 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/14094/1/30438-108026-1-PB.pdf Roslina Abdul Aziz, and Zuraidah Mohd Don, (2019) Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger. GEMA: Online Journal of Language Studies, 19 (3). pp. 140-155. ISSN 1675-8021 http://ejournal.ukm.my/gema/issue/view/1212
institution	Universiti Kebangsaan Malaysia
building	Tun Sri Lanang Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Kebangsaan Malaysia
content_source	UKM Journal Article Repository
url_provider	http://journalarticle.ukm.my/
language	English
description	This paper is concerned with the application of technologies developed in other disciplines, in particular with the use of text processing techniques to investigate the problems of second language learner writing in English. The question addressed is whether learner texts produced by L1-Malay learners at the University of Malaya can usefully be processed using the Constituent Likelihood Automatic Word-tagging System (CLAWS), a part-of-speech (POS) tagger developed for and trained on texts written by native speakers of the language. The study adopts the procedure employed by van Rooy and Schäfer (2002). CLAWS was used to automatically POS tag a subset of the Malaysian Corpus of Learner English (MACLE), and the texts were then analyzed for tagging accuracy. CLAWS was found to perform less well on learner text than on native speaker texts, but still with an accuracy rate of over 90%. The sources of error are traced, and spelling errors are found to be the most common source. Closer inspection indicates that successful tagging is likely to lead to problems downstream in later processing, which suggests that to optimize performance, some modifications will be required in tagger design.
format	Article
author	Roslina Abdul Aziz, Zuraidah Mohd Don,
spellingShingle	Roslina Abdul Aziz, Zuraidah Mohd Don, Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger
author_facet	Roslina Abdul Aziz, Zuraidah Mohd Don,
author_sort	Roslina Abdul Aziz,
title	Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger
title_short	Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger
title_full	Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger
title_fullStr	Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger
title_full_unstemmed	Tagging L2 writing: learner errors and the performance of an automated part-of-speech tagger
title_sort	tagging l2 writing: learner errors and the performance of an automated part-of-speech tagger
publisher	Penerbit Universiti Kebangsaan Malaysia
publishDate	2019
url	http://journalarticle.ukm.my/14094/1/30438-108026-1-PB.pdf http://journalarticle.ukm.my/14094/ http://ejournal.ukm.my/gema/issue/view/1212
_version_	1657565467026391040
score	13.211869

Similar Items