Can ChatGPT translate like a pro? A pilot benchmarking study of English–Malay translation quality

Bibliographic Details
Main Authors:	M. Zain Sulaiman, Intan Safinaz Zainudin, Haslina Haroon
Format:	Article
Language:	en
Published:	Penerbit Universiti Kebangsaan Malaysia 2025
Online Access:	http://journalarticle.ukm.my/26639/1/TDB%2017.pdf http://journalarticle.ukm.my/26639/ https://ejournal.ukm.my/3l/issue/view/1856

Description
Summary:	Artificial intelligence (AI) tools such as ChatGPT have significantly advanced machine translation, yet their performance in low-resource language pairs, particularly English–Malay, lags behind. While existing studies have compared AI and human translation quality, most have relied on academic assessment frameworks, leaving a gap in evaluating AI translation through professional certification standards. From a professional standpoint, translation competence is most reliably assessed through formal certification frameworks that combine analytic rubrics, performance descriptors, and expert judgment. To determine whether AI systems can perform at a professional standard, they must be evaluated using the same criteria applied to human translators. This pilot study addresses that gap by benchmarking ChatGPT’s English–Malay translation performance against a novice and a professional translator using the National Accreditation Authority for Translators and Interpreters (NAATI) Certified Translator examination framework. Thirteen professional raters from the Malaysian Translators Association assessed the translations based on Meaning Transfer, Textual Norms and Conventions, and Language Proficiency. Findings revealed a clear performance hierarchy—Professional Translator > ChatGPT > Novice Translator—indicating that while ChatGPT achieved near-professional competence in fluency and meaning accuracy, it remained limited in idiomatic precision and cultural adaptation. The study highlights ChatGPT’s potential as an assistive tool for translation and training, while reaffirming the need for human oversight. It also validates the NAATI framework as a robust benchmark for evaluating AI translation quality. As AI models continue to evolve, future research involving larger translator samples and a wider range of language pairs is essential to evaluate ongoing progress and ensure the responsible integration of AI translation into professional practice.
