Exploring the impact of integrated AWE and generative AI feedback on Chinese EFL undergraduates’ higher-order thinking in argumentative writing

Bibliographic Details
Main Authors: Hao, Hongxia, Razali, Abu Bakar, Zuo, Ruijia
Format: Article
Language: English
Published: SAGE Publications Inc. 2026
Subjects:
Online Access:http://psasir.upm.edu.my/id/eprint/122707/1/122707.pdf
http://psasir.upm.edu.my/id/eprint/122707/
https://journals.sagepub.com/doi/10.1177/21582440251413884
Description
Summary: Although automated writing evaluation (AWE) and artificial intelligence (AI) tools are widely used in EFL/ESL writing instruction, there is little empirical research on the effect of integrating AWE and AI feedback on students’ higher-order thinking (HOT). This study therefore explores the impact of integrated AWE and AI feedback on Chinese EFL undergraduates’ HOT in argumentative writing, drawing on Revised Bloom’s Taxonomy and Cognitive Feedback Theory. Pre- and post-tests and semi-structured interviews were conducted with 64 third-year English-major students at a Chinese public university over 16 weeks. The experimental group (EG, n = 32) received both AWE (Pigai) and AI (ChatGPT) feedback, while the control group (CG, n = 32) received only AWE (Pigai) feedback. Quantitative results showed that EG students improved significantly in HOT (analysis, evaluation, and creation; p < .001) with large effect sizes (d > 0.80), whereas CG students showed smaller improvements (d > 0.15). ANOVA confirmed that analysis had the largest effect size (p < .001, η2 = .862), followed by evaluation (p < .001, η2 = .818) and creation (p < .001, η2 = .812). Qualitative results showed that the AWE and AI tools were complementary: AWE helped students correct surface-level language errors, while AI improved their HOT in analysis, evaluation, and creation. Students were thus able to attend to both language and HOT and to optimize their revision strategies. However, they also had difficulty understanding the feedback and tended to over-rely on it.