Intelligent assessment of English teachers’ classroom language interaction and emotional behaviour based on artificial intelligence

Bibliographic Details
Main Authors: Liu, Zhiming, Sulaiman, Tajularipin, Che Nawi, Nur Raihan
Format: Article
Language: English
Published: Nature Research 2025
Online Access:http://psasir.upm.edu.my/id/eprint/122629/1/122629.pdf
http://psasir.upm.edu.my/id/eprint/122629/
https://www.nature.com/articles/s41598-025-20034-5
Description
Summary: This study focuses on English teachers’ classroom language expression, emotional changes, and teacher-student interaction behaviors, and proposes an intelligent evaluation model based on multimodal representation learning: the Bimodal Modality Representation and Fusion Network (BMRN). The model integrates text, speech, and visual information, introduces a shared-private feature decoupling structure, and combines four types of losses (similarity, difference, cycle consistency, and reconstruction) to achieve deep cross-modal semantic fusion and accurate modeling. Experiments are conducted on the public CMU Multimodal Opinion-level Sentiment Intensity (CMU-MOSI) and CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) datasets, as well as on self-built teacher teaching data. Results show that BMRN outperforms existing advanced models on both classification and regression metrics. Specifically, on the CMU-MOSEI dataset the mean absolute error is reduced to 0.547 and the Pearson correlation coefficient reaches 0.781; on the self-built dataset, four-class classification accuracy reaches 66.1%. Ablation experiments indicate that the text modality and the cycle-consistency loss contribute most to model performance. Robustness tests verify the effectiveness of the cross-modal reconstruction mechanism when modalities are missing. The model improves semantic understanding and emotion recognition for multimodal teaching behaviors, with strong generalization and practical application potential, and provides technical support and a data basis for teacher behavior evaluation and professional development in smart classrooms.
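The shared-private decoupling and four-loss design described in the abstract can be sketched numerically. The following is a minimal, illustrative sketch only: the abstract does not specify the loss forms, so the MSE similarity term, orthogonality-style difference penalty, linear "encoders", and cycle term below are all assumptions, not the authors' actual BMRN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Toy linear 'encoder' standing in for a learned network (assumption)."""
    return x @ W

# Toy features for two modalities (e.g., text and speech): batch of 4, dim 8.
x_text = rng.normal(size=(4, 8))
x_audio = rng.normal(size=(4, 8))

# Shared and private projections (learned parameters in the real model).
W_shared = rng.normal(size=(8, 8))
W_priv_t = rng.normal(size=(8, 8))
W_priv_a = rng.normal(size=(8, 8))
W_dec = rng.normal(size=(16, 8))  # decoder: [shared, private] -> input space

s_t, s_a = encode(x_text, W_shared), encode(x_audio, W_shared)
p_t, p_a = encode(x_text, W_priv_t), encode(x_audio, W_priv_a)

# 1) Similarity loss: pull the two modalities' shared representations
#    together (MSE used here as one plausible choice).
loss_sim = np.mean((s_t - s_a) ** 2)

# 2) Difference loss: push shared and private subspaces apart, via the
#    squared Frobenius norm of their cross-correlation (a common
#    orthogonality penalty, assumed here).
loss_diff = np.sum((s_t.T @ p_t) ** 2) + np.sum((s_a.T @ p_a) ** 2)

# 3) Reconstruction loss: rebuild each modality from [shared, private];
#    this is what lets the model fill in a missing modality at test time.
rec_t = np.concatenate([s_t, p_t], axis=1) @ W_dec
rec_a = np.concatenate([s_a, p_a], axis=1) @ W_dec
loss_rec = np.mean((rec_t - x_text) ** 2) + np.mean((rec_a - x_audio) ** 2)

# 4) Cycle-consistency loss: re-encoding a reconstruction should return
#    to the original shared representation.
loss_cyc = (np.mean((encode(rec_t, W_shared) - s_t) ** 2)
            + np.mean((encode(rec_a, W_shared) - s_a) ** 2))

# The training objective would combine the four terms (weights assumed equal).
total = loss_sim + loss_diff + loss_rec + loss_cyc
```

In training, the four terms would be weighted and minimized jointly by gradient descent; the ablation results in the abstract suggest the cycle-consistency term is among the most influential.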