Interpretable hybrid models of Kolmogorov–Arnold Networks and transformer for mental health classification in low-resource languages: a Malay social media case study

Bibliographic Details
Main Author: Ahmad, Zaaba
Format: Thesis
Language: en
Published: 2025
Online Access: https://ir.uitm.edu.my/id/eprint/132613/1/132613.pdf
https://ir.uitm.edu.my/id/eprint/132613/
Description
Summary: Depression, anxiety, and stress (DAS) are among the most common global mental health disorders. Social media has become a key outlet where individuals express their psychological states. This research contributes to computational linguistics and mental health informatics by enhancing the classification of DAS in Malay social media, a linguistically diverse and low-resource context marked by extensive colloquial usage. The study addresses several core challenges: the lack of a gold-standard corpus, limitations in existing language models, feature overlap, class imbalance, and issues of model interpretability. A gold-standard annotated corpus is developed using a hybrid strategy that combines expert validation, community affiliations, and self-reported data to ensure reliability and cultural relevance. To address linguistic and computational limitations, this study employs a range of Natural Language Processing (NLP) techniques, including Word2Vec embeddings, Recurrent Neural Networks (RNNs) with attention mechanisms, and transformer-based models such as BERT. To mitigate class imbalance and feature overlap, novel strategies, namely the Class-Aware Attention Model (CAAM) and the Balancing Class Weight Algorithm (BCWA), are introduced, achieving a strong macro average F1-score of 0.88. Further improvement is realised through the integration of Kolmogorov-Arnold Networks (KAN) with BERT. This hybrid KAN-BERT model, enhanced with residual connections, attains a macro average F1-score of 0.92. The structured approach of KAN improves model interpretability by clarifying feature importance, thereby enhancing trust and potential usability in clinical or community mental health settings. Overall, this study delivers a validated corpus, a domain-specific language model, and innovative neural network approaches tailored for low-resource languages.
These contributions significantly improve the accuracy and applicability of DAS classification in Malay-language social media, underscoring the role of NLP in addressing mental health challenges in underrepresented linguistic contexts.
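
Note on the reported metric: the summary reports macro average F1-scores and mentions class-weight balancing. The following is a minimal Python sketch of how a macro-averaged F1-score is computed, alongside a generic inverse-frequency class-weighting heuristic. It is not the thesis's BCWA or CAAM, whose details are not given in this record. The function names `macro_f1` and `inverse_frequency_weights` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def inverse_frequency_weights(y_true):
    """Generic heuristic (assumption, not BCWA): weight_c = N / (K * n_c)."""
    counts = Counter(y_true)
    n_samples, n_classes = len(y_true), len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


# Hypothetical toy data, not drawn from the thesis corpus.
labels = ["depression", "anxiety", "stress"]
y_true = ["depression", "depression", "anxiety", "stress", "stress", "stress"]
y_pred = ["depression", "anxiety", "anxiety", "stress", "stress", "depression"]

print(round(macro_f1(y_true, y_pred, labels), 4))
print(inverse_frequency_weights(y_true))
```

Because macro averaging gives each class equal weight regardless of its frequency, it is a common choice for imbalanced label sets such as the one described in the summary.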