Interpretable hybrid models of Kolmogorov–Arnold Networks and transformer for mental health classification in low-resource languages: a Malay social media case study
| Main Author: | |
|---|---|
| Format: | Thesis |
| Language: | en |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://ir.uitm.edu.my/id/eprint/132613/1/132613.pdf https://ir.uitm.edu.my/id/eprint/132613/ |
| Summary: | Depression, anxiety, and stress (DAS) are among the most common global mental health disorders. Social media has become a key outlet where individuals express their psychological states. This research contributes to computational linguistics and mental health informatics by enhancing the classification of DAS in Malay social media, a linguistically diverse and low-resource context marked by extensive colloquial usage. The study addresses several core challenges: the lack of a gold-standard corpus, limitations in existing language models, feature overlap, class imbalance, and issues of model interpretability. A gold-standard annotated corpus is developed using a hybrid strategy that combines expert validation, community affiliations, and self-reported data to ensure reliability and cultural relevance. To address linguistic and computational limitations, this study employs a range of Natural Language Processing (NLP) techniques, including Word2Vec embeddings, Recurrent Neural Networks (RNNs) with attention mechanisms, and transformer-based models such as BERT. To mitigate class imbalance and feature overlap, novel strategies, namely the Class-Aware Attention Model (CAAM) and the Balancing Class Weight Algorithm (BCWA), are introduced, achieving a macro average F1-score of 0.88. Further improvement is realised through the integration of Kolmogorov-Arnold Networks (KAN) with BERT. This hybrid KAN-BERT model, enhanced with residual connections, attains a macro average F1-score of 0.92. The structured approach of KAN improves model interpretability by clarifying feature importance, thereby enhancing trust and potential usability in clinical or community mental health settings. Overall, this study delivers a validated corpus, a domain-specific language model, and innovative neural network approaches tailored for low-resource languages. These contributions improve the accuracy and applicability of DAS classification in Malay-language social media, underscoring the role of NLP in addressing mental health challenges in underrepresented linguistic contexts. |
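
The summary reports results as macro average F1-scores, a metric that weights every class equally and is therefore commonly preferred under class imbalance. The following is a minimal sketch of how such a score is computed, not the thesis's evaluation code; the function name `macro_f1` and the toy labels are assumptions made purely for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only (not data from the thesis).
y_true = ["depression", "depression", "anxiety", "anxiety", "stress", "stress"]
y_pred = ["depression", "depression", "anxiety", "stress", "stress", "stress"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because each class contributes one equally weighted term, a model that neglects a minority class is penalised even when its overall accuracy stays high.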
