Depression Detection on Mandarin Text through Bert Model

Depression is currently one of the most prevalent mental disorders and its incidence has been rising significantly in Malaysia amid the Covid-19 pandemic. While previous studies have demonstrated the potential of artificial intelligence technology in analysing social media texts to detect signs of d...

Bibliographic Details
Main Authors: Yung, Teck Kiong, Cheah, Wai Shiang, Mahir, Perdana, Hamizan, Sharbini, Iwan Tri, Riyadi Yanto
Format: Article
Language: English
Published: Semarak Ilmu Publishing 2024
Subjects: QA76 Computer software
Online Access: http://ir.unimas.my/id/eprint/47245/1/ARASETV60_N2_PP295311.pdf
http://ir.unimas.my/id/eprint/47245/
https://semarakilmu.com.my/journals/index.php/applied_sciences_eng_tech/article/view/4716
https://doi.org/10.37934/araset.60.2.295311
id my.unimas.ir-47245
record_format eprints
spelling my.unimas.ir-47245 2025-01-03T06:40:34Z http://ir.unimas.my/id/eprint/47245/
citation Yung, Teck Kiong and Cheah, Wai Shiang and Mahir, Perdana and Hamizan, Sharbini and Iwan Tri, Riyadi Yanto (2024) Depression Detection on Mandarin Text through Bert Model. Journal of Advanced Research in Applied Sciences and Engineering Technology, 60 (2). pp. 295-311. ISSN 2462-1943. Article, PeerReviewed.
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic QA76 Computer software
description Depression is one of the most prevalent mental disorders, and its incidence in Malaysia has risen significantly during the Covid-19 pandemic. Previous studies have shown that artificial intelligence can detect signs of depression in social media text, but most of them have focused on English content. Because Mandarin is the second most widely spoken language worldwide, depression detection techniques tailored to Mandarin text are worth exploring. This research examines the effectiveness of the BERT model in text classification, specifically for detecting depression in Mandarin social media posts. The model is trained on the WU3D dataset, a collection of over 2 million texts sourced from Sina Weibo, a prominent Chinese social media platform. Because the dataset is imbalanced, text augmentation techniques were applied to assess whether they improve model performance. The BERT model trained on the original dataset outperformed the model trained on the augmented dataset, which suggests that BERT can handle the imbalanced dataset effectively without augmentation. The authors speculate that the augmented dataset did not introduce novel information during training. The highest-performing model achieved an accuracy of 88% on the testing dataset.
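The abstract does not say which text augmentation technique was used. As a rough illustration only, the sketch below balances a toy labelled corpus by oversampling the minority class with randomly character-swapped copies. The swap-based augmentation, the function names `random_swap` and `balance_by_augmentation`, and the toy sentences are all assumptions made for this example, not the authors' method or data.

```python
import random
from collections import Counter


def random_swap(text, n_swaps, rng):
    """Return a copy of `text` with `n_swaps` random pairs of characters swapped."""
    chars = list(text)
    if len(chars) < 2:
        return text
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(chars)), 2)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


def balance_by_augmentation(samples, seed=0):
    """Add augmented copies of minority-class texts until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    balanced = list(samples)  # original samples are kept unchanged
    for label, count in counts.items():
        pool = [text for text, lab in samples if lab == label]
        for _ in range(target - count):
            augmented = random_swap(rng.choice(pool), 1, rng)
            balanced.append((augmented, label))
    return balanced


# Toy, made-up sentences (hypothetical): label 1 = depressed, 0 = normal.
toy = [
    ("今天天气很好", 0),
    ("和朋友去吃饭", 0),
    ("周末去爬山了", 0),
    ("我最近总是失眠", 1),
]
balanced = balance_by_augmentation(toy)
print(Counter(label for _, label in balanced))
```

Copies produced this way only rearrange characters that are already in the corpus, which is one plausible reading of the authors' speculation that augmentation added no novel information.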
format Article
author Yung, Teck Kiong
Cheah, Wai Shiang
Mahir, Perdana
Hamizan, Sharbini
Iwan Tri, Riyadi Yanto
title Depression Detection on Mandarin Text through Bert Model