Data annotation architecture for automatic depression detection

Depression is a mood disorder that causes a person to feel sad and tired and to experience a prolonged lack of energy, irritability, and loss of interest in daily activities. Many scholars have contributed to identifying and curbing depression. One such effort is the development of a model that can id...

Bibliographic Details
Main Authors: Chang, Yun Yao; Nazlia Omar
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2023
Online Access: http://journalarticle.ukm.my/22537/1/03%20-.pdf
http://journalarticle.ukm.my/22537/
https://www.ukm.my/apjitm/
id my-ukm.journal.22537
record_format eprints
spelling my-ukm.journal.22537 2023-11-23T03:20:07Z Chang, Yun Yao and Nazlia Omar (2023) Data annotation architecture for automatic depression detection. Asia-Pacific Journal of Information Technology and Multimedia, 12 (1). pp. 39-56. ISSN 2289-2192. Penerbit Universiti Kebangsaan Malaysia, 2023-06. Peer-reviewed article.
institution Universiti Kebangsaan Malaysia
building Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
language English
description Depression is a mood disorder that causes a person to feel sad and tired and to experience a prolonged lack of energy, irritability, and loss of interest in daily activities. Many scholars have contributed to identifying and curbing depression. One such effort is the development of a model that can identify and predict depression among Twitter users. However, no high-quality labeled dataset of depression-related tweets is available so far. The purpose of this study is therefore to propose an architecture that collects data from social media such as Twitter to detect depression automatically. The study involves text analysis that begins with data scraping, followed by text processing, feature extraction, modeling and evaluation, and then document corpus analysis using TF-IDF and bag-of-words (BOW). A sentiment lexicon derived from two tools, TextBlob and VADER, was used to distinguish the emotions of words. Four machine learning classifiers, namely Logistic Regression, Decision Tree, Support Vector Machine (SVM) and K-Nearest Neighbour, were used to perform the classification. The final data set management and the use of Logistic Regression produced the expected high accuracy, precision, recall and F1-score in predicting depression. For the application, local Malaysian COVID-19 tweets were scraped using TWINT, with appropriate hashtags and keywords used to retrieve the tweets. The results show that the proposed architecture outperforms the baseline, achieving an F1-score of 92.876% with SVM+TF-IDF. This shows that the proposed data annotation architecture performs well in detecting depression.
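The annotation and feature-extraction steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon `TOY_LEXICON`, the threshold value, the sample tweets, and the helper names `tokenize`, `annotate` and `tf_idf` are all assumptions made for illustration. The toy lexicon stands in for the TextBlob and VADER scores, TF-IDF is computed by hand as tf * log(N / df), and the classification step is left out.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for TextBlob/VADER polarity scores.
TOY_LEXICON = {"sad": -0.8, "tired": -0.5, "hopeless": -0.9,
               "happy": 0.8, "great": 0.7}


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation and '#'."""
    tokens = (word.strip(".,!?#").lower() for word in text.split())
    return [t for t in tokens if t]


def annotate(text, threshold=-0.3):
    """Return 1 (possibly depressive) if the mean lexicon polarity of the
    known words is below the (assumed) threshold, else 0."""
    scores = [TOY_LEXICON[t] for t in tokenize(text) if t in TOY_LEXICON]
    polarity = sum(scores) / len(scores) if scores else 0.0
    return 1 if polarity < threshold else 0


def tf_idf(docs):
    """Return one {term: weight} dict per document, with
    weight = (term count / document length) * log(N / document frequency)."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample tweets, not taken from the study's dataset.
tweets = [
    "I feel so sad and tired today",
    "What a great and happy morning",
    "Feeling hopeless #lockdown",
]
labels = [annotate(t) for t in tweets]   # lexicon-derived annotations
vectors = tf_idf(tweets)                 # sparse TF-IDF features
```

In a fuller pipeline the `labels` and `vectors` would be passed to a classifier such as Logistic Regression or an SVM; terms shared across tweets (here "and") receive lower TF-IDF weights than terms unique to one tweet (here "sad").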
format Article
author Chang, Yun Yao
Nazlia Omar
title Data annotation architecture for automatic depression detection
publisher Penerbit Universiti Kebangsaan Malaysia
publishDate 2023
url http://journalarticle.ukm.my/22537/1/03%20-.pdf
http://journalarticle.ukm.my/22537/
https://www.ukm.my/apjitm/