Using the short-time fourier transform and ResNet to diagnose depression from speech data
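Main Authors: | Elfaki, Ayman; Asnawi, Ani Liza; Jusoh, Ahmad Zamani; Ismail, Ahmad Fadzil; Ibrahim, Siti Noorjannah; Mohamed Azmin, Nor Fadhillah; Nik Hashim, Nik Nur Wahidah |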
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | IEEE 2021 |
Subjects: | T Technology (General); TK Electrical engineering. Electronics. Nuclear engineering |
Online Access: | http://irep.iium.edu.my/97108/1/97108_Using%20the%20short-time%20fourier%20transform_Scopus.pdf http://irep.iium.edu.my/97108/2/97108_Using%20the%20short-time.pdf http://irep.iium.edu.my/97108/ https://ieeexplore.ieee.org/document/9673562 |
id |
my.iium.irep.97108 |
---|---|
record_format |
dspace |
spelling |
my.iium.irep.97108 2022-03-09T03:52:21Z http://irep.iium.edu.my/97108/
Elfaki, Ayman and Asnawi, Ani Liza and Jusoh, Ahmad Zamani and Ismail, Ahmad Fadzil and Ibrahim, Siti Noorjannah and Mohamed Azmin, Nor Fadhillah and Nik Hashim, Nik Nur Wahidah (2021) Using the short-time fourier transform and ResNet to diagnose depression from speech data. In: 2021 IEEE International Conference on Computing, ICOCO 2021, 17-19 November 2021, Virtual. Conference or Workshop Item, PeerReviewed. https://ieeexplore.ieee.org/document/9673562 doi:10.1109/ICOCO53166.2021.9673562 |
institution |
Universiti Islam Antarabangsa Malaysia |
building |
IIUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
International Islamic University Malaysia |
content_source |
IIUM Repository (IREP) |
url_provider |
http://irep.iium.edu.my/ |
language |
English |
topic |
T Technology (General); TK Electrical engineering. Electronics. Nuclear engineering |
description |
Depression is a common illness that affects many people, especially
since the onset of the COVID-19 pandemic. It often arises when a
person has difficulty coping with stressful life events. It can occur
at any point in a person's life, and it pervades all aspects of life.
Currently, depression diagnoses rely on patient interviews and
self-report questionnaires, which depend heavily on the patient's
honesty and the clinician's subjective experience. In this paper, we
investigate the viability of using the Short-Time Fourier Transform
(STFT) as a feature descriptor to objectively diagnose depression from
speech data. The dataset used in this research is that of the
Audio/Visual Emotion Challenge 2017 (AVEC 2017). The model is a
modified ResNet18 architecture that performs binary classification
(i.e., depressed or non-depressed). The STFT is computed from the
speech signal to generate a mel-spectrogram for training and testing
the model. The experiment shows that relying solely on the STFT as an
input feature yields an F1 score of 74.71% in classifying depression. |
format |
Conference or Workshop Item |
author |
Elfaki, Ayman; Asnawi, Ani Liza; Jusoh, Ahmad Zamani; Ismail, Ahmad Fadzil; Ibrahim, Siti Noorjannah; Mohamed Azmin, Nor Fadhillah; Nik Hashim, Nik Nur Wahidah |
author_sort |
Elfaki, Ayman |
title |
Using the short-time fourier transform and ResNet to diagnose depression from speech data |
publisher |
IEEE |
publishDate |
2021 |
url |
http://irep.iium.edu.my/97108/1/97108_Using%20the%20short-time%20fourier%20transform_Scopus.pdf http://irep.iium.edu.my/97108/2/97108_Using%20the%20short-time.pdf http://irep.iium.edu.my/97108/ https://ieeexplore.ieee.org/document/9673562 |
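The description states that the STFT of the speech signal is used to generate a mel-spectrogram that is fed to the classifier. The sketch below illustrates that feature-extraction step only, using the Python standard library. It is not the authors' implementation: the frame length, hop size, number of mel bands, the Hann window, the HTK-style mel formula, and the synthetic 1 kHz test tone are all assumptions chosen for illustration, and the ResNet18 classifier is omitted.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_power(signal, n_fft=64, hop=32):
    """Power spectrogram: one list of n_fft // 2 + 1 bin powers per frame."""
    window = hann(n_fft)
    # Precompute DFT twiddle factors for the non-negative frequency bins.
    twiddle = [[cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(n_fft)]
               for k in range(n_fft // 2 + 1)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(n_fft)]
        frames.append([abs(sum(s * w for s, w in zip(seg, row))) ** 2
                       for row in twiddle])
    return frames


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge frequencies: left edge, filter centres, right edge.
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Log-scaled (dB) mel spectrogram, indexed as [frame][mel band]."""
    power = stft_power(signal, n_fft, hop)
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    return [[10.0 * math.log10(sum(w * p for w, p in zip(row, frame)) + 1e-10)
             for row in bank] for frame in power]


# Toy input: a 1 kHz tone sampled at 8 kHz (a stand-in for a speech clip).
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(512)]
spec = log_mel_spectrogram(tone, sr)
print(len(spec), "frames x", len(spec[0]), "mel bands")
```

In practice the resulting frame-by-band matrix would be treated as a single-channel image and passed to a convolutional network; a library such as librosa or torchaudio would normally replace the hand-written DFT for real recordings.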