Staff View: On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine

Bibliographic Details
Main Authors:	Ashraf, Arselan; Gunawan, Teddy Surya; Arifin, Fatchul; Kartiwi, Mira; Sophian, Ali; Habaebi, Mohamed Hadi
Format:	Article
Language:	English
Published:	Institute of Advanced Engineering and Science (IAES) 2022
Subjects:	TK7885 Computer engineering
Online Access:	http://irep.iium.edu.my/100378/7/100378_On%20the%20audio-visual%20emotion%20recognition.pdf http://irep.iium.edu.my/100378/ http://section.iaesonline.com/index.php/IJEEI/article/view/3879/771 http://dx.doi.org/10.52549/ijeei.v10i3.3879

id	my.iium.irep.100378
record_format	dspace
spelling	my.iium.irep.1003782022-10-03T03:20:08Z http://irep.iium.edu.my/100378/ On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine Ashraf, Arselan Gunawan, Teddy Surya Arifin, Fatchul Kartiwi, Mira Sophian, Ali Habaebi, Mohamed Hadi TK7885 Computer engineering The advances in artificial intelligence and machine learning concerning emotion recognition have been enormous and in previously inconceivable ways. Inspired by the promising evolution in human-computer interaction, this paper is based on developing a multimodal emotion recognition system. This research encompasses two modalities as input, namely speech and video. In the proposed model, the input video samples are subjected to image pre-processing and image frames are obtained. The signal is pre-processed and transformed into the frequency domain for the audio input. The aim is to obtain Mel-spectrogram, which is processed further as images. Convolutional neural networks are used for training and feature extraction for both audio and video with different configurations. The fusion of outputs from two CNNs is done using two extreme learning machines. For classification, the proposed system incorporates a support vector machine. The model is evaluated using three databases, namely eNTERFACE, RML, and SAVEE. For the eNTERFACE dataset, the accuracy obtained without and with augmentation was 87.2% and 94.91%, respectively. The RML dataset yielded an accuracy of 98.5%, and for the SAVEE dataset, the accuracy reached 97.77%. Results achieved from this research are an illustration of the fruitful exploration and effectiveness of the proposed system. 
Institute of Advanced Engineering and Science (IAES) 2022-09 Article PeerReviewed application/pdf en http://irep.iium.edu.my/100378/7/100378_On%20the%20audio-visual%20emotion%20recognition.pdf Ashraf, Arselan and Gunawan, Teddy Surya and Arifin, Fatchul and Kartiwi, Mira and Sophian, Ali and Habaebi, Mohamed Hadi (2022) On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine. Indonesian Journal of Electrical Engineering and Informatics (IJEEI), 10 (3). pp. 684-697. E-ISSN 2089-3272 http://section.iaesonline.com/index.php/IJEEI/article/view/3879/771 http://dx.doi.org/10.52549/ijeei.v10i3.3879
institution	Universiti Islam Antarabangsa Malaysia
building	IIUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	International Islamic University Malaysia
content_source	IIUM Repository (IREP)
url_provider	http://irep.iium.edu.my/
language	English
topic	TK7885 Computer engineering
spellingShingle	TK7885 Computer engineering Ashraf, Arselan Gunawan, Teddy Surya Arifin, Fatchul Kartiwi, Mira Sophian, Ali Habaebi, Mohamed Hadi On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine
description	Recent advances in artificial intelligence and machine learning have substantially improved emotion recognition. Motivated by progress in human-computer interaction, this paper develops a multimodal emotion recognition system that takes two input modalities, speech and video. Video samples undergo image pre-processing to extract image frames. The audio signal is pre-processed and transformed into the frequency domain to obtain a Mel-spectrogram, which is then processed as an image. Convolutional neural networks (CNNs) with different configurations are used for training and feature extraction on the audio and video inputs. The outputs of the two CNNs are fused using two extreme learning machines, and a support vector machine performs the final classification. The model is evaluated on three databases: eNTERFACE, RML, and SAVEE. On eNTERFACE, accuracy was 87.2% without augmentation and 94.91% with augmentation. Accuracy reached 98.5% on RML and 97.77% on SAVEE. These results indicate the effectiveness of the proposed system.
format	Article
author	Ashraf, Arselan; Gunawan, Teddy Surya; Arifin, Fatchul; Kartiwi, Mira; Sophian, Ali; Habaebi, Mohamed Hadi
author_facet	Ashraf, Arselan; Gunawan, Teddy Surya; Arifin, Fatchul; Kartiwi, Mira; Sophian, Ali; Habaebi, Mohamed Hadi
author_sort	Ashraf, Arselan
title	On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine
title_sort	on the audio-visual emotion recognition using convolutional neural networks and extreme learning machine
publisher	Institute of Advanced Engineering and Science (IAES)
publishDate	2022
url	http://irep.iium.edu.my/100378/7/100378_On%20the%20audio-visual%20emotion%20recognition.pdf http://irep.iium.edu.my/100378/ http://section.iaesonline.com/index.php/IJEEI/article/view/3879/771 http://dx.doi.org/10.52549/ijeei.v10i3.3879
_version_	1746210127459909632
score	13.211869
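
Illustrative note on the Mel-spectrogram step mentioned in the description field: the record does not say how the Mel scale was computed, so the following is only a minimal sketch under stated assumptions. It uses the widely used HTK-style conversion, mel = 2595 * log10(1 + f / 700), and the function names, the 40-band count, and the 0 to 8000 Hz range are all hypothetical choices made for illustration, not values taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumed convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Return n_mels band-center frequencies (Hz), evenly spaced on the mel scale.

    The centers lie strictly between f_min and f_max, which is how the
    triangular filters of a Mel filter bank are usually positioned.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Hypothetical settings: 40 bands between 0 Hz and 8000 Hz.
    centers = mel_center_frequencies(40, 0.0, 8000.0)
    print([round(c, 1) for c in centers[:5]])
```

Because the centers are evenly spaced in mels rather than in Hz, they sit close together at low frequencies and spread out at high frequencies. Applying such a filter bank to each short-time spectrum yields a Mel-spectrogram of the kind a CNN can then treat as an image.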

Similar Items