Cultural dependency analysis for understanding speech emotion
Speech has been one of the major communication media for years and will remain so until video communication becomes widely available and easily accessible. Although numerous technologies have been developed to improve the effectiveness of speech communication systems, human interaction wit...
Main Authors: | Kamaruddin, Norhaslinda; Abdul Rahman, Abdul Wahab; Quek, Chai |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2012 |
Subjects: | TK7885 Computer engineering |
Online Access: | http://irep.iium.edu.my/8562/7/Cultural_dependency_analysis_for_understanding_speech_emotion.pdf http://irep.iium.edu.my/8562/ http://www.sciencedirect.com/science/article/pii/S0957417411015715 |
id |
my.iium.irep.8562 |
---|---|
record_format |
dspace |
spelling |
my.iium.irep.8562 2012-02-09T06:06:33Z http://irep.iium.edu.my/8562/ Cultural dependency analysis for understanding speech emotion Kamaruddin, Norhaslinda; Abdul Rahman, Abdul Wahab; Quek, Chai TK7885 Computer engineering Elsevier 2012 Article REM application/pdf en http://irep.iium.edu.my/8562/7/Cultural_dependency_analysis_for_understanding_speech_emotion.pdf Kamaruddin, Norhaslinda and Abdul Rahman, Abdul Wahab and Quek, Chai (2012) Cultural dependency analysis for understanding speech emotion. Expert Systems with Applications, 39. pp. 5115-5113. ISSN 0957-4174 http://www.sciencedirect.com/science/article/pii/S0957417411015715 |
institution |
Universiti Islam Antarabangsa Malaysia |
building |
IIUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
International Islamic University Malaysia |
content_source |
IIUM Repository (IREP) |
url_provider |
http://irep.iium.edu.my/ |
language |
English |
topic |
TK7885 Computer engineering |
description |
Speech has been one of the major communication media for years and will remain so until video communication becomes widely available and easily accessible. Although numerous technologies have been developed to improve the effectiveness of speech communication systems, human interaction with machines and robots is still far from ideal. It is acknowledged that humans can communicate effectively with each other through the telephony system. This situation motivates many researchers to study the human communication system in depth, with emphasis on its ability to express and infer emotion for effective social communication. Understanding the interlocutors’ emotion and recognizing the listeners’ perception is the key to boosting communication effectiveness and interaction. Nonetheless, the perceived emotion is subjective and very much dependent on culture, environment and the pre-emotional state of the listener. Attempts have been made to understand the influence of culture on speech emotion, and researchers have reported mixed findings that lead us to believe there are some common acoustical characteristics that enable similar emotions to be discriminated universally across cultures. Yet there are unique speech attributes that facilitate exclusive emotion recognition within a particular culture. Understanding culture dependency is thus important to the performance of a speech emotion recognition system.
In this paper, three speech emotion databases, namely Berlin Emo-db, NTU_American and NTU_Asian, were selected to represent European, American and Asian cultures respectively, focusing on three basic emotions (anger, happiness and sadness) with neutral acting as a reference. Different data arrangements, in accordance with varying degrees of culture dependency, were designed for the experimental setup to provide a better understanding of inter-cultural and intra-cultural effects in recognizing speech emotion. Features were extracted using the Mel Frequency Cepstral Coefficient (MFCC) method and classified with a neural network (Multi Layer Perceptron (MLP)) and fuzzy neural networks, namely Adaptive Network Fuzzy Inference System (ANFIS) and Generic Self-Organizing Fuzzy Neural Network (GenSOFNN), representing precise and linguistic fuzzy rule conjuncts respectively. The experimental results show that culture influences speech emotion recognition accuracy: 75% accuracy was recorded for generalized homogeneous intra-cultural experiments, whereas accuracy dropped almost to chance probability (25% for 4 classes) for both homogeneous and heterogeneous mixed-cultural inter-culture experiments. A two-stage culture-sensitive speech emotion recognition approach was subsequently proposed to discriminate culture and speech emotion. The results show the potential of the proposed technique to recognize culture-influenced speech emotion, which can be extended to many applications, for instance call centers and intelligent vehicles. Such analysis may help us better understand the culture dependency of speech emotion and, as a result, boost the accuracy of speech emotion recognition systems. |
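The two-stage, culture-sensitive idea mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses MFCC features with MLP, ANFIS and GenSOFNN classifiers, whereas the sketch below swaps in a simple nearest-centroid classifier and hand-made 2-D toy vectors standing in for MFCC features. The names `TwoStageRecognizer`, `TOY_DATA`, `culture_a` and `culture_b` are hypothetical and exist only for illustration. Stage one guesses the culture, stage two applies an emotion model trained only on that culture; `chance_level` reproduces the 25% figure quoted for 4 classes.

```python
from math import dist
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [fmean(column) for column in zip(*vectors)]


def fit_centroids(samples):
    """Map each label to the centroid of its vectors; samples are (vector, label) pairs."""
    grouped = {}
    for vector, label in samples:
        grouped.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in grouped.items()}


def nearest_label(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


class TwoStageRecognizer:
    """Hypothetical two-stage recognizer: culture first, then culture-specific emotion."""

    def __init__(self, data):
        # data: list of (feature_vector, culture, emotion) triples.
        self.culture_model = fit_centroids([(v, c) for v, c, _ in data])
        # One emotion model per culture, trained only on that culture's samples.
        self.emotion_models = {
            culture: fit_centroids([(v, e) for v, c, e in data if c == culture])
            for culture in self.culture_model
        }

    def predict(self, vector):
        culture = nearest_label(self.culture_model, vector)
        emotion = nearest_label(self.emotion_models[culture], vector)
        return culture, emotion


def chance_level(n_classes):
    """Accuracy of uniform random guessing over n_classes balanced classes."""
    return 1.0 / n_classes


# Toy data (assumption): the second feature signals opposite emotions in the
# two cultures, so a single culture-blind emotion model would be confused.
TOY_DATA = [
    ([0.0, 1.0], "culture_a", "anger"),
    ([0.2, 1.2], "culture_a", "anger"),
    ([0.0, -1.0], "culture_a", "sadness"),
    ([0.2, -1.2], "culture_a", "sadness"),
    ([10.0, -1.0], "culture_b", "anger"),
    ([10.2, -1.2], "culture_b", "anger"),
    ([10.0, 1.0], "culture_b", "sadness"),
    ([10.2, 1.2], "culture_b", "sadness"),
]

if __name__ == "__main__":
    recognizer = TwoStageRecognizer(TOY_DATA)
    print(recognizer.predict([0.1, 0.9]))
    print(recognizer.predict([10.1, 0.9]))
    print(chance_level(4))
```

In this toy setup the same second-feature value leads to different emotion labels depending on the culture chosen in stage one, which is the kind of dependency the description argues a culture-blind recognizer would miss.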
format |
Article |
author |
Kamaruddin, Norhaslinda; Abdul Rahman, Abdul Wahab; Quek, Chai |
author_sort |
Kamaruddin, Norhaslinda |
title |
Cultural dependency analysis for understanding speech emotion |
publisher |
Elsevier |
publishDate |
2012 |
url |
http://irep.iium.edu.my/8562/7/Cultural_dependency_analysis_for_understanding_speech_emotion.pdf http://irep.iium.edu.my/8562/ http://www.sciencedirect.com/science/article/pii/S0957417411015715 |