Cultural dependency analysis for understanding speech emotion

Speech has been one of the major communication media for years and will continue to be so until video communication becomes widely available and easily accessible. Although numerous technologies have been developed to improve the effectiveness of speech communication systems, human interaction wit...

Bibliographic Details
Main Authors: Kamaruddin, Norhaslinda; Abdul Rahman, Abdul Wahab; Quek, Chai
Format: Article
Language: English
Published: Elsevier 2012
Subjects: TK7885 Computer engineering
Online Access: http://irep.iium.edu.my/8562/7/Cultural_dependency_analysis_for_understanding_speech_emotion.pdf
http://irep.iium.edu.my/8562/
http://www.sciencedirect.com/science/article/pii/S0957417411015715
id my.iium.irep.8562
record_format dspace
spelling my.iium.irep.8562 2012-02-09T06:06:33Z http://irep.iium.edu.my/8562/
Cultural dependency analysis for understanding speech emotion
Kamaruddin, Norhaslinda; Abdul Rahman, Abdul Wahab; Quek, Chai
TK7885 Computer engineering
Speech has been one of the major communication media for years and will continue to be so until video communication becomes widely available and easily accessible. Although numerous technologies have been developed to improve the effectiveness of speech communication systems, human interaction with machines and robots is still far from ideal. It is acknowledged that humans can communicate effectively with each other through the telephony system. This situation motivates many researchers to study in depth the human communication system, with emphasis on its ability to express and infer emotion for effective social communication. Understanding the interlocutors’ emotion and recognizing the listeners’ perception is the key to boosting communication effectiveness and interaction. Nonetheless, the perceived emotion is subjective and very much dependent on culture, environment and the pre-emotional state of the listener. Attempts have been made to understand the influence of culture on speech emotion, and researchers have reported mixed findings that lead us to believe there are some common acoustical characteristics that enable similar emotions to be discriminated universally across cultures. Yet there are unique speech attributes that facilitate exclusive emotion recognition of a particular culture. Understanding culture dependency is thus important to the performance of a speech emotion recognition system.
In this paper, three speech emotion databases, namely the Berlin Emo-db, NTU_American and NTU_Asian datasets, were selected to represent European, American and Asian cultures respectively, focusing on three basic emotions (anger, happiness and sadness) with neutral acting as a reference. Different data arrangements, in accordance with varying degrees of culture dependency, were designed for the experimental setup to provide a better understanding of inter-cultural and intra-cultural effects in recognizing speech emotion. Features were extracted using the Mel Frequency Cepstral Coefficient (MFCC) method and classified with a neural network (Multi Layer Perceptron (MLP)) and fuzzy neural networks, namely the Adaptive Network Fuzzy Inference System (ANFIS) and the Generic Self-Organizing Fuzzy Neural Network (GenSOFNN), representing precise and linguistic fuzzy rule conjuncts respectively.
From the experimental results, it can be observed that culture influences speech emotion recognition accuracy. An accuracy of 75% was recorded for generalized homogeneous intra-cultural experiments, whereas accuracy dropped to almost as low as chance probability (25% for 4 classes) for both homogeneous and heterogeneous mixed-cultural inter-culture experiments. A two-stage culture-sensitive speech emotion recognition approach was subsequently proposed to discriminate culture and speech emotion. Results of the analysis show the potential of the proposed technique to recognize culture-influenced speech emotion, which can be extended to many applications, for instance call centers and intelligent vehicles. Such analysis may help us to better understand the culture dependency of speech emotion and, as a result, boost the accuracy of speech emotion recognition systems.
Elsevier 2012 Article REM application/pdf en http://irep.iium.edu.my/8562/7/Cultural_dependency_analysis_for_understanding_speech_emotion.pdf
Kamaruddin, Norhaslinda and Abdul Rahman, Abdul Wahab and Quek, Chai (2012) Cultural dependency analysis for understanding speech emotion. Expert Systems with Applications, 39. pp. 5115-5133. ISSN 0957-4174 http://www.sciencedirect.com/science/article/pii/S0957417411015715
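The two-stage culture-sensitive idea mentioned in the abstract (first discriminate the culture, then recognize the emotion with a model specific to that culture) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the record does not describe the authors' implementation, the `NearestCentroid` and `TwoStageRecognizer` classes are hypothetical, a nearest-centroid rule stands in for the MLP, ANFIS and GenSOFNN classifiers, and synthetic 2-D points stand in for MFCC features. The snippet also shows the chance-level arithmetic for four classes.

```python
import math
import random

EMOTIONS = ["anger", "happiness", "sadness", "neutral"]
CULTURES = ["European", "American", "Asian"]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


class NearestCentroid:
    """Toy stand-in classifier (an assumption): predicts the closest class mean."""

    def fit(self, features, labels):
        grouped = {}
        for vec, label in zip(features, labels):
            grouped.setdefault(label, []).append(vec)
        self.centroids = {lab: centroid(vecs) for lab, vecs in grouped.items()}
        return self

    def predict(self, vec):
        return min(self.centroids, key=lambda lab: math.dist(vec, self.centroids[lab]))


class TwoStageRecognizer:
    """Stage 1 guesses the culture; stage 2 applies that culture's emotion model."""

    def fit(self, features, cultures, emotions):
        self.culture_model = NearestCentroid().fit(features, cultures)
        self.emotion_models = {}
        for culture in set(cultures):
            rows = [i for i, c in enumerate(cultures) if c == culture]
            self.emotion_models[culture] = NearestCentroid().fit(
                [features[i] for i in rows], [emotions[i] for i in rows]
            )
        return self

    def predict(self, vec):
        culture = self.culture_model.predict(vec)
        return culture, self.emotion_models[culture].predict(vec)


def make_toy_data(seed=0, per_class=20):
    """Synthetic 2-D points; NOT real MFCC features and NOT data from the paper."""
    rng = random.Random(seed)
    features, cultures, emotions = [], [], []
    for ci, culture in enumerate(CULTURES):
        for ei, emotion in enumerate(EMOTIONS):
            for _ in range(per_class):
                # Cultures are separated along x, emotions along y.
                features.append([10 * ci + rng.gauss(0, 0.3), 2 * ei + rng.gauss(0, 0.3)])
                cultures.append(culture)
                emotions.append(emotion)
    return features, cultures, emotions


# Chance level for a uniform guess over the four classes named in the abstract.
chance_level = 1 / len(EMOTIONS)

features, cultures, emotions = make_toy_data()
model = TwoStageRecognizer().fit(features, cultures, emotions)
print(chance_level, model.predict([10.0, 4.0]))
```

In this toy setup the culture decision simply routes each sample to one of three emotion models, so a stage-1 error sends the sample to the wrong emotion model. Whether the published method handles that case differently is not stated in this record.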
institution Universiti Islam Antarabangsa Malaysia
building IIUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider International Islamic University Malaysia
content_source IIUM Repository (IREP)
url_provider http://irep.iium.edu.my/
topic TK7885 Computer engineering