Fuzzy based multi-source data fusion for children's age estimation


Saved in:
Bibliographic Details
Main Authors: Mirhassani, S.M., Zourmand, A., Ting, H.N.
Format: Conference or Workshop Item
Language: English
Published: 2014
Online Access: http://eprints.um.edu.my/11391/1/0001.pdf
http://eprints.um.edu.my/11391/
Description
Summary: Estimating a speaker's age is a challenging problem in speech processing. This paper presents a novel approach to it. The method employs a "divide and conquer" strategy in which the speech data are divided into six groups based on vowel class. Mel-frequency cepstral coefficients are then computed for each group, and single-layer feed-forward neural networks are applied to the features to make a primary decision. The extreme learning machine (ELM) method is used to train the classifiers. Subsequently, fuzzy data fusion is employed to provide an overall decision by aggregating the classifiers' outputs. The results are then compared with vowel-independent age estimation based on ELM and on other well-known classification methods, including support vector machine and K-nearest neighbor. The speech data comprise six Malay vowels collected from 360 Malay children aged between 7 and 12 years. Experiments conducted on six age groups revealed that fuzzy fusion of the classifiers' outputs resulted in a considerable improvement of up to 72.63% in age estimation accuracy. Moreover, the fuzzy fusion of decisions aggregated complementary information about a speaker's age from varied speech sources.
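
The ELM training step mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the class name ELMClassifier, the helper make_toy_data, the tanh activation, the hidden-layer size, and the synthetic two-dimensional data are all assumptions made for illustration. It shows only the general ELM rule for a single-hidden-layer network, where input weights are drawn at random and left fixed and the output weights are obtained in one step by least squares. MFCC extraction, the per-vowel grouping, and the fuzzy fusion stage are not covered here.

```python
import numpy as np


class ELMClassifier:
    """Single-hidden-layer feed-forward network trained with the ELM rule.

    Hypothetical illustration; not the classifier used in the paper.
    """

    def __init__(self, n_hidden=40, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations (tanh is an assumed choice).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Input weights and biases are drawn once and never updated.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Output weights: least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def decision_scores(self, X):
        # One score per class; a fusion stage could consume these.
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta

    def predict(self, X):
        return self.classes_[np.argmax(self.decision_scores(X), axis=1)]


def make_toy_data(seed=1, n_per_class=30):
    """Three well-separated synthetic clusters standing in for feature vectors."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
    X = np.vstack(
        [c + 0.5 * rng.standard_normal((n_per_class, 2)) for c in centers]
    )
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    model = ELMClassifier().fit(X, y)
    print("training accuracy:", np.mean(model.predict(X) == y))
```

In a setup like the one the summary describes, one such classifier would be trained per vowel group, and the per-class scores from decision_scores would be the natural input to a fusion stage.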