Local DTW Coefficients and Pitch Feature for Back-Propagation NN Digits Recognition

This paper presents a method for extracting speech features, originally derived from LPC, along the dynamic time warping path. The extracted feature coefficients serve as input to a back-propagation neural network. The coefficients are normalized with respect to the reference pattern ac...

Bibliographic Details
Main Authors: Sudirman, Rubita, Salleh, Sh-Hussain, Salleh, Shaharuddin
Format: Conference or Workshop Item
Language: en
Published: 2006
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/1999/1/rubita06_NCS_Chiang_Mai.pdf
http://eprints.utm.my/1999/
http://www.actapress.com/PaperInfo.aspx?PaperID=23639
author Sudirman, Rubita
Salleh, Sh-Hussain
Salleh, Shaharuddin
building UTM Library
collection Institutional Repository
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
continent Asia
country Malaysia
description This paper presents a method for extracting speech features, originally derived from linear predictive coding (LPC), along the dynamic time warping path. The extracted feature coefficients serve as input to a back-propagation neural network. The coefficients are normalized with respect to the reference pattern according to the average number of frames over the recorded samples. This is necessary because a neural network (NN) requires a fixed number of input nodes for every input class. The new feature processing uses the well-known frame matching technique Dynamic Time Warping (DTW) to fix the input to a fixed number of input vectors. The LPC feature vectors of the source frames are aligned to the template using our DTW frame fixing (DTW-FF) algorithm. With frame fixing, the source and template frames are adjusted so that they have the same number of frames. Speech recognition is performed using the back-propagation neural network (BPNN) algorithm to enhance recognition performance. The results compare DTW using LPC coefficients with BPNN using DTW-FF coefficients. A pitch feature is then added to investigate the improvement over the previous experiment for different numbers of hidden neurons.
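
The frame-fixing idea in the description can be illustrated with a minimal sketch. This is not the authors' DTW-FF algorithm, whose details the record does not give. It assumes a Euclidean local distance, the basic three-way DTW step pattern, and averaging of all source frames that the warping path maps to the same template frame. The names `dtw_path`, `frame_fix`, and `to_input_vector` are hypothetical.

```python
from math import dist, inf


def dtw_path(source, template):
    """Return the DTW alignment as a list of (source_idx, template_idx) pairs."""
    n, m = len(source), len(template)
    # cost[i][j]: minimal accumulated distance aligning source[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(source[i - 1], template[j - 1])  # assumed Euclidean distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def frame_fix(source, template):
    """Warp `source` so that it has exactly len(template) frames."""
    buckets = [[] for _ in template]
    for si, ti in dtw_path(source, template):
        buckets[ti].append(source[si])
    # Assumption: average the source frames matched to each template frame.
    return [[sum(col) / len(frames) for col in zip(*frames)] for frames in buckets]


def to_input_vector(fixed_frames):
    """Flatten the fixed-length frame sequence into one NN input vector."""
    return [c for frame in fixed_frames for c in frame]


# Toy 2-coefficient frames standing in for LPC feature vectors.
template = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
long_source = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
short_source = [[0.0, 0.0], [2.0, 2.0]]

fixed_long = frame_fix(long_source, template)
fixed_short = frame_fix(short_source, template)
```

Both utterances come out with the template's frame count, so `to_input_vector` yields vectors of the same length regardless of the original duration. This is the property that a network with a fixed number of input nodes requires.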
format Conference or Workshop Item
id my.utm.eprints-1999
institution Universiti Teknologi Malaysia
language en
publishDate 2006
record_format eprints
spelling my.utm.eprints-1999 2017-08-24T04:26:09Z http://eprints.utm.my/1999/ 2006-03-29 Conference or Workshop Item PeerReviewed application/pdf en http://eprints.utm.my/1999/1/rubita06_NCS_Chiang_Mai.pdf Sudirman, Rubita and Salleh, Sh-Hussain and Salleh, Shaharuddin (2006) Local DTW Coefficients and Pitch Feature for Back-Propagation NN Digits Recognition. In: IASTED International Conference on Networks and Communication System, 29-31 March 2006, Chiang Mai, Thailand. http://www.actapress.com/PaperInfo.aspx?PaperID=23639
title Local DTW Coefficients and Pitch Feature for Back-Propagation NN Digits Recognition
topic TK Electrical engineering. Electronics Nuclear engineering
url_provider http://eprints.utm.my/