Staff View: An improved method in speech signal input representation based on DTW technique for NN speech recognition system

An improved method in speech signal input representation based on DTW technique for NN speech recognition system

A pre-processing of linear predictive coefficient (LPC) features for preparation of reliable reference templates for the set of words to be recognized using the artificial neural network is presented in this paper. The paper also proposes the use of pitch feature derived from the recorded speech dat...

Bibliographic Details
Main Authors:	Sudirman, Rubita, Salleh, Sh. Hussain, Salleh, Shaharuddin
Format:	Article
Language:	English
Published:	Penerbit UTM Press 2007
Subjects:	TK Electrical engineering. Electronics Nuclear engineering
Online Access:	http://eprints.utm.my/id/eprint/8038/3/281 http://eprints.utm.my/id/eprint/8038/4/RubitaSudirman2007_AnImprovedMethodinSpeechSignal.pdf http://eprints.utm.my/id/eprint/8038/

id	my.utm.8038
record_format	eprints
spelling	my.utm.8038 2017-11-01T04:17:27Z http://eprints.utm.my/id/eprint/8038/ An improved method in speech signal input representation based on DTW technique for NN speech recognition system Sudirman, Rubita Salleh, Sh. Hussain Salleh, Shaharuddin TK Electrical engineering. Electronics Nuclear engineering A pre-processing of linear predictive coefficient (LPC) features for preparation of reliable reference templates for the set of words to be recognized using the artificial neural network is presented in this paper. The paper also proposes the use of pitch feature derived from the recorded speech data as another input feature. The Dynamic Time Warping algorithm (DTW) is the back–bone of the newly developed algorithm called DTW fixing frame algorithm (DTW–FF) which is designed to perform template matching for the input preprocessing. The purpose of the new algorithm is to align the input frames in the test set to the template frames in the reference set. This frame normalization is required since NN is designed to compare data of the same length, however same speech varies in their length most of the time. By doing frame fixing, the input frames and the reference frames are adjusted to the same number of frames according to the reference frames. Another task of the study is to extract pitch features using the Harmonic Filter algorithm. After pitch extraction and linear predictive coefficient (LPC) features fixed to a desired number of frames, speech recognition using neural network can be performed and results showed a very promising solution. Result showed that as high as 98% recognition can be achieved using combination of two features mentioned above. At the end of the paper, a convergence comparison between conjugate gradient descent (CGD), Quasi–Newton, and steepest gradient descent (SGD) search direction is performed and results show that the CGD outperformed the Newton and SGD.
Penerbit UTM Press 2007-06 Article PeerReviewed text/html en http://eprints.utm.my/id/eprint/8038/3/281 application/pdf en http://eprints.utm.my/id/eprint/8038/4/RubitaSudirman2007_AnImprovedMethodinSpeechSignal.pdf Sudirman, Rubita and Salleh, Sh. Hussain and Salleh, Shaharuddin (2007) An improved method in speech signal input representation based on DTW technique for NN speech recognition system. Jurnal Teknologi D, 46 (D). pp. 135-149. ISSN 2180-3722 DOI:10.11113/jt.v46.291
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	TK Electrical engineering. Electronics Nuclear engineering
spellingShingle	TK Electrical engineering. Electronics Nuclear engineering Sudirman, Rubita Salleh, Sh. Hussain Salleh, Shaharuddin An improved method in speech signal input representation based on DTW technique for NN speech recognition system
description	A pre-processing of linear predictive coefficient (LPC) features for preparation of reliable reference templates for the set of words to be recognized using the artificial neural network is presented in this paper. The paper also proposes the use of pitch feature derived from the recorded speech data as another input feature. The Dynamic Time Warping algorithm (DTW) is the back–bone of the newly developed algorithm called DTW fixing frame algorithm (DTW–FF) which is designed to perform template matching for the input preprocessing. The purpose of the new algorithm is to align the input frames in the test set to the template frames in the reference set. This frame normalization is required since NN is designed to compare data of the same length, however same speech varies in their length most of the time. By doing frame fixing, the input frames and the reference frames are adjusted to the same number of frames according to the reference frames. Another task of the study is to extract pitch features using the Harmonic Filter algorithm. After pitch extraction and linear predictive coefficient (LPC) features fixed to a desired number of frames, speech recognition using neural network can be performed and results showed a very promising solution. Result showed that as high as 98% recognition can be achieved using combination of two features mentioned above. At the end of the paper, a convergence comparison between conjugate gradient descent (CGD), Quasi–Newton, and steepest gradient descent (SGD) search direction is performed and results show that the CGD outperformed the Newton and SGD.
format	Article
author	Sudirman, Rubita Salleh, Sh. Hussain Salleh, Shaharuddin
author_facet	Sudirman, Rubita Salleh, Sh. Hussain Salleh, Shaharuddin
author_sort	Sudirman, Rubita
title	An improved method in speech signal input representation based on DTW technique for NN speech recognition system
title_short	An improved method in speech signal input representation based on DTW technique for NN speech recognition system
title_full	An improved method in speech signal input representation based on DTW technique for NN speech recognition system
title_fullStr	An improved method in speech signal input representation based on DTW technique for NN speech recognition system
title_full_unstemmed	An improved method in speech signal input representation based on DTW technique for NN speech recognition system
title_sort	improved method in speech signal input representation based on dtw technique for nn speech recognition system
publisher	Penerbit UTM Press
publishDate	2007
url	http://eprints.utm.my/id/eprint/8038/3/281 http://eprints.utm.my/id/eprint/8038/4/RubitaSudirman2007_AnImprovedMethodinSpeechSignal.pdf http://eprints.utm.my/id/eprint/8038/
_version_	1643644908148359168
score	13.211869
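
Illustrative note: the description field in this record says that the DTW fixing frame algorithm adjusts the input frames to the same number of frames as the reference template, so that a neural network can receive inputs of equal length. The record does not give the algorithm itself, so the Python sketch below is only an illustration under stated assumptions: it uses textbook dynamic time warping with Euclidean frame distances, and it averages the input frames that the warping path assigns to each reference frame. The function names dtw_path and fix_frames are hypothetical and are not taken from the paper.

```python
import numpy as np


def dtw_path(test, ref):
    """Return a textbook DTW warping path as (test_index, ref_index) pairs.

    test and ref are 2-D arrays of shape (frames, feature_dims).
    """
    n, m = len(test), len(ref)
    # Euclidean distance between every test frame and every reference frame.
    dist = np.linalg.norm(test[:, None, :] - ref[None, :, :], axis=2)

    # Accumulated cost with a one-cell border of infinities.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )

    # Backtrack from the last cell to the first along cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda cell: cost[cell])
        path.append((i - 1, j - 1))
    return path[::-1]


def fix_frames(test, ref):
    """Warp test to the frame count of ref (assumed averaging scheme).

    Each output frame is the mean of the test frames that DTW aligned to
    the corresponding reference frame. This is an assumption made for
    illustration, not the published DTW fixing frame procedure.
    """
    fixed = np.zeros_like(ref, dtype=float)
    counts = np.zeros(len(ref))
    for ti, rj in dtw_path(test, ref):
        fixed[rj] += test[ti]
        counts[rj] += 1
    # Every reference frame appears on the path, so counts is never zero.
    return fixed / counts[:, None]


if __name__ == "__main__":
    # Toy one-dimensional "features": a 6-frame input and a 3-frame template.
    test = np.array([[0.0], [0.0], [1.0], [2.0], [2.0], [2.0]])
    ref = np.array([[0.0], [1.0], [2.0]])
    print(fix_frames(test, ref).shape)
```

In this sketch the output always has as many frames as the reference, which is the property the description attributes to frame fixing. Real LPC or pitch features would replace the toy arrays.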