Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment

Humans are often able to compensate for noise degradation and uncertainty in speech information by augmenting the received audio with visual information. Such bimodal perception generates a rich combination of information that can be used in the recognition of speech. However, due to wide variabilit...

Bibliographic Details
Main Authors:	Ibrahim, M. Z.; Mulvaney, D. J.; Abas, M. F.
Format:	Article
Language:	en
Published:	Asian Research Publishing Network (ARPN) 2015
Subjects:	TK Electrical engineering. Electronics Nuclear engineering
Online Access:	http://umpir.ump.edu.my/id/eprint/12890/1/jeas_1215_3203.pdf http://umpir.ump.edu.my/id/eprint/12890/7/fkee-2015-zamri-Feature-Fusion%20based%20Audio-Visual.pdf http://umpir.ump.edu.my/id/eprint/12890/ http://www.arpnjournals.org/jeas/research_papers/rp_2015/jeas_1215_3203.pdf

_version_	1831523556898373632
author	M. Z., Ibrahim Mulvaney, D. J. M. F., Abas
author_facet	M. Z., Ibrahim Mulvaney, D. J. M. F., Abas
author_sort	M. Z., Ibrahim
building	UMPSA Library
collection	Institutional Repository
content_provider	Universiti Malaysia Pahang Al-Sultan Abdullah
content_source	UMPSA Institutional Repository
continent	Asia
country	Malaysia
description	Humans are often able to compensate for noise degradation and uncertainty in speech information by augmenting the received audio with visual information. Such bimodal perception generates a rich combination of information that can be used in the recognition of speech. However, due to wide variability in the lip movement involved in articulation, not all speech can be substantially improved by audio-visual integration. This paper describes a feature-fusion audio-visual speech recognition (AVSR) system that extracts lip geometry from the mouth region using a combination of a skin color filter, border following and a convex hull, and performs classification using a hidden Markov model. The new approach is compared with a conventional audio-only system under simulated ambient noise conditions that affect the spoken phrases. The experimental results demonstrate that, in the presence of audio noise, the audio-visual approach significantly improves speech recognition accuracy compared with the audio-only approach.
format	Article
id	my.ump.umpir.12890
institution	Universiti Malaysia Pahang
language	en en
publishDate	2015
publisher	Asian Research Publishing Network (ARPN)
record_format	eprints
spelling	my.ump.umpir.128902018-03-20T06:51:39Z http://umpir.ump.edu.my/id/eprint/12890/ Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment M. Z., Ibrahim Mulvaney, D. J. M. F., Abas TK Electrical engineering. Electronics Nuclear engineering Humans are often able to compensate for noise degradation and uncertainty in speech information by augmenting the received audio with visual information. Such bimodal perception generates a rich combination of information that can be used in the recognition of speech. However, due to wide variability in the lip movement involved in articulation, not all speech can be substantially improved by audio-visual integration. This paper describes a feature-fusion audio-visual speech recognition (AVSR) system that extracts lip geometry from the mouth region using a combination of skin color filter, border following and convex hull, and classification using a Hidden Markov Model. The comparison of the new approach with conventional audio-only system is made when operating under simulated ambient noise conditions that affect the spoken phrases. The experimental results demonstrate that, in the presence of audio noise, the audio-visual approach significantly improves speech recognition accuracy compared with audio-only approach. Asian Research Publishing Network (ARPN) 2015-12-19 Article PeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/12890/1/jeas_1215_3203.pdf application/pdf en http://umpir.ump.edu.my/id/eprint/12890/7/fkee-2015-zamri-Feature-Fusion%20based%20Audio-Visual.pdf M. Z., Ibrahim and Mulvaney, D. J. and M. F., Abas (2015) Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment. ARPN Journal of Engineering and Applied Sciences, 10 (23). pp. 17521-17527. ISSN 1819-6608. (Published) http://www.arpnjournals.org/jeas/research_papers/rp_2015/jeas_1215_3203.pdf
spellingShingle	TK Electrical engineering. Electronics Nuclear engineering M. Z., Ibrahim Mulvaney, D. J. M. F., Abas Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
title	Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
title_full	Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
title_fullStr	Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
title_full_unstemmed	Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
title_short	Feature-Fusion based Audio-Visual Speech Recognition using Lip Geometry Features in Noisy Environment
title_sort	feature-fusion based audio-visual speech recognition using lip geometry features in noisy environment
topic	TK Electrical engineering. Electronics Nuclear engineering
url	http://umpir.ump.edu.my/id/eprint/12890/1/jeas_1215_3203.pdf http://umpir.ump.edu.my/id/eprint/12890/7/fkee-2015-zamri-Feature-Fusion%20based%20Audio-Visual.pdf http://umpir.ump.edu.my/id/eprint/12890/ http://www.arpnjournals.org/jeas/research_papers/rp_2015/jeas_1215_3203.pdf
url_provider	http://umpir.ump.edu.my/
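
The description field above outlines two steps that a short example can make concrete: deriving lip geometry from a convex hull around the mouth contour, and fusing audio and visual features at the feature level. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the choice of width, height and area as geometry features, and the nearest-frame alignment of the two streams are all hypothetical; the record does not specify them. The skin color filter, border following and hidden Markov model stages are not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Convex hull by Andrew's monotone chain, in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def lip_geometry(contour: Sequence[Point]) -> List[float]:
    """Assumed geometry features: hull width, hull height and hull area."""
    hull = convex_hull(contour)
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Shoelace formula over consecutive hull vertices.
    area = 0.5 * abs(sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1])))
    return [width, height, area]


def fuse_features(audio_frames: Sequence[Sequence[float]],
                  visual_frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Feature fusion by concatenation.

    Video usually has fewer frames than audio, so each audio frame is paired
    with the visual frame at the same relative position (an assumption).
    """
    n_a, n_v = len(audio_frames), len(visual_frames)
    if n_v == 0:
        raise ValueError("at least one visual frame is required")
    fused = []
    for i, audio in enumerate(audio_frames):
        j = min(i * n_v // n_a, n_v - 1)
        fused.append(list(audio) + list(visual_frames[j]))
    return fused
```

In a complete system, each fused vector would serve as one observation in the sequence passed to the classifier, so the audio and visual evidence are modeled jointly rather than combined after separate recognition.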

Similar Items