Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir

The dissertation proposes an Arabic digit speech recognition model based on a recurrent neural network. The speech recognition model selects the best speech signal representation by extracting Mel-Frequency Cepstral Coefficients (MFCCs) after the signal has been processed for noise reduction and digits sep...
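
As a rough illustration of the mel-frequency step behind MFCC extraction, the sketch below converts between hertz and mels and spaces filter centres evenly on the mel scale. The record does not say which mel formula, filter count, or frequency range the thesis uses; the 2595 * log10(1 + f / 700) convention and the values of 26 filters over 0 to 8000 Hz are assumptions chosen for illustration only.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Convert hertz to mels (common 2595 * log10 convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters: int, low_hz: float, high_hz: float) -> list:
    """Return filter centre frequencies (in Hz) spaced evenly on the mel scale.

    The centres lie strictly between low_hz and high_hz, so they get
    further apart in hertz as frequency increases.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]


if __name__ == "__main__":
    # Hypothetical settings: 26 filters between 0 Hz and 8000 Hz.
    for centre in mel_center_frequencies(26, 0.0, 8000.0):
        print(round(centre, 1))
```

A full MFCC pipeline would additionally frame the signal, take a power spectrum, apply triangular filters at these centres, take logarithms, and apply a discrete cosine transform; those steps are omitted here.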

Bibliographic Details
Main Author: Abdul Aziz Saleh, Mahfoudh Ba Wazir
Format: Thesis
Published: 2018
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://studentsrepo.um.edu.my/9521/1/AbdulAziz_Saleh_Mahfoudh_Ba_Wazir.jpg
http://studentsrepo.um.edu.my/9521/11/abdulaziz.pdf
http://studentsrepo.um.edu.my/9521/
_version_ 1831435362415673344
author Abdul Aziz Saleh, Mahfoudh Ba Wazir
author_facet Abdul Aziz Saleh, Mahfoudh Ba Wazir
author_sort Abdul Aziz Saleh, Mahfoudh Ba Wazir
building UM Library
collection Institutional Repository
content_provider Universiti Malaya
content_source UM Student Repository
continent Asia
country Malaysia
description The dissertation proposes an Arabic digit speech recognition model based on a recurrent neural network. The speech recognition model selects the best speech signal representation by extracting Mel-Frequency Cepstral Coefficients (MFCCs) after the signal has been processed for noise reduction and digit separation. The features extracted from each spoken digit are fed into a network with long short-term memory (LSTM) cells. LSTM cells can learn long-term temporal dependencies and address the vanishing gradient problem associated with plain RNNs. A dataset of 1040 samples of spoken Arabic digits from different dialects is used in this study: 840 samples are used to train the network and the remaining 200 samples are used for testing. Model training is carried out on a GPU. The LSTM learning parameters are tuned to reach a higher accuracy of 94% during model training. Testing the best-tuned model shows that it is 69% accurate in recognizing spoken Arabic digit samples. The highest per-digit accuracy, 80%, is obtained for the digit zero.
format Thesis
id my.um.stud-9521
institution Universiti Malaya
publishDate 2018
record_format eprints
spelling my.um.stud-9521 2020-12-15T00:03:04Z Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir Abdul Aziz Saleh, Mahfoudh Ba Wazir TK Electrical engineering. Electronics Nuclear engineering 2018-09 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/9521/1/AbdulAziz_Saleh_Mahfoudh_Ba_Wazir.jpg application/pdf http://studentsrepo.um.edu.my/9521/11/abdulaziz.pdf Abdul Aziz Saleh, Mahfoudh Ba Wazir (2018) Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir. Masters thesis, University of Malaya. http://studentsrepo.um.edu.my/9521/
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Abdul Aziz Saleh, Mahfoudh Ba Wazir
Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
title Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
title_full Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
title_fullStr Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
title_full_unstemmed Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
title_short Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
title_sort spoken arabic digits recognition using deep learning / abdulaziz saleh mahfoudh ba wazir
topic TK Electrical engineering. Electronics Nuclear engineering
url http://studentsrepo.um.edu.my/9521/1/AbdulAziz_Saleh_Mahfoudh_Ba_Wazir.jpg
http://studentsrepo.um.edu.my/9521/11/abdulaziz.pdf
http://studentsrepo.um.edu.my/9521/
url_provider http://studentsrepo.um.edu.my/
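
The description in this record says the extracted MFCC features are fed into a network of LSTM cells. As a minimal sketch of what one such cell computes, the code below implements a standard LSTM forward step in plain Python and runs it over a sequence of feature frames. It is not the thesis implementation: the helper names (`init_params`, `lstm_step`, `encode_sequence`), the 13 coefficients per frame, the hidden size, and the random initialisation are all hypothetical, and no training or digit classifier layer is included.

```python
import math
import random

GATES = ("f", "i", "g", "o")  # forget, input, candidate, output


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_size: int, hidden_size: int, scale: float = 0.1, seed: int = 0) -> dict:
    """Create small random weights and zero biases for each gate (illustrative only)."""
    rng = random.Random(seed)
    width = input_size + hidden_size  # each gate sees [x_t, h_{t-1}]
    params = {}
    for name in GATES:
        params["W_" + name] = [
            [rng.uniform(-scale, scale) for _ in range(width)]
            for _ in range(hidden_size)
        ]
        params["b_" + name] = [0.0] * hidden_size
    return params


def lstm_step(x: list, h_prev: list, c_prev: list, params: dict) -> tuple:
    """One LSTM time step: returns the new hidden state and cell state."""
    z = x + h_prev  # concatenate the input frame with the previous hidden state

    def gate(name, activation):
        rows = params["W_" + name]
        biases = params["b_" + name]
        return [
            activation(sum(w * v for w, v in zip(row, z)) + b)
            for row, b in zip(rows, biases)
        ]

    f = gate("f", sigmoid)    # how much of the old cell state to keep
    i = gate("i", sigmoid)    # how much of the candidate to write
    g = gate("g", math.tanh)  # candidate cell values
    o = gate("o", sigmoid)    # how much of the cell state to expose

    # The additive cell update is what lets gradients flow across many steps.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def encode_sequence(frames: list, params: dict, hidden_size: int) -> list:
    """Run the cell over all frames and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for frame in frames:
        h, c = lstm_step(frame, h, c, params)
    return h


if __name__ == "__main__":
    # Hypothetical input: 5 frames of 13 coefficients each.
    frames = [[0.1 * (t + 1)] * 13 for t in range(5)]
    print(encode_sequence(frames, init_params(13, 4), 4))
```

In a recognizer of the kind described, the final hidden state would typically be passed to a ten-way output layer, one class per digit; that layer and its training loop are left out of this sketch.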