Web application for lipreading

Bibliographic Details
Main Author: Lau, Yee Lin
Format: Final Year Project / Dissertation / Thesis
Published: 2024
Subjects:
Online Access: http://eprints.utar.edu.my/6818/1/2005403_LAU_YEE_LIN.pdf
http://eprints.utar.edu.my/6818/
Description
Summary: Communication is part of everyday life. However, some environments prevent the listener from hearing a message clearly. This project therefore aims to develop a web application that performs lipreading using an existing deep learning model, LipCoordNet. Users upload a video to the web application, which generates text and video output so that they can visualize the speech instead of listening to the sound. Users can also download the predicted text to their own device for future use. Based on the lipreading output, the average word error rate (WER) and character error rate (CER) were calculated for an Asian speaker and a Native speaker; the average WER and CER of the Asian speaker were higher than those of the Native speaker. To reduce the WER and CER for sentences spoken by Asian speakers, an attempt was made to train the LipCoordNet model on a dataset of Asian speakers. A dataset of 270 samples was collected, with 27 Asian speakers speaking 10 sentences each. To evaluate the usability of the web application, five respondents were selected to take part in system usability testing and contribute to the System Usability Scale (SUS) score. The SUS score obtained was 87.5, indicating that the system receives a grade of A with an adjective rating of Excellent.
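
The metrics named in the summary can be illustrated with a minimal Python sketch. It assumes the standard definitions: WER and CER as the Levenshtein edit distance between a reference and a predicted transcript, divided by the reference length in words or characters, and the SUS score as computed from ten questionnaire answers on a 1 to 5 scale. The function names, the example sentences, and the example answers are hypothetical and are not taken from the thesis; the record does not say how the thesis computed its scores or whether it normalized case and punctuation first.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def sus_score(responses: Sequence[int]) -> float:
    """SUS score (0 to 100) for one respondent's ten answers, each from 1 to 5."""
    if len(responses) != 10 or any(not 1 <= a <= 5 for a in responses):
        raise ValueError("expected ten answers in the range 1..5")
    total = 0
    for index, answer in enumerate(responses):
        # Items 1, 3, 5, 7, 9 (even index) contribute answer - 1;
        # items 2, 4, 6, 8, 10 (odd index) contribute 5 - answer.
        total += (answer - 1) if index % 2 == 0 else (5 - answer)
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical reference and prediction, for illustration only.
    reference = "bin blue at f two now"
    predicted = "bin blue at s two now"
    print(f"WER = {wer(reference, predicted):.3f}")
    print(f"CER = {cer(reference, predicted):.3f}")
    # Hypothetical questionnaire answers from a single respondent.
    print(f"SUS = {sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])}")
```

Averaging `wer` and `cer` over all sentences of a speaker would give per-speaker averages of the kind compared in the summary, and averaging `sus_score` over all respondents would give an overall SUS score.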