Multilingual profanity detection API using deep learning

Bibliographic Details
Main Author: Lim, Dao Ern
Format: Final Year Project / Dissertation / Thesis
Published: 2022
Online Access: http://eprints.utar.edu.my/4703/1/fyp_CS_2022_LDE.pdf
http://eprints.utar.edu.my/4703/
Description
Summary: Profanity is offensive language that is deemed impolite and rude. On the Internet, it is often associated with abusive intent to cause psychological harm. One study reports a 70% increase in hate speech among teens since the COVID-19 lockdown began [6]. Automated profanity detection systems can be used to counter the increasing prevalence of profanity in digital spaces. However, most existing systems use a list-based approach, which is flawed: users can deliberately alter the spelling of profane words to bypass detection. Even those that use a machine learning approach are still far from perfect, as they cannot understand the context in which words are used. Therefore, this project proposes a deep learning model to detect profanity in text messages. The proposed model supports multiple languages, understands the context of words, and recognizes altered spellings of profanity. Moreover, the model will be deployed as a RESTful API with 2 endpoints so that it can be integrated into other applications and websites easily. Apart from the multilingual profanity detection model, this project also contributes the source code of the proposed system, which will be released as an open-source project so that anyone interested in profanity detection can view it freely.
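
The weakness of list-based detection described in the summary can be illustrated with a minimal sketch. This is not the project's code: the blocklist, the mild placeholder words, and the function name `contains_listed_word` are all assumptions made for illustration.

```python
import re

# Hypothetical blocklist; mild placeholder words stand in for real profanity.
BLOCKLIST = {"darn", "heck"}


def contains_listed_word(message: str) -> bool:
    """Return True if any run of letters in the message is in BLOCKLIST."""
    tokens = re.findall(r"[a-z]+", message.lower())
    return any(token in BLOCKLIST for token in tokens)


# The exact spelling is caught by the list lookup.
print(contains_listed_word("well darn it"))
# A digit substitution splits the word into "d" and "rn", so it slips through.
print(contains_listed_word("well d4rn it"))
# A stretched spelling is a different token, so it also slips through.
print(contains_listed_word("well daaarn it"))
```

Only the first message is flagged. Covering every possible respelling would require an ever-growing list, which is the motivation the summary gives for a model that recognizes altered spellings.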