Scalable Text Filtering System

Bibliographic Details
Main Authors: Foong, Oi Mean, Ahmad Izuddin Zainal Abidin, A., Yong, S.P.
Format: Conference or Workshop Item
Published: 2006
Online Access: http://eprints.utp.edu.my/2569/1/Scalable_Text_Filtering.pdf
http://eprints.utp.edu.my/2569/
Description
Summary: Advances in computing enable anyone to become an information producer, resulting in rapidly growing information on the Internet. One concern arising from this phenomenon is the easy access to offensive, vulgar or obscene pages by anyone with access to the Internet. One solution to this concern is filtering software. This paper presents a prototype called DocFilter that filters harmful content from text documents without human intervention. The prototype is designed to extract each word of the document, stem each word to its root, and compare each stemmed word against a list of harmful words stored in a hash set. Two system evaluations were conducted to ascertain the performance of the DocFilter system. Using various blocking levels, the prototype yields an average filtering score of 73.4%. The system is regarded as having produced effective filtering accuracy for offensive words in most English text documents.
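
The extract, stem and compare pipeline described in the summary could be sketched roughly as follows. This is an illustration only, not DocFilter's actual code: the word list, the crude suffix-stripping stemmer (a stand-in for whatever stemmer the paper uses), and the `max_hits` parameter (one possible reading of a "blocking level") are all assumptions made for the example.

```python
import re

# Hypothetical placeholder list; the paper's real harmful-word list is not shown here.
HARMFUL_STEMS = {"darn", "heck"}

# Assumed suffixes for a deliberately crude stemmer (not the paper's stemmer).
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_words(text):
    """Lower-case the text and pull out alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def flagged_words(text, harmful=HARMFUL_STEMS):
    """Return the words whose stem appears in the harmful set.

    Set membership is an average O(1) hash lookup, so the cost grows
    linearly with document length rather than with list size.
    """
    return [w for w in extract_words(text) if naive_stem(w) in harmful]


def should_block(text, max_hits=0):
    """Block when more than max_hits words are flagged (assumed policy)."""
    return len(flagged_words(text)) > max_hits


if __name__ == "__main__":
    sample = "He darned the Hecks, nicely."
    print(flagged_words(sample))
    print(should_block(sample))
```

Raising `max_hits` makes this sketch more permissive and lowering it makes it stricter, which is one simple way different blocking levels might be modelled.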