Text Extraction Algorithm for Web Text Classification