Data filtering framework for preserving meaningful data records from streams of unstructured weather data