Data pre-processing on web server logs for generalized association rules mining algorithm