Gene selection for high dimensional data using k-means clustering algorithm and statistical approach