Real-time human action recognition using stacked sparse autoencoders