Monocular viewpoint invariant human activity recognition