Efficient NASNetMobile-enhanced Vision Transformer for weakly supervised video anomaly detection