Phrase-based image caption generator with hierarchical LSTM network