Top-down natural language query approach for embodied conversational agent