A black-box approach for response quality evaluation of conversational agent systems

Bibliographic Details
Main Authors: Goh, Ong Sing; Ardil, C.; Wong, W.; Fung, C.C.
Format: Article
Language: English
Published: World Academy of Science, Engineering and Technology, 2007
Subjects:
Online Access: http://eprints.utem.edu.my/id/eprint/12389/1/A_Black-box_Approach_for_Response_Quality_evaluation_of_the_conversation_agent_system.pdf
http://eprints.utem.edu.my/id/eprint/12389/
http://researchrepository.murdoch.edu.au/991/
Description
Summary: The evaluation of conversational agents, or chatterbot question answering systems, is a major research area that needs much attention. Before the rise of domain-oriented conversational agents based on natural language understanding and reasoning, evaluation was never a problem, as information retrieval-based metrics were readily available. However, once chatterbots became more domain specific, evaluation became a real issue. This is especially true when understanding and reasoning are required to handle a wider variety of questions while still achieving high-quality responses. This paper discusses the inappropriateness of existing measures for response quality evaluation and calls for new standard measures, raising related considerations. As a short-term solution for evaluating the response quality of conversational agents, and to demonstrate the challenges in evaluating systems of different natures, this research proposes a black-box approach that uses observation, a classification scheme, and a scoring mechanism to assess and rank three example systems: AnswerBus, START, and AINI.