A Black-box Approach for Response Quality Evaluation of Conversational Agent Systems

The evaluation of conversational agents or chatterbots question answering systems is a major research area that needs much attention. Before the rise of domain-oriented conversational agents based on natural language understanding and reasoning, evaluation is never a problem as information retrieval-based metrics are readily available for use. However, when chatterbots began to become more domain specific, evaluation becomes a real issue. This is especially true when understanding and reasoning is required to cater for a wider variety of questions and at the same time to achieve high quality responses. This paper discusses the inappropriateness of the existing measures for response quality evaluation and the call for new standard measures and related considerations are brought forward. As a short-term solution for evaluating response quality of conversational agents, and to demonstrate the challenges in evaluating systems of different nature, this research proposes a blackbox approach using observation, classification scheme and a scoring mechanism to assess and rank three example systems, AnswerBus, START and AINI.

A Crisis Communication Network Based on Embodied Conversational Agents System with Mobile Services

In this paper, we proposed a new framework to incorporate an intelligent agent software robot into a crisis communication portal (CCNet) in order to send alert news to subscribed users via email and other mobile services such as Short Message Service (SMS), Multimedia Messaging Service (MMS) and General Packet Radio Services (GPRS). The content on the mobile services can be delivered either through mobile phone or Personal Digital Assistance (PDA). This research has shown that with our proposed framework, the embodied conversation agents system can handle questions intelligently with our multilayer architecture. At the same time, the extended framework can take care of delivery content through a more humanoid interface on mobile devices.