A Cognitive Robot Collaborative Reinforcement Learning Algorithm

A cognitive collaborative reinforcement learning (CCRL) algorithm that incorporates an advisor into the learning process is developed to improve supervised learning. An autonomous learner is given a self-awareness cognitive skill that lets it decide when to solicit instructions from the advisor. The learner can also assess the value of the advice and accept or reject it. The method is evaluated in simulation on a robotic motion planning task, with advisors whose skill levels range from expert to novice. Both the CCRL algorithm and a combined method that integrates its logic with Clouse's Introspection Approach outperformed a baseline fully autonomous learner and performed robustly across advisor skill levels, learning to accept advice from an expert while rejecting that of less skilled collaborators. Although the CCRL algorithm is based on reinforcement learning (RL), it fits other machine learning methods as well, since the advisor's actions are added only at the outer layer.
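
To illustrate the idea of keeping advice in an outer layer around an unchanged learner, the following is a minimal sketch only, not the algorithm evaluated in this work. The abstract does not state the self-awareness criterion or the rule for valuing advice, so `needs_advice` (ask when the learner's action values are nearly tied) and `accept_advice` (reject advice the learner already rates clearly worse than its best action) are hypothetical stand-ins, as are all names, thresholds, and the two-action setting. The inner learner is ordinary tabular Q-learning.

```python
import random
from collections import defaultdict

ACTIONS = ("left", "right")  # hypothetical two-action task


def needs_advice(q, state, gap_threshold):
    # Assumed self-awareness test: ask when the two best values are nearly tied.
    values = sorted(q[(state, a)] for a in ACTIONS)
    return values[-1] - values[-2] < gap_threshold


def accept_advice(q, state, advised, tolerance):
    # Assumed valuation rule: reject advice rated clearly below the best action.
    best = max(q[(state, a)] for a in ACTIONS)
    return q[(state, advised)] >= best - tolerance


def select_action(q, state, advisor, rng, epsilon=0.1,
                  gap_threshold=0.05, tolerance=0.1):
    # Outer layer: advice only influences which action is taken.
    if needs_advice(q, state, gap_threshold):
        advised = advisor(state)
        if accept_advice(q, state, advised, tolerance):
            return advised, "advice"
    # Otherwise fall back to epsilon-greedy autonomous selection.
    if rng.random() < epsilon:
        return rng.choice(ACTIONS), "explore"
    return max(ACTIONS, key=lambda a: q[(state, a)]), "greedy"


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    # Inner layer: standard Q-learning update, unaware of the advisor.
    target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)
rng = random.Random(0)
advisor = lambda state: "right"  # hypothetical advisor that always says "right"
first_choice = select_action(q, 0, advisor, rng)
```

Because the advisor appears only in `select_action`, the update rule in `q_update` could in principle be replaced by another learning method without changing the advice logic, which is the property the last sentence of the abstract describes.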




References:
[1] C. J. C. H. Watkins, "Learning from Delayed Rewards," Ph.D.
dissertation, Psychology Dept., Cambridge University, 1989.
[2] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction,
Cambridge, MA: MIT Press, 1998.
[3] T. G. Dietterich, "Hierarchical reinforcement learning with the MAXQ
value function decomposition," Journal of Artificial Intelligence
Research, 1999, vol. 13, pp. 227-303.
[4] V. N. Papudesi and M. Huber, "Learning from reinforcement and advice
using composite reward functions," in Proc. 16th Int. FLAIRS Conf., pp.
361-365, St. Augustine, FL, 2003.
[5] L. Mihalkova and R. Mooney, "Using active relocation to aid
reinforcement," in Proc. 19th Int. FLAIRS Conf., Florida, 2006.
[6] U. Kartoun, H. Stern, and Y. Edan, "Human-robot collaborative learning
system for inspection," IEEE Int. Conf. on Systems, Man, and
Cybernetics, pp. 4249-4255, Taipei, Taiwan, 2006.
[7] V. U. Cetina, "Supervised Reinforcement Learning Using Behavior
Models," IEEE Computer Society 6th Int. Conf. on Machine Learning and
Applications, Cincinnati, Ohio, USA, 2007.
[8] C. Breazeal and A. Thomaz, "Learning from Human Teachers with
Socially Guided Exploration," IEEE Int. Conf. on Robotics and
Automation, Pasadena, CA, USA, 2008.
[9] J. A. Clouse, "An introspection approach to querying a trainer," technical
report 96-13, University of Massachusetts, Amherst, MA, 1996.
[10] M. A. Goodrich, R. D. R. Olsen, J. W. Crandall and T. J. Palmer,
"Experiments in adjustable autonomy," in Proceedings of the IJCAI
Workshop on Autonomy, Delegation and Control: Interacting with
Intelligent Agents, 2001, pp. 1624-1629.