Acquiring Contour Following Behaviour in Robotics through Q-Learning and Image-based States

In this work a visual and reactive contour following behaviour is learned by reinforcement. With artificial vision the environment is perceived in 3D, and it is possible to avoid obstacles that are invisible to other sensors that are more common in mobile robotics. Reinforcement learning reduces the need for intervention in behaviour design, and simplifies its adjustment to the environment, the robot and the task. In order to facilitate its generalisation to other behaviours and to reduce the role of the designer, we propose a regular image-based codification of states. Even though this is much more difficult, our implementation converges and is robust. Results are presented with a Pioneer 2 AT on a Gazebo 3D simulator.

Route Training in Mobile Robotics through System Identification

Fundamental sensor-motor couplings form the backbone of most mobile robot control tasks, and often need to be implemented fast, efficiently and nevertheless reliably. Machine learning techniques are therefore often used to obtain the desired sensor-motor competences. In this paper we present an alternative to established machine learning methods such as artificial neural networks, that is very fast, easy to implement, and has the distinct advantage that it generates transparent, analysable sensor-motor couplings: system identification through nonlinear polynomial mapping. This work, which is part of the RobotMODIC project at the universities of Essex and Sheffield, aims to develop a theoretical understanding of the interaction between the robot and its environment. One of the purposes of this research is to enable the principled design of robot control programs. As a first step towards this aim we model the behaviour of the robot, as this emerges from its interaction with the environment, with the NARMAX modelling method (Nonlinear, Auto-Regressive, Moving Average models with eXogenous inputs). This method produces explicit polynomial functions that can be subsequently analysed using established mathematical methods. In this paper we demonstrate the fidelity of the obtained NARMAX models in the challenging task of robot route learning; we present a set of experiments in which a Magellan Pro mobile robot was taught to follow four different routes, always using the same mechanism to obtain the required control law.