On the analysis of differential hebbian learning in closed-loop behavioral systems
Tomas Kulvicius, Christoph Kolodziejski, Minija Tamosiunaite, Bernd Porr, and Florentin Wörgötter (2009)
In: Frontiers in Computational Neuroscience. Conference Abstract: Bernstein Conference on Computational Neuroscience. Frontiers.
Behaving systems form a closed loop with their environment. If the environment is not too complex, one can describe (linear) systems of this kind in the closed-loop case, too, by methods from systems theory. Things become much more complicated as soon as one allows the controller to change, for example by learning. Several studies try to analyze closed-loop systems from an information point of view (Prokopenko et al., 2006; Klyubin et al., 2008); however, only a few attempts consider learning (Lungarella and Sporns, 2006; Porr et al., 2006). In this study we focus on the following two questions. 1) To what degree is it possible to describe the temporal development of closed-loop adaptive systems using only knowledge about their initial configuration, their learning mechanism, and the structure of the world? 2) Given a certain complexity of the world, can we predict which system from a given class would be the best?
We focus on systems that perform differential Hebbian learning and simulate agents that learn an obstacle avoidance task. In the first part of our study we provide an analytical solution for the temporal development of such systems. In the second part we define energy and entropy measures. We analyze how these system measures develop during learning by testing different robots in environments of different complexity.
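As an illustration of the kind of learning rule referred to above, the following is a minimal sketch of a generic differential Hebbian update, in which a plastic weight changes in proportion to the product of an input and the time derivative of the output. The abstract does not give the exact rule, signals, or parameters used in the study, so the function names (`decaying_trace`, `differential_hebbian`), the exponential input traces, and the learning rate `mu` are all assumptions made for this example only.

```python
import numpy as np


def decaying_trace(onset, length, decay=0.9):
    """Hypothetical sensor signal: zero before `onset`, then an exponential decay from 1."""
    t = np.arange(length)
    return np.where(t >= onset, decay ** np.clip(t - onset, 0, None), 0.0)


def differential_hebbian(u_pred, u_reflex, mu=0.01, w_reflex=1.0):
    """Trajectory of the plastic weight under an assumed rule dw = mu * u_pred * dv.

    v is the output: a fixed-weight reflex input plus the plastic predictive input.
    dv is approximated by the difference between consecutive output samples.
    """
    w = 0.0
    v_prev = 0.0
    weights = []
    for x_pred, x_reflex in zip(u_pred, u_reflex):
        v = w_reflex * x_reflex + w * x_pred
        dv = v - v_prev              # discrete-time derivative of the output
        w += mu * x_pred * dv        # differential Hebbian weight change
        v_prev = v
        weights.append(w)
    return np.array(weights)


# Predictive signal (e.g. a range sensor) precedes the reflex signal (e.g. a bump).
early = decaying_trace(onset=10, length=100)
late = decaying_trace(onset=20, length=100)
w_causal = differential_hebbian(u_pred=early, u_reflex=late)
w_reversed = differential_hebbian(u_pred=late, u_reflex=early)
```

In this sketch the weight grows when the predictive input precedes the reflex input and shrinks when the order is reversed, which is the temporal-order sensitivity that distinguishes differential Hebbian learning from plain Hebbian learning.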
In answer to the questions above we find (1) that these systems have a specific sensor-motor configuration and this leads to a biphasic weight development. It was possible, by using the measured temporal characteristics of the robot’s behavior together with some assumptions on the amplitude change of sensory inputs, to quite accurately calculate such a weight development in an analytical way. (2) Using our system measures we also show that learning equalizes the energy uptake across agents and worlds. However, when judging learning speed and complexity of the resulting behavior one finds a trade-off and some agents will be better than others in the different worlds tested.
Our study suggests that analytical solutions for the temporal development of our robots can still be found, but only together with some information on the general structure of the development of their descriptive parameters. By using energy and entropy measures and investigating their development during learning, we have shown that within well-specified scenarios there are indeed agents which are optimal with respect to their structure and adaptive properties. As a consequence, this study may help lead to a better understanding of the complex dynamics of learning and behaving systems.