Reinforcement learning regimes have been shown to be capable of learning animat behaviours such as 'obstacle avoidance' and 'wall following'. Such behaviours can usually be learned more quickly with ordinary supervised methods, because the learner then receives more direct feedback. However, 'conditional approach' behaviour (move in on small objects but stand clear of large ones) seems to be hard to learn even with neural network methods such as backpropagation. The paper presents the results of a study that investigated this behaviour, and shows how its 'hardness' can be accounted for in statistical terms.