Hilary Buxton, Shaogang Gong
We review possible vision architectures and argue that the Bayesian network formalism, with its belief update mechanism, provides the effective knowledge representation and efficient reasoning required in advanced visual surveillance systems. Here we not only track moving objects but also interpret their patterns of behaviour, which makes solving the information integration problem even more important for robust performance on real video data. Conceptual knowledge of both the scene and the visual task must be built into the networks at the appropriate levels to provide prior constraints, and the system must be controlled using dynamic attention and selective processing. Bayesian belief network techniques allow us to encode both this prior knowledge and task-based control, and to model the dynamic dependencies between the parameters involved in the visual interpretation. Any start-up phase in dynamic vision is then only weakly constrained and delivers the most likely initial interpretation of the visual evidence; later, the interpretation adapts to further incoming evidence under the current expectations. Thus both data-driven and hypothesis-driven processing are combined in the parameter network. We illustrate these arguments with experimental results from traffic surveillance applications and propose a purposive approach to the design and integration of such vision systems.
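The combination of data-driven and hypothesis-driven processing described above can be illustrated with a minimal discrete Bayesian belief update: a prior over behaviour hypotheses (the current expectation) is multiplied by the likelihood of new visual evidence and renormalised. This is only an illustrative sketch, not the paper's actual parameter network; the behaviour labels and probability values below are invented for the example.

```python
def belief_update(prior, likelihood):
    """Return the normalised posterior P(h | e) proportional to P(e | h) P(h)."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Weakly constrained start-up: a near-uniform prior over behaviour
# hypotheses for a tracked vehicle (hypothetical labels and numbers).
prior = {"overtaking": 0.34, "following": 0.33, "turning": 0.33}

# Likelihood of the observed motion evidence under each hypothesis,
# e.g. derived from lateral displacement and relative speed (illustrative).
likelihood = {"overtaking": 0.7, "following": 0.2, "turning": 0.1}

posterior = belief_update(prior, likelihood)
# The posterior becomes the prior (the current expectation) when the
# next frame's evidence arrives, so interpretation adapts over time.
```

Iterating this update frame by frame is the sense in which a weakly constrained initial interpretation is progressively refined under the current expectations.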