
Scale-Space Theory in Computer Vision

Tony Lindeberg

KTH Royal Institute of Technology
Stockholm, Sweden

Abstract

The presentation starts with a philosophical discussion about computer vision in general. The aim is to put the scope of the book into its wider context, and to emphasize why the notion of scale is crucial when dealing with measured signals, such as image data. An overview of different approaches to multi-scale representation is presented, and a number of special properties of scale-space are pointed out.

Then, it is shown how a mathematical theory can be formulated for describing image structures at different scales. By starting from a set of axioms imposed on the first stages of processing, it is possible to derive a set of canonical operators, which turn out to be derivatives of Gaussian kernels at different scales.
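
As a rough illustration of what "derivatives of Gaussian kernels at different scales" means in practice, the following sketch applies a sampled first-order Gaussian derivative to a one-dimensional step edge at three scales. It is a minimal example under stated assumptions, not the formulation used in the book: the scale parameter t is taken to be the variance of the Gaussian, the kernel is truncated at four standard deviations, and the names gaussian_kernel and smooth are hypothetical helpers introduced here.

```python
import math

def gaussian_kernel(t, order=0):
    """Sample a Gaussian (order 0) or its first/second derivative at scale t.

    Assumption: t is the variance (t = sigma**2); the kernel is cut off
    at four standard deviations.
    """
    radius = max(1, math.ceil(4.0 * math.sqrt(t)))
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        if order == 0:
            kernel.append(g)
        elif order == 1:
            kernel.append(-x / t * g)                 # d/dx of the Gaussian
        elif order == 2:
            kernel.append((x * x - t) / (t * t) * g)  # d2/dx2 of the Gaussian
        else:
            raise ValueError("only orders 0, 1 and 2 are sketched here")
    return kernel

def smooth(signal, kernel):
    """Convolve with a centred odd-length kernel, replicating border samples."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - (k - radius)          # convolution index: signal[i - x]
            j = min(max(j, 0), n - 1)     # clamp to the signal borders
            acc += w * signal[j]
        out.append(acc)
    return out

# A step edge observed through first-order derivatives at three scales.
step = [0.0] * 32 + [1.0] * 32
edge_strength = {}
for t in (1.0, 4.0, 16.0):
    response = smooth(step, gaussian_kernel(t, order=1))
    edge_strength[t] = max(response)
    print(f"t = {t:5.1f}  peak response = {edge_strength[t]:.4f}")
```

The derivative response stays centred on the edge while its peak amplitude drops as the scale increases, which is one reason the scale at which a feature is measured has to be stated alongside the measurement.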

The problem of applying this theory computationally is extensively treated. A scale-space theory is formulated for discrete signals, and it is demonstrated how this representation can be used as a basis for expressing a large number of visual operations. Examples are smoothed derivatives in general, as well as different types of detectors for image features, such as edges, blobs, and junctions. In fact, the resulting scheme for feature detection induced by the presented theory is very simple, both conceptually and in terms of practical implementations.
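
The abstract does not spell out the discrete formulation. One well-known candidate, stated here as an assumption, is the discrete analogue of the Gaussian kernel, T(n; t) = exp(-t) I_n(t), where I_n denotes the modified Bessel functions of integer order. The sketch below evaluates this kernel with a plain power series (adequate for moderate t only) and checks numerically that it sums to one and has variance t. The helper names bessel_i and discrete_gaussian are hypothetical.

```python
import math

def bessel_i(n, t, terms=80):
    """Modified Bessel function I_n(t) of integer order, from its power series.

    Term k of the series is (t/2)**(2k+n) / (k! (k+n)!); each term is
    obtained from the previous one to avoid large factorials.
    """
    n = abs(n)                      # I_{-n}(t) = I_n(t) for integer n
    half = t / 2.0
    term = half ** n / math.factorial(n)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + n))
        total += term
    return total

def discrete_gaussian(t, radius):
    """Return T(n; t) = exp(-t) * I_n(t) for n = -radius, ..., radius."""
    return [math.exp(-t) * bessel_i(n, t) for n in range(-radius, radius + 1)]

# Assumed example values: scale t = 4 and a generous truncation radius.
t = 4.0
radius = 16
kernel = discrete_gaussian(t, radius)
mass = sum(kernel)
variance = sum(n * n * w for n, w in zip(range(-radius, radius + 1), kernel))
print(f"sum = {mass:.6f}, variance = {variance:.6f}")
```

Unlike a sampled and truncated continuous Gaussian, this family obeys an exact semi-group property on the integer grid: convolving the kernels for scales s and t gives the kernel for scale s + t, up to the truncation of the tails.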

Typically, an object contains structures at many different scales, but locally it is not unusual that some of these "stand out" and seem to be more significant than others. A problem that we give special attention to concerns how to find such locally stable scales, or rather how to generate hypotheses about interesting structures for further processing. It is shown how the scale-space theory, based on a representation called the scale-space primal sketch, allows us to extract regions of interest from an image without prior information about what the image can be expected to contain. Such regions, combined with knowledge about the scales at which they occur, constitute qualitative information, which can be used for guiding and simplifying other low-level processes.

Experiments on different types of real and synthetic images demonstrate how the suggested approach can be used for different visual tasks, such as image segmentation, edge detection, junction detection, and focus-of-attention. This work is complemented by a mathematical treatment showing how the behaviour of different types of image structures in scale-space can be analysed theoretically.

It is also demonstrated how the suggested scale-space framework can be used for computing direct cues to three-dimensional surface structure, using in principle only the same types of visual front-end operations that underlie the computation of image features.

Although the treatment is concerned with the analysis of visual data, the notion of scale-space representation is far more general and arises in several contexts where measured data are to be analysed and interpreted automatically.

Responsible for this page: Tony Lindeberg