Scale-space theory: A basic tool for analysing structures at different scales

Tony Lindeberg

Journal of Applied Statistics, 21(2), pp. 224--270, 1994. (Supplement on Advances in Applied Statistics: Statistics and Images: 2).

Also available as technical report ISRN KTH/NA/P--93/07--SE.

Abstract

An inherent property of objects in the world is that they only exist as meaningful entities over certain ranges of scale. If one aims at describing the structure of unknown real-world signals, then a multi-scale representation of data is of crucial importance.

This article gives a tutorial review of a special type of multi-scale representation, linear scale-space representation, which has been developed by the computer vision community in order to handle image structures at different scales in a consistent manner. The basic idea is to embed the original signal into a one-parameter family of gradually smoothed signals, in which the fine scale details are successively suppressed.
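As an illustration of this embedding, the following is a minimal sketch for a one-dimensional discrete signal, not the paper's own implementation. It assumes a sampled, truncated Gaussian kernel with variance t as the scale parameter, and border replication outside the signal. The function names (gaussian_kernel, smooth, scale_space, count_local_extrema) and the test signal are hypothetical and chosen only for this example.

```python
import math


def gaussian_kernel(t, truncate=4.0):
    """Sampled 1-D Gaussian of variance t, truncated and renormalised to sum to 1."""
    radius = max(1, math.ceil(truncate * math.sqrt(t)))
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, t):
    """Smooth a list of samples at scale t; t = 0 returns the original signal."""
    if t <= 0:
        return list(signal)
    kernel = gaussian_kernel(t)
    radius = len(kernel) // 2
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Assumption: replicate the border samples outside the signal.
            idx = min(max(i + j - radius, 0), last)
            acc += w * signal[idx]
        smoothed.append(acc)
    return smoothed


def scale_space(signal, scales):
    """One-parameter family of smoothed signals, indexed by the scale t."""
    return {t: smooth(signal, t) for t in scales}


def count_local_extrema(values):
    """Count interior samples where the discrete slope changes sign."""
    return sum(
        1
        for a, b, c in zip(values, values[1:], values[2:])
        if (b - a) * (c - b) < 0
    )


# Hypothetical test signal: a slow oscillation plus fine-scale detail.
signal = [
    math.sin(2 * math.pi * n / 64) + 0.3 * math.sin(2 * math.pi * n / 5)
    for n in range(128)
]
family = scale_space(signal, scales=[0, 1, 4, 16, 64])
for t, smoothed in family.items():
    print(t, count_local_extrema(smoothed))
```

Printing the number of local extrema per scale gives a rough view of how fine-scale detail is suppressed as t grows. A sampled and truncated kernel only approximates the continuous theory, so the sketch should be read as illustrative.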

Under rather general conditions on the type of computations that are to be performed at the first stages of visual processing, in what can be termed the visual front end, it can be shown that the Gaussian kernel and its derivatives are singled out as the only possible smoothing kernels. The conditions that specify the Gaussian kernel are, basically, linearity and shift-invariance combined with different ways of formalizing the notion that structures at coarse scales should correspond to simplifications of corresponding structures at fine scales; that is, they should not be accidental phenomena created by the smoothing method. Notably, several different ways of choosing scale-space axioms give rise to the same conclusion.
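For concreteness, the linear scale-space representation can be written as follows. The symbols f (original signal), L (scale-space representation), g (Gaussian kernel), N (dimension) and t (scale parameter, equal to the variance of the kernel) are notation assumed here for illustration and do not appear in the abstract itself.

```latex
\begin{align}
  L(x;\, t) &= \int_{\xi \in \mathbb{R}^N} f(x - \xi)\, g(\xi;\, t)\, d\xi,
  \qquad L(x;\, 0) = f(x), \\
  g(x;\, t) &= \frac{1}{(2 \pi t)^{N/2}}\, e^{-x^T x / (2t)}, \\
  \partial_t L &= \tfrac{1}{2} \nabla^2 L, \\
  g(\cdot;\, t_1) * g(\cdot;\, t_2) &= g(\cdot;\, t_1 + t_2).
\end{align}
```

The third line states the equivalent formulation as the solution of the diffusion equation with f as initial condition. The last line is the semi-group property: smoothing at scale t_1 followed by smoothing at scale t_2 equals a single smoothing at scale t_1 + t_2.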

The output from the scale-space representation can be used for a variety of early visual tasks; operations like feature detection, feature classification and shape computation can be expressed directly in terms of (possibly non-linear) combinations of Gaussian derivatives at multiple scales. In this sense, the scale-space representation can serve as a basis for early vision.
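A minimal one-dimensional sketch of such a feature detector is given below, assuming a sampled first-order Gaussian derivative kernel and border replication. It locates the strongest edge response in a step signal at a few scales. The helper names (gaussian_derivative_kernel, derivative_at_scale, strongest_edge) and the step signal are hypothetical, introduced only for this example.

```python
import math


def gaussian_derivative_kernel(t, truncate=4.0):
    """Sampled first derivative of a Gaussian with variance t."""
    radius = max(1, math.ceil(truncate * math.sqrt(t)))
    offsets = range(-radius, radius + 1)
    gauss = [math.exp(-(k * k) / (2.0 * t)) for k in offsets]
    total = sum(gauss)
    # d/dx g(x; t) = -(x / t) * g(x; t), with g normalised to unit sum.
    return [-(k / t) * g / total for k, g in zip(offsets, gauss)]


def derivative_at_scale(signal, t):
    """Convolve the signal with the Gaussian derivative kernel at scale t."""
    kernel = gaussian_derivative_kernel(t)
    radius = len(kernel) // 2
    last = len(signal) - 1
    response = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = j - radius
            # Convolution index i - k; assumption: replicate border samples.
            idx = min(max(i - k, 0), last)
            acc += w * signal[idx]
        response.append(acc)
    return response


def strongest_edge(signal, t):
    """Index of the largest absolute first-derivative response at scale t."""
    response = derivative_at_scale(signal, t)
    return max(range(len(response)), key=lambda i: abs(response[i]))


# Hypothetical test signal: a unit step between samples 19 and 20.
step = [0.0] * 20 + [1.0] * 20
for t in (1.0, 4.0, 16.0):
    print(t, strongest_edge(step, t))
```

In two dimensions the same idea extends to quantities such as the gradient magnitude or the Laplacian, computed from partial Gaussian derivatives at each scale.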

During the last few decades a number of other approaches to multi-scale representations have been developed, which are more or less related to scale-space theory, notably the theories of pyramids, wavelets and multi-grid methods. Despite their qualitative differences, the increasing popularity of each of these approaches indicates that the crucial notion of scale is increasingly appreciated by the computer vision community and by researchers in other related fields.

An interesting similarity with biological vision is that the scale-space operators closely resemble receptive field profiles registered in neurophysiological studies of the mammalian retina and visual cortex.

