Scale-space representation: Definition and basic ideas
Scale-space theory is a framework for early visual operations,
which has been developed by the computer vision community
(in particular by Witkin [], Koenderink [],
Yuille and Poggio [], Lindeberg []
and Florack [])
to handle the above-mentioned multi-scale nature of image data.
A main argument behind its construction is that if no prior
information is available about which scales are appropriate
for a given data set,
then the only reasonable approach for an uncommitted vision
system is to represent the input data at multiple scales.
This means that the original signal should be embedded into
a one-parameter family of derived signals,
in which fine-scale structures are successively suppressed
(see figure 1).
How should such an idea be carried out in practice?
A crucial requirement is that structures at coarse scales
in the multi-scale representation should constitute
simplifications of corresponding structures at finer scales; they
should not be accidental phenomena created by the method
for suppressing fine-scale structures.
This idea has been formalized in a variety of ways
by different authors.
A noteworthy coincidence is that similar conclusions
can be obtained from several different starting points.
A main result is that if rather general conditions
are imposed on the types of computations that are to be performed,
then convolution by the Gaussian kernel and its derivatives is
singled out as a canonical class of smoothing transformations.
The requirements (scale-space axioms) that lead to this uniqueness
are essentially linearity and spatial shift invariance,
combined with different ways of formalizing the notion
that new structures should not be created in the transformation
from fine to coarse scales.
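
The first two requirements, linearity and shift invariance, can be checked numerically. The following is a minimal sketch rather than part of the original formulation. It assumes a sampled, truncated Gaussian kernel and periodic (circular) boundary handling, and the names `gaussian_kernel`, `smooth_periodic` and `shift`, as well as the two test signals, are hypothetical choices made for illustration. It does not test the third requirement (non-creation of new structure).

```python
import math

def gaussian_kernel(t, radius):
    """Sampled 1-D Gaussian with variance t, normalized to sum to 1."""
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_periodic(signal, t):
    """Circular convolution of `signal` with a Gaussian of variance t."""
    n = len(signal)
    radius = int(math.ceil(4.0 * math.sqrt(t)))  # truncate at 4 standard deviations
    kernel = gaussian_kernel(t, radius)
    return [
        sum(kernel[k + radius] * signal[(i - k) % n] for k in range(-radius, radius + 1))
        for i in range(n)
    ]

def shift(signal, s):
    """Circularly shift a signal by s samples."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]

# Two arbitrary (hypothetical) test signals and one scale value.
f = [math.sin(0.3 * i) + (1.0 if i % 7 == 0 else 0.0) for i in range(64)]
h = [math.cos(0.11 * i * i) for i in range(64)]
t = 4.0

# Linearity: smoothing a weighted sum equals the weighted sum of the smoothings.
lhs = smooth_periodic([2.0 * a - 3.0 * b for a, b in zip(f, h)], t)
rhs = [2.0 * a - 3.0 * b for a, b in zip(smooth_periodic(f, t), smooth_periodic(h, t))]
linearity_error = max(abs(a - b) for a, b in zip(lhs, rhs))

# Shift invariance: smoothing a shifted signal equals shifting the smoothed signal.
shift_error = max(
    abs(a - b)
    for a, b in zip(smooth_periodic(shift(f, 5), t), shift(smooth_periodic(f, t), 5))
)

print("linearity error:", linearity_error)
print("shift-invariance error:", shift_error)
```

Both errors should be at the level of floating-point round-off. Periodic boundaries are used here only because they make shift invariance hold exactly on a finite signal.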
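In summary,
for any N-dimensional signal $f : \mathbb{R}^N \to \mathbb{R}$,
its scale-space representation $L : \mathbb{R}^N \times \mathbb{R}_+ \to \mathbb{R}$
is defined by

$$L(x;\, t) = \int_{\xi \in \mathbb{R}^N} f(x - \xi)\, g(\xi;\, t)\, d\xi,$$

where $g$ denotes the Gaussian kernel

$$g(x;\, t) = \frac{1}{(2 \pi t)^{N/2}}\, e^{-(x_1^2 + \cdots + x_N^2)/(2t)},$$

and the variance $t$ of this kernel is referred to as the scale parameter.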
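
For a sampled one-dimensional signal, the construction (convolution of the signal with a Gaussian kernel whose variance t plays the role of the scale parameter) can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the Gaussian is sampled and truncated at four standard deviations, border samples are replicated, and the function names and the test signal are hypothetical.

```python
import math

def gaussian_kernel(t):
    """Sampled Gaussian with variance t, truncated at 4 standard deviations."""
    radius = max(1, int(math.ceil(4.0 * math.sqrt(t))))
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)  # renormalize so truncation does not change the mean level
    return [w / total for w in weights]

def scale_space_slice(signal, t):
    """Approximate the smoothed signal at scale t for a sampled 1-D signal."""
    if t <= 0.0:
        return list(signal)  # at scale zero the representation is the signal itself
    kernel = gaussian_kernel(t)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i - k, 0), n - 1)  # replicate border samples (an assumption)
            acc += kernel[k + radius] * signal[j]
        out.append(acc)
    return out

def scale_space(signal, scales):
    """One-parameter family of smoothed signals, indexed by the scale t."""
    return {t: scale_space_slice(signal, t) for t in scales}

def total_variation(signal):
    """Sum of absolute differences between neighbouring samples."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))

# Hypothetical example: a step edge with a superimposed fine-scale oscillation.
f = [(1.0 if i >= 32 else 0.0) + 0.2 * (-1) ** i for i in range(64)]
L = scale_space(f, [0.0, 1.0, 4.0, 16.0])

for t in sorted(L):
    print(f"t = {t:5.1f}: total variation = {total_variation(L[t]):.3f}")
```

The total variation is printed only as a rough indicator of how much fine-scale oscillation remains at each scale. A sampled Gaussian is a common approximation but is not the only way to discretize the smoothing operation.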
Figure 2(a) shows the result of applying Gaussian smoothing to a one-dimensional signal in this way. Notice how this successive smoothing captures the intuitive notion of fine-scale information being suppressed, and the signals becoming successively smoother.

Figure 3 gives a corresponding example for a two-dimensional image. Here, to emphasize the local variations in the grey-level landscape, local minima in the grey-level images at each scale have been indicated by dark blobs. These are grey-level blobs whose spatial extent is determined from a certain watershed analogy, which essentially describes how large a region associated with a local minimum can be filled with water without the water flooding over to regions associated with other local minima. As can be seen, mainly small blobs due to noise and texture are detected at fine scales. After a small amount of smoothing, the buttons on the keyboard manifest themselves as distinct minima, whereas at even coarser scales they merge into one unit (the keyboard). Other dominant dark image structures (such as the calculator, the cord and the receiver) also appear as single blobs at coarser scales.

This example gives one illustration of the types of hierarchical shape decompositions that can be obtained by varying the scale parameter in the scale-space representation. The relations between image structures at different scales that are induced in this way are referred to as deep structure [, ].
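
A one-dimensional analogue of this behaviour can be sketched by counting local minima at several scales. The code below is an illustrative simplification, not the grey-level blob detection used for the figure: it assumes a sampled, truncated Gaussian with replicated borders, detects only strict interior minima of a 1-D signal, and uses a hypothetical test signal made of a slow and a fast oscillation.

```python
import math

def smooth(signal, t):
    """Gaussian smoothing with variance t (sampled kernel, replicated borders)."""
    radius = max(1, int(math.ceil(4.0 * math.sqrt(t))))
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    n = len(signal)
    return [
        sum(w * signal[min(max(i - k, 0), n - 1)]
            for k, w in zip(range(-radius, radius + 1), weights)) / total
        for i in range(n)
    ]

def local_minima(signal):
    """Indices of strict interior local minima."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] > signal[i] < signal[i + 1]]

# Hypothetical test signal: a slow oscillation (coarse structure, period 64)
# plus a faster, weaker one (fine structure, period 8).
f = [math.sin(2 * math.pi * i / 64) + 0.5 * math.sin(2 * math.pi * i / 8)
     for i in range(256)]

scales = (0.25, 1.0, 4.0, 16.0, 64.0)
minima_per_scale = {t: local_minima(smooth(f, t)) for t in scales}

for t in scales:
    print(f"t = {t:5.2f}: {len(minima_per_scale[t])} local minima")
```

At the finest scale the minima follow the fast oscillation, while at the coarsest scale only those of the slow oscillation remain. Linking the minima between neighbouring scales, which this sketch does not attempt, would be a first step towards the hierarchical relations described above.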
Tony Lindeberg Tue Jul 1 14:57:47 MET DST 1997