Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features

Ivan Laptev and Tony Lindeberg

Technical report CVAP245, ISRN KTH NA/P--00/12--SE. Department of Numerical Analysis and Computer Science, KTH (Royal Institute of Technology), S-100 44 Stockholm, Sweden, September 2000.

Shortened version in IEEE Workshop on Scale-Space and Morphology, Vancouver, Canada, July 2001, M. Kerckhove (Ed.), Volume 2106 of Springer Verlag Lecture Notes in Computer Science, pages 63--74.

Extended version of the underlying theory in International Journal of Computer Vision, vol. 52, number 2/3, pages 97--120, 2003.


This paper explores the use of hierarchical object representations in terms of multi-scale image features for simultaneous tracking and recognition of objects. Specifically, we consider an application to hand gesture analysis, where hand models are tracked over multiple postures (states). We propose a scale-invariant dissimilarity measure for comparing scale-space features. Based on this measure, we evaluate the likelihood of hierarchical, parameterized models containing different types of image features at multiple scales. The likelihood is constructed in such a way that its maximization over different models and their parameters allows for both model selection and parameter estimation. These ideas are integrated with the framework of particle filtering, which performs tracking and recognition simultaneously, and a coarse-to-fine evaluation strategy improves computational efficiency. Based on the proposed approach, an application called DrawBoard is developed, in which the user controls a drawing device with a set of qualitative hand states and quantitative hand motions.
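As a rough illustration of how particle filtering can combine model selection (a discrete state) with parameter estimation (a continuous parameter), the following minimal sketch may help. It is not the method of the report: the two states, the one-dimensional position, the assumed per-state "size", the Gaussian toy likelihood, and all function and parameter names are hypothetical choices made only for this example.

```python
import math
import random

# Assumed sizes predicted by each discrete state (illustration only).
STATE_SIZE = {0: 1.0, 1: 3.0}


def toy_likelihood(state, x, observation):
    """Hypothetical Gaussian likelihood of an observed (position, size) pair."""
    obs_pos, obs_size = observation
    d2 = (obs_pos - x) ** 2 + (obs_size - STATE_SIZE[state]) ** 2
    return math.exp(-0.5 * d2)


def particle_filter_step(particles, observation, rng,
                         switch_prob=0.1, motion_std=0.2):
    """One predict / weight / resample step over (state, x) particles."""
    # Predict: occasionally switch the discrete state, diffuse the position.
    predicted = []
    for state, x in particles:
        if rng.random() < switch_prob:
            state = 1 - state
        predicted.append((state, x + rng.gauss(0.0, motion_std)))

    # Weight each hypothesis by its likelihood under the observation.
    weights = [toy_likelihood(s, x, observation) for s, x in predicted]
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)  # fall back to uniform weights

    # Resample with replacement, proportionally to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Most frequent state, and the mean position of particles in that state."""
    counts = {}
    for state, _ in particles:
        counts[state] = counts.get(state, 0) + 1
    best_state = max(counts, key=counts.get)
    xs = [x for state, x in particles if state == best_state]
    return best_state, sum(xs) / len(xs)


rng = random.Random(0)
particles = [(rng.randrange(2), rng.uniform(0.0, 4.0)) for _ in range(500)]
for _ in range(10):
    particles = particle_filter_step(particles, (2.0, 3.0), rng)
best_state, mean_x = estimate(particles)
```

In this sketch the surviving particles vote on the discrete state while their positions give the continuous estimate, so recognition and tracking come out of the same sample set.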

Related publications: (Hand gesture recognition using multi-scale colour features, hierarchical models and particle filtering) (A prototype system for computer vision based human computer interaction) (A direct approach based on feature likelihood maps) (Qualitative multi-scale feature hierarchies for object tracking) (Edge detection and ridge detection with automatic scale selection) (Monograph on scale-space theory)

Related project: Computer vision based human-computer interaction
