Local Descriptors for Spatio-Temporal Recognition

Ivan Laptev and Tony Lindeberg

ECCV'04 Workshop on Spatial Coherence for Visual Motion Analysis, Prague, Czech Republic, May 2004.
Springer Lecture Notes in Computer Science, volume 3667, pp. 91-103.


This paper presents and investigates a set of local space-time descriptors for representing and recognizing motion patterns in video. Following the idea of local features in the spatial domain, we use the notion of space-time interest points and represent video data in terms of local space-time events. To describe such events, we define several types of image descriptors over local spatio-temporal neighborhoods and evaluate these descriptors in the context of recognizing human activities. In particular, we compare motion representations in terms of spatio-temporal jets, position-dependent histograms, position-independent histograms, and principal component analysis, each computed for either spatio-temporal gradients or optic flow. An experimental evaluation on a video database of human actions shows that high classification performance can be achieved, and that there is a clear advantage in using local position-dependent histograms, consistent with previously reported findings regarding spatial recognition.
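To make the distinction between position-dependent and position-independent histograms concrete, the following is a minimal sketch and not the paper's implementation. It assumes a small grey-level neighborhood stored as a nested list `volume[t][y][x]`, uses plain central differences in place of whatever derivative operators the paper uses, and quantizes each gradient component into a fixed number of bins. The function names `gradients` and `histogram_descriptor` and the parameters `grid`, `bins`, and `max_abs` are hypothetical. With `grid=1` all voxels share one set of histograms (position-independent); with `grid=2` the neighborhood is split into 2x2x2 cells, each with its own histograms (position-dependent).

```python
def gradients(volume):
    """Central-difference gradients (Lx, Ly, Lt) at the interior voxels
    of a neighborhood stored as volume[t][y][x]."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    out = []
    for t in range(1, T - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                lx = (volume[t][y][x + 1] - volume[t][y][x - 1]) / 2.0
                ly = (volume[t][y + 1][x] - volume[t][y - 1][x]) / 2.0
                lt = (volume[t + 1][y][x] - volume[t - 1][y][x]) / 2.0
                out.append(((t, y, x), (lx, ly, lt)))
    return out


def histogram_descriptor(volume, grid=2, bins=4, max_abs=1.0):
    """Concatenated, normalized histograms of gradient components.

    grid=1 gives a position-independent descriptor; grid>1 keeps a
    separate set of histograms per cell of a grid x grid x grid
    partition of the neighborhood (position-dependent).
    Assumes each dimension of the volume has at least 3 samples.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    hist = [0.0] * (grid ** 3 * 3 * bins)
    for (t, y, x), grad in gradients(volume):
        # Map interior coordinates (1 .. size-2) to a cell index per axis.
        ct = min((t - 1) * grid // (T - 2), grid - 1)
        cy = min((y - 1) * grid // (H - 2), grid - 1)
        cx = min((x - 1) * grid // (W - 2), grid - 1)
        cell = (ct * grid + cy) * grid + cx
        for c, g in enumerate(grad):
            # Clamp to [-max_abs, max_abs] and quantize into `bins` bins.
            g = max(-max_abs, min(max_abs, g))
            b = min(int((g + max_abs) / (2.0 * max_abs) * bins), bins - 1)
            hist[(cell * 3 + c) * bins + b] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    # Synthetic 6x6x6 neighborhood with a constant intensity ramp along x.
    vol = [[[0.5 * x for x in range(6)] for y in range(6)] for t in range(6)]
    print(len(histogram_descriptor(vol, grid=1)))  # position-independent
    print(len(histogram_descriptor(vol, grid=2)))  # position-dependent
```

The position-dependent variant is longer by a factor of `grid ** 3` because it records where in the neighborhood each gradient value occurred, which is the extra information the abstract credits for the better recognition results.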


Related projects: Recognition of human actions

Related publications:
Velocity adaptation of space-time interest points
Recognizing Human Actions: a Local SVM Approach
Space-time interest points
Interest point detection and scale selection in space-time
Velocity-adaptation of spatio-temporal receptive fields for direct recognition of activities: An experimental study
Linear spatio-temporal scale-space
Monograph on scale-space theory
