From videos to verbs: Mining videos for activities using a cascade of dynamical systems

Pavan K. Turaga, Ashok Veeraraghavan, Rama Chellappa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

35 Scopus citations

Abstract

Clustering video sequences in order to infer and extract activities from a single video stream is an important problem with significant potential in video indexing, surveillance, activity discovery and event recognition. Clustering a video sequence into activities requires one to simultaneously recognize activity boundaries (that is, find activity-consistent subsequences) and cluster these activity subsequences. To do this, we build a generative model for activities in video using a cascade of dynamical systems and show that this model is able to capture and represent a diverse class of activities. We then derive algorithms to learn the model parameters from a video stream and show how a single video sequence may be partitioned into clusters, each of which represents an activity. We also propose a novel technique to build affine, view, and rate invariance of the activity into the distance metric used for clustering. Experiments show that the clusters found by the algorithm correspond to semantically meaningful activities.
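As a rough illustration of the idea of modeling each subsequence with a dynamical system and then clustering the fitted models, the toy sketch below may help. It is not the authors' algorithm: it assumes a scalar feature sequence, fixed-length windows in place of learned activity boundaries, a first-order model x[t+1] = a * x[t] in place of the cascade, and a plain absolute difference between coefficients in place of the invariant distance metric. All function names (`fit_ar1`, `split_fixed`, `cluster_by_model`, `simulate`) are hypothetical.

```python
from typing import List


def fit_ar1(segment: List[float]) -> float:
    """Least-squares estimate of a in the toy model x[t+1] = a * x[t]."""
    num = sum(x0 * x1 for x0, x1 in zip(segment, segment[1:]))
    den = sum(x0 * x0 for x0 in segment[:-1])
    return num / den if den else 0.0


def split_fixed(seq: List[float], window: int) -> List[List[float]]:
    """Assumption: fixed, non-overlapping windows stand in for activity boundaries."""
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, window)]


def cluster_by_model(coeffs: List[float], tol: float) -> List[int]:
    """Greedy clustering: join the first cluster whose representative is within tol."""
    reps: List[float] = []
    labels: List[int] = []
    for a in coeffs:
        for k, rep in enumerate(reps):
            if abs(a - rep) <= tol:
                labels.append(k)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            reps.append(a)
            labels.append(len(reps) - 1)
    return labels


def simulate(a: float, x0: float, n: int) -> List[float]:
    """Generate n samples of the toy system x[t+1] = a * x[t]."""
    out = [x0]
    for _ in range(n - 1):
        out.append(a * out[-1])
    return out


# Synthetic stream: "activity" A, then B, then A again at a different scale.
sequence = simulate(0.9, 1.0, 20) + simulate(0.5, 1.0, 20) + simulate(0.9, 2.0, 20)
segments = split_fixed(sequence, 20)
coeffs = [fit_ar1(s) for s in segments]
labels = cluster_by_model(coeffs, tol=0.05)
print(labels)
```

In this toy setting the first and third windows share a label because their fitted dynamics agree even though their amplitudes differ, which loosely mirrors the motivation for comparing models rather than raw signals.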

Original language: English (US)
Title of host publication: 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07
State: Published - 2007
Externally published: Yes
Event: 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07 - Minneapolis, MN, United States
Duration: Jun 17 2007 to Jun 22 2007

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Other

Other: 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07
Country/Territory: United States
City: Minneapolis, MN
Period: 6/17/07 to 6/22/07

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
