Learning action dictionaries from video

Pavan Turaga, Rama Chellappa

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Summarizing the contents of a video containing human activities is an important problem in computer vision, with applications in automated surveillance systems. Summarizing a video requires one to identify and learn a 'vocabulary' of action-phrases corresponding to specific events and actions occurring in the video. We propose a generative model for dynamic scenes containing human activities as a composition of independent action-phrases, each of which is derived from an underlying vocabulary. Given a long video sequence, we propose a completely unsupervised approach to learn the vocabulary. Once the vocabulary is learnt, a video segment can be decomposed into a collection of phrases for summarization. We then describe methods to learn the correlations between activities and the sequentiality of events. We also propose a novel method for building invariances to spatial transforms into the summarization scheme.

Original language: English (US)
Title of host publication: 2008 IEEE International Conference on Image Processing, ICIP 2008 Proceedings
Pages: 1704-1707
Number of pages: 4
State: Published - 2008
Event: 2008 IEEE International Conference on Image Processing, ICIP 2008 - San Diego, CA, United States
Duration: Oct 12 2008 to Oct 15 2008

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Other

Other: 2008 IEEE International Conference on Image Processing, ICIP 2008
Country: United States
City: San Diego, CA
Period: 10/12/08 to 10/15/08

Keywords

  • Activity analysis
  • Video summarization

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing
