Unsupervised view and rate invariant clustering of video sequences

Pavan Turaga, Ashok Veeraraghavan, Rama Chellappa

Research output: Contribution to journal › Article

33 Citations (Scopus)

Abstract

Videos play an ever-increasing role in our everyday lives, with applications ranging from news and entertainment to scientific research, security, and surveillance. As cameras and storage media become less expensive, people are producing more video content than ever before, which necessitates efficient indexing and retrieval algorithms for video data. Most state-of-the-art techniques index videos according to global scene content such as color, texture, and brightness. In this paper, we discuss the problem of activity-based indexing of videos. First, we describe activities as a cascade of dynamical systems, which significantly enhances the expressive power of the model while retaining many of the computational advantages of dynamical models. Second, we derive methods to incorporate view and rate invariance into these models, so that similar actions are clustered together irrespective of the viewpoint or the rate of execution of the activity. We also derive algorithms to learn the model parameters from a video stream and demonstrate how a single video sequence may be partitioned into clusters, each representing an activity. Experimental results on five different databases show that the clusters found by the algorithm correspond to semantically meaningful activities.
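The abstract does not give the model equations, so the following is only a rough illustration of the building block it names: a single linear dynamical system fitted to a sequence of feature vectors. It uses a generic SVD-based least-squares fit, not the paper's cascade model or its learning algorithm. The function names `simulate_lds` and `fit_lds`, the toy 2-state dynamics, and the noise-free synthetic data are all assumptions made for this sketch.

```python
import numpy as np


def simulate_lds(T=50, obs_dim=6, seed=0):
    """Generate a noise-free toy sequence with x_{t+1} = A x_t and y_t = C x_t."""
    rng = np.random.default_rng(seed)
    theta = 0.3
    # Hypothetical 2-state dynamics: a slowly decaying rotation.
    A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    C = rng.standard_normal((obs_dim, 2))  # random observation matrix
    X = np.empty((2, T))
    X[:, 0] = [1.0, 0.0]
    for t in range(1, T):
        X[:, t] = A @ X[:, t - 1]
    return A, C @ X  # true dynamics and observations of shape (obs_dim, T)


def fit_lds(Y, n_states):
    """Estimate (A, C) from a feature sequence Y of shape (d, T) via SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n_states]                     # estimated observation matrix
    X = s[:n_states, None] * Vt[:n_states]  # estimated state sequence
    # Least-squares solution of x_{t+1} = A x_t over the whole sequence.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C


if __name__ == "__main__":
    A_true, Y = simulate_lds()
    A_hat, _ = fit_lds(Y, n_states=2)
    # A is recovered only up to a change of state basis, so compare spectra.
    print("true eigenvalues:     ", np.linalg.eigvals(A_true))
    print("estimated eigenvalues:", np.linalg.eigvals(A_hat))
```

Because the state basis is arbitrary, the estimated transition matrix matches the true one only up to a similarity transform; quantities such as its eigenvalues, trace, and determinant are the ones that can be compared directly.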

Original language: English (US)
Pages (from-to): 353-371
Number of pages: 19
Journal: Computer Vision and Image Understanding
Volume: 113
Issue number: 3
DOI: 10.1016/j.cviu.2008.08.009
State: Published - Mar 2009
Externally published: Yes

Fingerprint

Invariance
Luminance
Dynamical systems
Textures
Cameras
Color

Keywords

  • Affine invariance
  • Cascade of linear dynamical systems
  • Rate invariance
  • Summarization
  • Surveillance
  • Video clustering
  • View invariance

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Unsupervised view and rate invariant clustering of video sequences. / Turaga, Pavan; Veeraraghavan, Ashok; Chellappa, Rama.

In: Computer Vision and Image Understanding, Vol. 113, No. 3, 03.2009, p. 353-371.

@article{e7d7cdb7e99144028e882bec3eeed1cc,
title = "Unsupervised view and rate invariant clustering of video sequences",
keywords = "Affine invariance, Cascade of linear dynamical systems, Rate invariance, Summarization, Surveillance, Video clustering, View invariance",
author = "Pavan Turaga and Ashok Veeraraghavan and Rama Chellappa",
year = "2009",
month = "3",
doi = "10.1016/j.cviu.2008.08.009",
language = "English (US)",
volume = "113",
pages = "353--371",
journal = "Computer Vision and Image Understanding",
issn = "1077-3142",
publisher = "Academic Press Inc.",
number = "3",
}

TY  - JOUR
T1  - Unsupervised view and rate invariant clustering of video sequences
AU  - Turaga, Pavan
AU  - Veeraraghavan, Ashok
AU  - Chellappa, Rama
PY  - 2009/3
Y1  - 2009/3
KW  - Affine invariance
KW  - Cascade of linear dynamical systems
KW  - Rate invariance
KW  - Summarization
KW  - Surveillance
KW  - Video clustering
KW  - View invariance
UR  - http://www.scopus.com/inward/record.url?scp=59349120959&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=59349120959&partnerID=8YFLogxK
U2  - 10.1016/j.cviu.2008.08.009
DO  - 10.1016/j.cviu.2008.08.009
M3  - Article
VL  - 113
SP  - 353
EP  - 371
JO  - Computer Vision and Image Understanding
JF  - Computer Vision and Image Understanding
SN  - 1077-3142
IS  - 3
ER  -