Unsupervised view and rate invariant clustering of video sequences

Pavan Turaga; Ashok Veeraraghavan; Rama Chellappa

doi:10.1016/j.cviu.2008.08.009

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Videos play an ever increasing role in our everyday lives with applications ranging from news, entertainment, scientific research, security and surveillance. Coupled with the fact that cameras and storage media are becoming less expensive, it has resulted in people producing more video content than ever before. This necessitates the development of efficient indexing and retrieval algorithms for video data. Most state-of-the-art techniques index videos according to the global content in the scene such as color, texture, brightness, etc. In this paper, we discuss the problem of activity-based indexing of videos. To address the problem, first we describe activities as a cascade of dynamical systems which significantly enhances the expressive power of the model while retaining many of the computational advantages of using dynamical models. Second, we also derive methods to incorporate view and rate-invariance into these models so that similar actions are clustered together irrespective of the viewpoint or the rate of execution of the activity. We also derive algorithms to learn the model parameters from a video stream and demonstrate how a single video sequence may be clustered into different clusters where each cluster represents an activity. Experimental results for five different databases show that the clusters found by the algorithm correspond to semantically meaningful activities.

Original language	English (US)
Pages (from-to)	353-371
Number of pages	19
Journal	Computer Vision and Image Understanding
Volume	113
Issue number	3
DOIs	https://doi.org/10.1016/j.cviu.2008.08.009
State	Published - Mar 2009
Externally published	Yes

Keywords

Affine invariance
Cascade of linear dynamical systems
Rate invariance
Summarization
Surveillance
Video clustering
View invariance

ASJC Scopus subject areas

Software
Signal Processing
Computer Vision and Pattern Recognition

Cite this

@article{e7d7cdb7e99144028e882bec3eeed1cc,

title = "Unsupervised view and rate invariant clustering of video sequences",

abstract = "Videos play an ever increasing role in our everyday lives with applications ranging from news, entertainment, scientific research, security and surveillance. Coupled with the fact that cameras and storage media are becoming less expensive, it has resulted in people producing more video content than ever before. This necessitates the development of efficient indexing and retrieval algorithms for video data. Most state-of-the-art techniques index videos according to the global content in the scene such as color, texture, brightness, etc. In this paper, we discuss the problem of activity-based indexing of videos. To address the problem, first we describe activities as a cascade of dynamical systems which significantly enhances the expressive power of the model while retaining many of the computational advantages of using dynamical models. Second, we also derive methods to incorporate view and rate-invariance into these models so that similar actions are clustered together irrespective of the viewpoint or the rate of execution of the activity. We also derive algorithms to learn the model parameters from a video stream and demonstrate how a single video sequence may be clustered into different clusters where each cluster represents an activity. Experimental results for five different databases show that the clusters found by the algorithm correspond to semantically meaningful activities.",

keywords = "Affine invariance, Cascade of linear dynamical systems, Rate invariance, Summarization, Surveillance, Video clustering, View invariance",

author = "Pavan Turaga and Ashok Veeraraghavan and Rama Chellappa",

note = "Funding Information: Some parts of this work were presented at CVPR 2007 [1]. This research was funded (in part) by the U.S. Government VACE Program.",

year = "2009",

month = mar,

doi = "10.1016/j.cviu.2008.08.009",

language = "English (US)",

volume = "113",

pages = "353--371",

journal = "Computer Vision and Image Understanding",

issn = "1077-3142",

publisher = "Academic Press Inc.",

number = "3",

}

TY - JOUR

T1 - Unsupervised view and rate invariant clustering of video sequences

AU - Turaga, Pavan

AU - Veeraraghavan, Ashok

AU - Chellappa, Rama

N1 - Funding Information: Some parts of this work were presented at CVPR 2007 [1]. This research was funded (in part) by the U.S. Government VACE Program.

PY - 2009/3

Y1 - 2009/3

N2 - Videos play an ever increasing role in our everyday lives with applications ranging from news, entertainment, scientific research, security and surveillance. Coupled with the fact that cameras and storage media are becoming less expensive, it has resulted in people producing more video content than ever before. This necessitates the development of efficient indexing and retrieval algorithms for video data. Most state-of-the-art techniques index videos according to the global content in the scene such as color, texture, brightness, etc. In this paper, we discuss the problem of activity-based indexing of videos. To address the problem, first we describe activities as a cascade of dynamical systems which significantly enhances the expressive power of the model while retaining many of the computational advantages of using dynamical models. Second, we also derive methods to incorporate view and rate-invariance into these models so that similar actions are clustered together irrespective of the viewpoint or the rate of execution of the activity. We also derive algorithms to learn the model parameters from a video stream and demonstrate how a single video sequence may be clustered into different clusters where each cluster represents an activity. Experimental results for five different databases show that the clusters found by the algorithm correspond to semantically meaningful activities.

AB - Videos play an ever increasing role in our everyday lives with applications ranging from news, entertainment, scientific research, security and surveillance. Coupled with the fact that cameras and storage media are becoming less expensive, it has resulted in people producing more video content than ever before. This necessitates the development of efficient indexing and retrieval algorithms for video data. Most state-of-the-art techniques index videos according to the global content in the scene such as color, texture, brightness, etc. In this paper, we discuss the problem of activity-based indexing of videos. To address the problem, first we describe activities as a cascade of dynamical systems which significantly enhances the expressive power of the model while retaining many of the computational advantages of using dynamical models. Second, we also derive methods to incorporate view and rate-invariance into these models so that similar actions are clustered together irrespective of the viewpoint or the rate of execution of the activity. We also derive algorithms to learn the model parameters from a video stream and demonstrate how a single video sequence may be clustered into different clusters where each cluster represents an activity. Experimental results for five different databases show that the clusters found by the algorithm correspond to semantically meaningful activities.

KW - Affine invariance

KW - Cascade of linear dynamical systems

KW - Rate invariance

KW - Summarization

KW - Surveillance

KW - Video clustering

KW - View invariance

UR - http://www.scopus.com/inward/record.url?scp=59349120959&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=59349120959&partnerID=8YFLogxK

U2 - 10.1016/j.cviu.2008.08.009

DO - 10.1016/j.cviu.2008.08.009

M3 - Article

AN - SCOPUS:59349120959

SN - 1077-3142

VL - 113

SP - 353

EP - 371

JO - Computer Vision and Image Understanding

JF - Computer Vision and Image Understanding

IS - 3

ER -
