TY - GEN
T1 - From videos to verbs
T2 - 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07
AU - Turaga, Pavan K.
AU - Veeraraghavan, Ashok
AU - Chellappa, Rama
PY - 2007
Y1 - 2007
N2 - Clustering video sequences in order to infer and extract activities from a single video stream is an extremely important problem and has significant potential in video indexing, surveillance, activity discovery and event recognition. Clustering a video sequence into activities requires one to simultaneously recognize activity boundaries (activity consistent subsequences) and cluster these activity subsequences. In order to do this, we build a generative model for activities (in video) using a cascade of dynamical systems and show that this model is able to capture and represent a diverse class of activities. We then derive algorithms to learn the model parameters from a video stream and also show how a single video sequence may be clustered into different clusters where each cluster represents an activity. We also propose a novel technique to build affine, view, rate invariance of the activity into the distance metric for clustering. Experiments show that the clusters found by the algorithm correspond to semantically meaningful activities.
AB - Clustering video sequences in order to infer and extract activities from a single video stream is an extremely important problem and has significant potential in video indexing, surveillance, activity discovery and event recognition. Clustering a video sequence into activities requires one to simultaneously recognize activity boundaries (activity consistent subsequences) and cluster these activity subsequences. In order to do this, we build a generative model for activities (in video) using a cascade of dynamical systems and show that this model is able to capture and represent a diverse class of activities. We then derive algorithms to learn the model parameters from a video stream and also show how a single video sequence may be clustered into different clusters where each cluster represents an activity. We also propose a novel technique to build affine, view, rate invariance of the activity into the distance metric for clustering. Experiments show that the clusters found by the algorithm correspond to semantically meaningful activities.
UR - http://www.scopus.com/inward/record.url?scp=34948834193&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34948834193&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2007.383170
DO - 10.1109/CVPR.2007.383170
M3 - Conference contribution
AN - SCOPUS:34948834193
SN - 1424411807
SN - 9781424411801
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
BT - 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07
Y2 - 17 June 2007 through 22 June 2007
ER -