Simultaneous Event Localization and Recognition in Surveillance Video

Yikang Li; Tianshu Yu; Baoxin Li

doi:10.1109/AVSS.2018.8639169

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

The ubiquity of video-based surveillance demands automated approaches to analyzing ever-increasing volumes of video footage. Action/event localization and recognition are two critical capabilities in surveillance video analysis, but they have largely been addressed separately in the literature. In this paper, we propose an approach that simultaneously localizes and recognizes visual events in raw surveillance videos using an end-to-end learning strategy. Our approach formulates the task as weakly supervised sequential semantic segmentation, in which we use a specific convolutional RNN to capture not only appearance and motion information but also their temporal evolution patterns. We tested our approach on the VIRAT 2.0 dataset. The experimental results, compared with the relevant existing state of the art, suggest that the proposed approach is promising for delivering a practical solution.

Original language	English (US)
Title of host publication	Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781538692943
DOIs	https://doi.org/10.1109/AVSS.2018.8639169
State	Published - Jul 2 2018
Event	15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2018 - Auckland, New Zealand Duration: Nov 27 2018 → Nov 30 2018

Publication series

Name	Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance

Conference

Conference	15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2018
Country/Territory	New Zealand
City	Auckland
Period	11/27/18 → 11/30/18

ASJC Scopus subject areas

Signal Processing
Computer Vision and Pattern Recognition
Hardware and Architecture
Media Technology

Access to Document

10.1109/AVSS.2018.8639169

Cite this

Li, Y., Yu, T., & Li, B. (2018). Simultaneous Event Localization and Recognition in Surveillance Video. In Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, Article 8639169 (Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/AVSS.2018.8639169

Simultaneous Event Localization and Recognition in Surveillance Video. / Li, Yikang; Yu, Tianshu; Li, Baoxin.
Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance. Institute of Electrical and Electronics Engineers Inc., 2018. 8639169 (Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance).

Li, Y, Yu, T & Li, B 2018, Simultaneous Event Localization and Recognition in Surveillance Video. in Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, 8639169, Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, Institute of Electrical and Electronics Engineers Inc., 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2018, Auckland, New Zealand, 11/27/18. https://doi.org/10.1109/AVSS.2018.8639169

Li Y, Yu T, Li B. Simultaneous Event Localization and Recognition in Surveillance Video. In Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance. Institute of Electrical and Electronics Engineers Inc. 2018. 8639169. (Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance). doi: 10.1109/AVSS.2018.8639169

Li, Yikang ; Yu, Tianshu ; Li, Baoxin. / Simultaneous Event Localization and Recognition in Surveillance Video. Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance. Institute of Electrical and Electronics Engineers Inc., 2018. (Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance).

@inproceedings{dce1d463bf8444d885e9490dc7cb26a7,

title = "Simultaneous Event Localization and Recognition in Surveillance Video",

abstract = "The ubiquity of video-based surveillance demands automated approaches to analyzing ever-increasing volumes of video footage. Action/event localization and recognition are two critical capabilities in surveillance video analysis, but they have largely been addressed separately in the literature. In this paper, we propose an approach that simultaneously localizes and recognizes visual events in raw surveillance videos using an end-to-end learning strategy. Our approach formulates the task as weakly supervised sequential semantic segmentation, in which we use a specific convolutional RNN to capture not only appearance and motion information but also their temporal evolution patterns. We tested our approach on the VIRAT 2.0 dataset. The experimental results, compared with the relevant existing state of the art, suggest that the proposed approach is promising for delivering a practical solution.",

author = "Yikang Li and Tianshu Yu and Baoxin Li",

note = "Funding Information: This work was supported in part by a grant from ONR. Any opinions expressed in this material are those of the authors and do not necessarily reflect the views of ONR. Publisher Copyright: {\textcopyright} 2018 IEEE.; 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2018 ; Conference date: 27-11-2018 Through 30-11-2018",

year = "2018",

month = jul,

day = "2",

doi = "10.1109/AVSS.2018.8639169",

language = "English (US)",

series = "Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance",

}

TY - GEN

T1 - Simultaneous Event Localization and Recognition in Surveillance Video

AU - Li, Yikang

AU - Yu, Tianshu

AU - Li, Baoxin

N1 - Funding Information: This work was supported in part by a grant from ONR. Any opinions expressed in this material are those of the authors and do not necessarily reflect the views of ONR. Publisher Copyright: © 2018 IEEE.

PY - 2018/7/2

Y1 - 2018/7/2

AB - The ubiquity of video-based surveillance demands automated approaches to analyzing ever-increasing volumes of video footage. Action/event localization and recognition are two critical capabilities in surveillance video analysis, but they have largely been addressed separately in the literature. In this paper, we propose an approach that simultaneously localizes and recognizes visual events in raw surveillance videos using an end-to-end learning strategy. Our approach formulates the task as weakly supervised sequential semantic segmentation, in which we use a specific convolutional RNN to capture not only appearance and motion information but also their temporal evolution patterns. We tested our approach on the VIRAT 2.0 dataset. The experimental results, compared with the relevant existing state of the art, suggest that the proposed approach is promising for delivering a practical solution.

UR - http://www.scopus.com/inward/record.url?scp=85063264032&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063264032&partnerID=8YFLogxK

U2 - 10.1109/AVSS.2018.8639169

DO - 10.1109/AVSS.2018.8639169

M3 - Conference contribution

AN - SCOPUS:85063264032

T3 - Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance

BT - Proceedings of AVSS 2018 - 2018 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 15th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2018

Y2 - 27 November 2018 through 30 November 2018

ER -
