Abstract

In recent years, the most popular video-based human action recognition methods have relied on extracting feature representations with Convolutional Neural Networks (CNNs) and then using these representations to classify actions. In this work, we propose a fast and accurate video representation derived from the motion-salient region (MSR), which captures the features most useful for action labeling. By improving a well-performing foreground detection technique, the region of interest (ROI) corresponding to actors in the foreground can be detected in both the appearance and the motion field under various realistic challenges. Furthermore, we propose a complementary motion-salient measure to select a secondary ROI: the major moving part of the human. Accordingly, an MSR-based CNN descriptor (MSR-CNN) is formulated to recognize human action, where the descriptor incorporates appearance and motion features along with tracks of the MSR. The computation can be implemented efficiently because of two characteristics: 1) only part of the RGB image and the motion field needs to be processed; 2) less data is used as input for the CNN feature extraction. Comparative evaluation on the JHMDB and UCF Sports datasets shows that our method outperforms the state of the art in both efficiency and accuracy.
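The efficiency argument in the abstract (process only part of the RGB image and motion field, and feed less data to the CNN) can be illustrated with a minimal sketch. This is not the authors' foreground detector or their motion-salient measure: it swaps in a plain threshold on optical-flow magnitude, and the names `motion_salient_box`, `crop_to_box`, and `rel_thresh` are hypothetical, chosen only for illustration.

```python
import numpy as np


def motion_salient_box(flow, rel_thresh=0.5):
    """Bounding box (top, bottom, left, right) of strongly moving pixels.

    ASSUMPTION: a pixel counts as motion-salient if its flow magnitude is at
    least rel_thresh times the frame's peak magnitude. This is a simplified
    stand-in, not the measure proposed in the paper.
    Returns None when the flow field contains no motion.
    """
    magnitude = np.linalg.norm(flow, axis=2)  # per-pixel flow magnitude
    peak = magnitude.max()
    if peak == 0:
        return None
    mask = magnitude >= rel_thresh * peak
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing salient pixels
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing salient pixels
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def crop_to_box(array, box):
    """Crop an (H, W, C) array to the given box."""
    top, bottom, left, right = box
    return array[top:bottom, left:right]


# Toy data: a 120x160 frame in which only a 40x40 block is moving.
frame = np.zeros((120, 160, 3), dtype=np.uint8)
flow = np.zeros((120, 160, 2), dtype=np.float32)
flow[40:80, 60:100] = (3.0, 0.0)

box = motion_salient_box(flow)
rgb_patch = crop_to_box(frame, box)   # appearance input restricted to the region
flow_patch = crop_to_box(flow, box)   # motion input restricted to the region
```

In this toy case the cropped patches cover 1,600 of the frame's 19,200 pixels, which is the sense in which a region-based descriptor reduces the data passed on to feature extraction.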

Original language: English (US)
Title of host publication: 2016 23rd International Conference on Pattern Recognition, ICPR 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3524-3529
Number of pages: 6
ISBN (Electronic): 9781509048472
DOIs: https://doi.org/10.1109/ICPR.2016.7900180
State: Published - Jan 1 2016
Event: 23rd International Conference on Pattern Recognition, ICPR 2016 - Cancun, Mexico
Duration: Dec 4 2016 - Dec 8 2016

Publication series

Name: Proceedings - International Conference on Pattern Recognition
Volume: 0
ISSN (Print): 1051-4651

Other

Other: 23rd International Conference on Pattern Recognition, ICPR 2016
Country: Mexico
City: Cancun
Period: 12/4/16 - 12/8/16

Keywords

  • Action recognition
  • Convolutional Neural Networks
  • Motion salient regions

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

  • Cite this

    Tu, Z., Cao, J., Li, Y., & Li, B. (2016). MSR-CNN: Applying motion salient region based descriptors for action recognition. In 2016 23rd International Conference on Pattern Recognition, ICPR 2016 (pp. 3524-3529). [7900180] (Proceedings - International Conference on Pattern Recognition; Vol. 0). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICPR.2016.7900180