Human activity encoding and recognition using low-level visual features

Zheshen Wang; Baoxin Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automatic recognition of human activities is among the key capabilities of many intelligent systems with vision/perception. Most existing approaches to this problem require sophisticated feature extraction before classification can be performed. This paper presents a novel approach for human action recognition using only simple low-level visual features: motion captured from direct frame differencing. A codebook of key poses is first created from the training data through unsupervised clustering. Videos of actions are then coded as sequences of super-frames, defined as the key poses augmented with discriminative attributes. A weighted-sequence distance is proposed for comparing two super-frame sequences, and is then wrapped as a kernel embedded in an SVM classifier for the final classification. Compared with conventional methods, our approach provides a flexible non-parametric sequential structure with a corresponding distance measure for human action representation and classification without requiring complex feature extraction. The effectiveness of our approach is demonstrated with the widely used KTH human activity dataset, for which the proposed method outperforms the existing state of the art.
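
The first stages named in the abstract (frame differencing, then coding each frame against a codebook of key poses) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold value, and the use of run length as the attribute attached to each key pose are all assumptions made for illustration, and the codebook is passed in rather than learned (in practice it would come from unsupervised clustering of training features). The weighted-sequence distance and the SVM kernel are not reproduced here because the abstract does not define them.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # grayscale image as rows of pixel intensities


def frame_difference(prev: Frame, curr: Frame, threshold: float = 0.1) -> List[int]:
    """Flattened binary motion mask: 1 where the absolute change exceeds threshold."""
    return [
        1 if abs(c - p) > threshold else 0
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    ]


def nearest_key_pose(feature: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to feature (squared Euclidean distance)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(feature, codebook[k]))


def encode_video(frames: Sequence[Frame],
                 codebook: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Code a video as (key_pose_index, run_length) pairs.

    run_length is a stand-in attribute chosen for this sketch (assumption);
    the paper's discriminative attributes are not specified in the abstract.
    """
    labels = [
        nearest_key_pose(frame_difference(a, b), codebook)
        for a, b in zip(frames, frames[1:])
    ]
    coded: List[Tuple[int, int]] = []
    for label in labels:
        if coded and coded[-1][0] == label:
            # Same key pose as the previous frame pair: extend the current run.
            coded[-1] = (label, coded[-1][1] + 1)
        else:
            coded.append((label, 1))
    return coded


if __name__ == "__main__":
    blank = [[0.0, 0.0], [0.0, 0.0]]
    top_left = [[1.0, 0.0], [0.0, 0.0]]
    bottom_right = [[0.0, 0.0], [0.0, 1.0]]
    toy_codebook = [[1, 0, 0, 0], [0, 0, 0, 1]]  # hypothetical key poses
    print(encode_video([blank, top_left, blank, bottom_right], toy_codebook))
```

Collapsing consecutive identical labels keeps the coded sequence short and roughly independent of frame rate, which is one plausible reason to compare sequences of key poses rather than raw frames.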

Original language	English (US)
Title of host publication	IJCAI-09 - Proceedings of the 21st International Joint Conference on Artificial Intelligence
Publisher	International Joint Conferences on Artificial Intelligence
Pages	1876-1882
Number of pages	7
ISBN (Print)	9781577354260
State	Published - Jan 1 2009
Event	21st International Joint Conference on Artificial Intelligence, IJCAI 2009 - Pasadena, United States Duration: Jul 11 2009 → Jul 16 2009

Publication series

Name	IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print)	1045-0823

Conference

Conference	21st International Joint Conference on Artificial Intelligence, IJCAI 2009
Country/Territory	United States
City	Pasadena
Period	7/11/09 → 7/16/09

ASJC Scopus subject areas

Artificial Intelligence

Cite this

Human activity encoding and recognition using low-level visual features. / Wang, Zheshen; Li, Baoxin.
IJCAI-09 - Proceedings of the 21st International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence, 2009. p. 1876-1882 (IJCAI International Joint Conference on Artificial Intelligence).

Wang, Z & Li, B 2009, Human activity encoding and recognition using low-level visual features. in IJCAI-09 - Proceedings of the 21st International Joint Conference on Artificial Intelligence. IJCAI International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence, pp. 1876-1882, 21st International Joint Conference on Artificial Intelligence, IJCAI 2009, Pasadena, United States, 7/11/09.

@inproceedings{ce1f0280c503419ab69fc1e92e641ba0,
title = "Human activity encoding and recognition using low-level visual features",
abstract = "Automatic recognition of human activities is among the key capabilities of many intelligent systems with vision/perception. Most existing approaches to this problem require sophisticated feature extraction before classification can be performed. This paper presents a novel approach for human action recognition using only simple low-level visual features: motion captured from direct frame differencing. A codebook of key poses is first created from the training data through unsupervised clustering. Videos of actions are then coded as sequences of super-frames, defined as the key poses augmented with discriminative attributes. A weighted-sequence distance is proposed for comparing two super-frame sequences, and is then wrapped as a kernel embedded in an SVM classifier for the final classification. Compared with conventional methods, our approach provides a flexible non-parametric sequential structure with a corresponding distance measure for human action representation and classification without requiring complex feature extraction. The effectiveness of our approach is demonstrated with the widely used KTH human activity dataset, for which the proposed method outperforms the existing state of the art.",
author = "Zheshen Wang and Baoxin Li",
year = "2009",
month = jan,
day = "1",
language = "English (US)",
isbn = "9781577354260",
series = "IJCAI International Joint Conference on Artificial Intelligence",
publisher = "International Joint Conferences on Artificial Intelligence",
pages = "1876--1882",
booktitle = "IJCAI-09 - Proceedings of the 21st International Joint Conference on Artificial Intelligence",
note = "21st International Joint Conference on Artificial Intelligence, IJCAI 2009 ; Conference date: 11-07-2009 Through 16-07-2009",
}

TY - GEN
T1 - Human activity encoding and recognition using low-level visual features
AU - Wang, Zheshen
AU - Li, Baoxin
PY - 2009/1/1
Y1 - 2009/1/1
AB - Automatic recognition of human activities is among the key capabilities of many intelligent systems with vision/perception. Most existing approaches to this problem require sophisticated feature extraction before classification can be performed. This paper presents a novel approach for human action recognition using only simple low-level visual features: motion captured from direct frame differencing. A codebook of key poses is first created from the training data through unsupervised clustering. Videos of actions are then coded as sequences of super-frames, defined as the key poses augmented with discriminative attributes. A weighted-sequence distance is proposed for comparing two super-frame sequences, and is then wrapped as a kernel embedded in an SVM classifier for the final classification. Compared with conventional methods, our approach provides a flexible non-parametric sequential structure with a corresponding distance measure for human action representation and classification without requiring complex feature extraction. The effectiveness of our approach is demonstrated with the widely used KTH human activity dataset, for which the proposed method outperforms the existing state of the art.
UR - http://www.scopus.com/inward/record.url?scp=78751692959&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78751692959&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:78751692959
SN - 9781577354260
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 1876
EP - 1882
BT - IJCAI-09 - Proceedings of the 21st International Joint Conference on Artificial Intelligence
PB - International Joint Conferences on Artificial Intelligence
T2 - 21st International Joint Conference on Artificial Intelligence, IJCAI 2009
Y2 - 11 July 2009 through 16 July 2009
ER -
