Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web

Yezhou Yang; Yi Li; Cornelia Fermüller; Yiannis Aloimonos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In order to advance action generation and creation in robots beyond simple learned schemas we need computational tools that allow us to automatically interpret and represent human actions. This paper presents a system that learns manipulation action plans by processing unconstrained videos from the World Wide Web. Its goal is to robustly generate the sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots. The lower level of the system consists of two convolutional neural network (CNN) based recognition modules, one for classifying the hand grasp type and the other for object recognition. The higher level is a probabilistic manipulation action grammar based parsing module that aims at generating visual sentences for robot manipulation. Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by "watching" unconstrained videos with high accuracy.

Original language	English (US)
Title of host publication	Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Publisher	AI Access Foundation
Pages	3686-3692
Number of pages	7
ISBN (Electronic)	9781577357032
State	Published - Jun 1 2015
Externally published	Yes
Event	29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015, Austin, United States
Duration	Jan 25 2015 → Jan 30 2015

Publication series

Name	Proceedings of the National Conference on Artificial Intelligence
Volume	5

Other

Other	29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Country/Territory	United States
City	Austin
Period	1/25/15 → 1/30/15

ASJC Scopus subject areas

Software
Artificial Intelligence

Cite this

Yang, Y., Li, Y., Fermüller, C., & Aloimonos, Y. (2015). Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 (pp. 3686-3692). (Proceedings of the National Conference on Artificial Intelligence; Vol. 5). AI Access Foundation.

Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web. / Yang, Yezhou; Li, Yi; Fermüller, Cornelia et al. Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. AI Access Foundation, 2015. p. 3686-3692 (Proceedings of the National Conference on Artificial Intelligence; Vol. 5).

Yang, Y, Li, Y, Fermüller, C & Aloimonos, Y 2015, Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web. in Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. Proceedings of the National Conference on Artificial Intelligence, vol. 5, AI Access Foundation, pp. 3686-3692, 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015, Austin, United States, 1/25/15.

Yang Y, Li Y, Fermüller C, Aloimonos Y. Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. AI Access Foundation. 2015. p. 3686-3692. (Proceedings of the National Conference on Artificial Intelligence).

Yang, Yezhou ; Li, Yi ; Fermüller, Cornelia et al. / Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web. Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015. AI Access Foundation, 2015. pp. 3686-3692 (Proceedings of the National Conference on Artificial Intelligence).

@inproceedings{d007ac990db74a7a9216ea26443f04da,
title = "Robot learning manipulation action plans by {"}watching{"} unconstrained videos from the World Wide Web",
abstract = "In order to advance action generation and creation in robots beyond simple learned schemas we need computational tools that allow us to automatically interpret and represent human actions. This paper presents a system that learns manipulation action plans by processing unconstrained videos from the World Wide Web. Its goal is to robustly generate the sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots. The lower level of the system consists of two convolutional neural network (CNN) based recognition modules, one for classifying the hand grasp type and the other for object recognition. The higher level is a probabilistic manipulation action grammar based parsing module that aims at generating visual sentences for robot manipulation. Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by {"}watching{"} unconstrained videos with high accuracy.",
author = "Yezhou Yang and Yi Li and Cornelia Ferm{\"u}ller and Yiannis Aloimonos",
note = "Publisher Copyright: {\textcopyright} Copyright 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 ; Conference date: 25-01-2015 Through 30-01-2015",
year = "2015",
month = jun,
day = "1",
language = "English (US)",
series = "Proceedings of the National Conference on Artificial Intelligence",
publisher = "AI Access Foundation",
pages = "3686--3692",
booktitle = "Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015",
}

TY  - GEN
T1  - Robot learning manipulation action plans by "watching" unconstrained videos from the World Wide Web
AU  - Yang, Yezhou
AU  - Li, Yi
AU  - Fermüller, Cornelia
AU  - Aloimonos, Yiannis
PY  - 2015/6/1
Y1  - 2015/6/1
N2  - In order to advance action generation and creation in robots beyond simple learned schemas we need computational tools that allow us to automatically interpret and represent human actions. This paper presents a system that learns manipulation action plans by processing unconstrained videos from the World Wide Web. Its goal is to robustly generate the sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots. The lower level of the system consists of two convolutional neural network (CNN) based recognition modules, one for classifying the hand grasp type and the other for object recognition. The higher level is a probabilistic manipulation action grammar based parsing module that aims at generating visual sentences for robot manipulation. Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by "watching" unconstrained videos with high accuracy.
AB  - In order to advance action generation and creation in robots beyond simple learned schemas we need computational tools that allow us to automatically interpret and represent human actions. This paper presents a system that learns manipulation action plans by processing unconstrained videos from the World Wide Web. Its goal is to robustly generate the sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots. The lower level of the system consists of two convolutional neural network (CNN) based recognition modules, one for classifying the hand grasp type and the other for object recognition. The higher level is a probabilistic manipulation action grammar based parsing module that aims at generating visual sentences for robot manipulation. Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by "watching" unconstrained videos with high accuracy.
UR  - http://www.scopus.com/inward/record.url?scp=84961223498&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84961223498&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84961223498
T3  - Proceedings of the National Conference on Artificial Intelligence
SP  - 3686
EP  - 3692
BT  - Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
PB  - AI Access Foundation
T2  - 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Y2  - 25 January 2015 through 30 January 2015
ER  - 
