TY - GEN
T1 - A corpus-guided framework for robotic visual perception
AU - Teo, Ching L.
AU - Yang, Yezhou
AU - Daumé, Hal
AU - Fermüller, Cornelia
AU - Aloimonos, Yiannis
PY - 2011
Y1 - 2011
N2 - We present a framework that produces sentence-level summarizations of videos containing complex human activities and that can be implemented as part of the Robot Perception Control Unit (RPCU). The framework has three stages: 1) detecting pertinent objects in the scene (tools and direct objects), 2) predicting actions, guided by a large lexical corpus, and 3) generating the most likely sentence description of the video given the detections. We pursue an active object detection approach by focusing on regions of high optical flow. Next, an iterative EM strategy, guided by language, is used to predict the possible actions. Finally, we model the sentence generation process as an HMM optimization problem, combining visual detections and a trained language model to produce a readable description of the video. Experimental results validate our approach, and we discuss its implications for the RPCU in future applications.
UR - http://www.scopus.com/inward/record.url?scp=80055059199&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80055059199&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:80055059199
SN - 9781577355304
T3 - AAAI Workshop - Technical Report
SP - 36
EP - 42
BT - Language-Action Tools for Cognitive Artificial Agents
T2 - 2011 AAAI Workshop
Y2 - 7 August 2011 through 8 August 2011
ER -