TY - GEN
T1 - Robots with language
T2 - 2013 IEEE International Conference on Robotics and Automation, ICRA 2013
AU - Yang, Yezhou
AU - Teo, Ching L.
AU - Fermüller, Cornelia
AU - Aloimonos, Yiannis
PY - 2013
Y1 - 2013
AB - There has been a recent interest in utilizing contextual knowledge to improve multi-label visual recognition for intelligent agents like robots. Natural Language Processing (NLP) can give us labels, the correlation of labels, and the ontological knowledge about them, so we can automate the acquisition of contextual knowledge. In this paper we show how to use tools from NLP in conjunction with Vision to improve visual recognition. There are two major approaches: First, different language databases organize words according to various semantic concepts. Using these, we can build special purpose databases that can predict the labels involved given a certain context. Here we build a knowledge base for the purpose of describing common daily activities. Second, statistical language tools can provide the correlations of different labels. We show a way to learn a language model from large corpus data that exploits these correlations and propose a general optimization scheme to integrate the language model into the system. Experiments conducted on three multi-label everyday recognition tasks support the effectiveness and efficiency of our approach, with significant gains in recognition accuracies when correlation information is used.
UR - http://www.scopus.com/inward/record.url?scp=84887289565&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84887289565&partnerID=8YFLogxK
U2 - 10.1109/ICRA.2013.6631179
DO - 10.1109/ICRA.2013.6631179
M3 - Conference contribution
AN - SCOPUS:84887289565
SN - 9781467356411
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 4256
EP - 4262
BT - 2013 IEEE International Conference on Robotics and Automation, ICRA 2013
Y2 - 6 May 2013 through 10 May 2013
ER -