Robots with language: Multi-label visual recognition using NLP

Yezhou Yang, Ching L. Teo, Cornelia Fermuller, Yiannis Aloimonos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

There has been a recent interest in utilizing contextual knowledge to improve multi-label visual recognition for intelligent agents like robots. Natural Language Processing (NLP) can give us labels, the correlation of labels, and the ontological knowledge about them, so we can automate the acquisition of contextual knowledge. In this paper we show how to use tools from NLP in conjunction with Vision to improve visual recognition. There are two major approaches: First, different language databases organize words according to various semantic concepts. Using these, we can build special purpose databases that can predict the labels involved given a certain context. Here we build a knowledge base for the purpose of describing common daily activities. Second, statistical language tools can provide the correlations of different labels. We show a way to learn a language model from large corpus data that exploits these correlations and propose a general optimization scheme to integrate the language model into the system. Experiments conducted on three multi-label everyday recognition tasks support the effectiveness and efficiency of our approach, with significant gains in recognition accuracies when correlation information is used.
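The abstract does not specify the language model or the optimization scheme, so the following is only an illustrative sketch of the general idea: label correlations estimated from text can re-rank the output of a visual classifier. Everything here is an assumption made for illustration, including the function names `cooccurrence_model` and `best_label_set`, the toy corpus, the visual scores, and the choice of a smoothed sentence-level co-occurrence model with exhaustive search. It is not the authors' method.

```python
import itertools
import math
from collections import Counter


def cooccurrence_model(sentences, labels, smoothing=1.0):
    """Estimate a smoothed probability for each label pair co-occurring in a sentence."""
    label_set = set(labels)
    pair_counts = Counter()
    for sentence in sentences:
        # Labels mentioned in this sentence, sorted so pairs have a canonical order.
        present = sorted(label_set & set(sentence.lower().split()))
        pair_counts.update(itertools.combinations(present, 2))
    n_pairs = len(labels) * (len(labels) - 1) // 2
    total = sum(pair_counts.values()) + smoothing * n_pairs
    return {
        pair: (pair_counts[pair] + smoothing) / total
        for pair in itertools.combinations(sorted(labels), 2)
    }


def best_label_set(visual_scores, pair_prob, k=2, weight=1.0):
    """Exhaustively pick the k labels maximizing visual plus weighted language log-score."""
    def score(subset):
        visual = sum(math.log(visual_scores[label]) for label in subset)
        language = sum(
            math.log(pair_prob[pair]) for pair in itertools.combinations(subset, 2)
        )
        return visual + weight * language

    # Subsets are drawn from sorted labels, so their pairs match the model's keys.
    return max(itertools.combinations(sorted(visual_scores), k), key=score)


# Hypothetical toy data: a tiny "corpus" and made-up classifier confidences.
labels = ["knife", "bread", "hammer", "nail"]
corpus = [
    "cut the bread with a knife",
    "the knife slices bread",
    "hit the nail with a hammer",
    "a knife and bread on the table",
]
visual_scores = {"knife": 0.6, "bread": 0.3, "hammer": 0.35, "nail": 0.1}

model = cooccurrence_model(corpus, labels)
print(best_label_set(visual_scores, model, k=2, weight=0.0))  # vision only
print(best_label_set(visual_scores, model, k=2, weight=1.0))  # vision + language
```

With `weight=0.0` the selection relies on the visual scores alone and returns `('hammer', 'knife')`. With `weight=1.0` the co-occurrence term favors `('bread', 'knife')`, because those two labels appear together in the toy corpus. Exhaustive search is only practical for small label sets; a larger vocabulary would need an approximate optimizer.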

Original language: English (US)
Title of host publication: 2013 IEEE International Conference on Robotics and Automation, ICRA 2013
Pages: 4256-4262
Number of pages: 7
DOIs: https://doi.org/10.1109/ICRA.2013.6631179
State: Published - Nov 14 2013
Externally published: Yes
Event: 2013 IEEE International Conference on Robotics and Automation, ICRA 2013 - Karlsruhe, Germany
Duration: May 6 2013 to May 10 2013

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Other

Other: 2013 IEEE International Conference on Robotics and Automation, ICRA 2013
Country: Germany
City: Karlsruhe
Period: 5/6/13 to 5/10/13

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering

Cite this

Yang, Y., Teo, C. L., Fermuller, C., & Aloimonos, Y. (2013). Robots with language: Multi-label visual recognition using NLP. In 2013 IEEE International Conference on Robotics and Automation, ICRA 2013 (pp. 4256-4262). [6631179] (Proceedings - IEEE International Conference on Robotics and Automation). https://doi.org/10.1109/ICRA.2013.6631179