A model-based approach to visual reasoning on CNLVR dataset

Shailaja Sampat, Joohyung Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Visual reasoning requires an understanding of complex compositional images and common-sense reasoning about sets of objects, quantities, comparisons, and spatial relationships. This paper presents a semantic parser that combines Computer Vision (CV), Natural Language Processing (NLP), and Knowledge Representation & Reasoning (KRR) to automatically solve visual reasoning problems from the Cornell Natural Language Visual Reasoning (CNLVR) dataset. Unlike the data-driven approaches applied to the same dataset, our system does not require any training; instead, it is guided by a manually constructed knowledge base. The system demonstrates robust overall performance and is also time- and space-efficient. Our system achieves 87.3% accuracy, which is 17.6% higher than the state-of-the-art method on raw image representations.

Original language: English (US)
Title of host publication: Principles of Knowledge Representation and Reasoning
Subtitle of host publication: Proceedings of the 16th International Conference, KR 2018
Editors: Michael Thielscher, Francesca Toni, Frank Wolter
Publisher: AAAI Press
Pages: 62-66
Number of pages: 5
ISBN (Electronic): 9781577358039
State: Published - 2018
Event: 16th International Conference on the Principles of Knowledge Representation and Reasoning, KR 2018 - Tempe, United States
Duration: Oct 30 2018 – Nov 2 2018

Publication series

Name: Principles of Knowledge Representation and Reasoning: Proceedings of the 16th International Conference, KR 2018

Conference

Conference: 16th International Conference on the Principles of Knowledge Representation and Reasoning, KR 2018
Country/Territory: United States
City: Tempe
Period: 10/30/18 – 11/2/18

ASJC Scopus subject areas

  • Software
  • Logic
