'Just because you are right, doesn't mean I am wrong': Overcoming a bottleneck in the development and evaluation of open-ended visual question answering (VQA) tasks

Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

GQA (Hudson and Manning, 2019) is a dataset for real-world visual reasoning and compositional question answering. We found that many answers predicted by the best vision-language models on the GQA dataset do not match the ground-truth answer but are still semantically meaningful and correct in the given context. In fact, this is the case with most existing visual question answering (VQA) datasets, which assume only one ground-truth answer for each question. To address this limitation, we propose Alternative Answer Sets (AAS) of ground-truth answers, which are created automatically using off-the-shelf NLP tools. We introduce a semantic metric based on AAS and modify top VQA solvers to support multiple plausible answers for a question. We implement this approach on the GQA dataset and show the performance improvements.
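
The difference between exact-match scoring and scoring against an alternative answer set can be sketched as follows. This is only an illustration of the general idea, not the authors' implementation: the `AAS` dictionary is hand-written and hypothetical (the abstract says the real sets are built automatically with NLP tools), and the function names `exact_match`, `aas_match`, and `accuracy` are assumptions introduced here.

```python
from typing import Callable, Dict, List, Set

# Hypothetical alternative answer sets, written by hand for illustration only.
# Each ground-truth answer maps to the answers treated as equally acceptable.
AAS: Dict[str, Set[str]] = {
    "couch": {"couch", "sofa"},
    "man": {"man", "person", "guy"},
}


def normalize(answer: str) -> str:
    """Lowercase and trim an answer so comparisons ignore case and padding."""
    return answer.strip().lower()


def exact_match(prediction: str, ground_truth: str) -> bool:
    """Standard scoring: correct only if the prediction equals the single ground truth."""
    return normalize(prediction) == normalize(ground_truth)


def aas_match(prediction: str, ground_truth: str) -> bool:
    """AAS-style scoring: correct if the prediction is in the ground truth's alternative set."""
    gt = normalize(ground_truth)
    # The ground truth itself is always acceptable, even if it has no AAS entry.
    alternatives = AAS.get(gt, set()) | {gt}
    return normalize(prediction) in alternatives


def accuracy(
    predictions: List[str],
    ground_truths: List[str],
    match: Callable[[str, str], bool],
) -> float:
    """Fraction of questions for which `match` accepts the prediction."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not predictions:
        return 0.0
    hits = sum(match(p, g) for p, g in zip(predictions, ground_truths))
    return hits / len(predictions)


# Toy example: "sofa" is rejected by exact match but accepted under the AAS.
predictions = ["sofa", "man", "dog"]
ground_truths = ["couch", "man", "cat"]
exact_acc = accuracy(predictions, ground_truths, exact_match)
aas_acc = accuracy(predictions, ground_truths, aas_match)
```

In this toy data, the prediction "sofa" for the ground truth "couch" is counted as wrong under exact match and as correct under the alternative-set check, which is the kind of mismatch the abstract describes.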

Original language: English (US)
Title of host publication: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 2766-2771
Number of pages: 6
ISBN (Electronic): 9781954085022
State: Published - 2021
Event: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online
Duration: Apr 19 2021 to Apr 23 2021

Publication series

Name: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021
City: Virtual, Online
Period: 4/19/21 to 4/23/21

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics
  • Linguistics and Language
