CS-VQA

Visual Question Answering with Compressively Sensed Images

Li Chi Huang, Kuldeep Kulkarni, Anik Jha, Suhas Lohit, Suren Jayasuriya, Pavan Turaga

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition. In this paper, we explore whether VQA is solvable when images are captured in a sub-Nyquist compressive paradigm. We develop a series of deep-network architectures that exploit available compressive data to increasing degrees of accuracy, and show that VQA is indeed solvable in the compressed domain. Our results show that there is nominal degradation in VQA performance when using compressive measurements, but that accuracy can be recovered when VQA pipelines are used in conjunction with state-of-the-art deep neural networks for CS reconstruction. The results presented yield important implications for resource-constrained VQA applications.
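
As an illustration of the compressive acquisition the abstract refers to, the following is a minimal sketch (not the authors' implementation) of taking sub-Nyquist linear measurements y = Φx of an image block. The Gaussian sensing matrix, the 32x32 block size, the 0.25 measurement rate, and the function names are all assumptions chosen for illustration; this record does not specify them.

```python
import numpy as np


def make_measurement_matrix(n, rate, seed=0):
    """Build a random Gaussian sensing matrix (an assumed, hypothetical choice).

    n    : number of pixels in the vectorized image block
    rate : measurement rate m / n, with 0 < rate <= 1
    """
    m = max(1, int(round(rate * n)))  # number of compressive measurements
    rng = np.random.default_rng(seed)
    # Scale entries so each row has roughly unit Euclidean norm.
    return rng.standard_normal((m, n)) / np.sqrt(n)


def compressive_measure(block, phi):
    """Return y = phi @ x, where x is the vectorized image block."""
    x = np.asarray(block, dtype=np.float64).reshape(-1)
    return phi @ x


if __name__ == "__main__":
    block = np.random.default_rng(1).random((32, 32))  # stand-in image block
    phi = make_measurement_matrix(block.size, rate=0.25)
    y = compressive_measure(block, phi)
    print(phi.shape, y.shape)
```

With these assumed settings, each 1024-pixel block is reduced to 256 measurements. A downstream VQA pipeline would then either work on y directly or first pass it through a reconstruction network, which are the two options the abstract contrasts.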

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings
Publisher: IEEE Computer Society
Pages: 1283-1287
Number of pages: 5
ISBN (Electronic): 9781479970612
DOI: https://doi.org/10.1109/ICIP.2018.8451445
State: Published - Aug 29 2018
Event: 25th IEEE International Conference on Image Processing, ICIP 2018 - Athens, Greece
Duration: Oct 7 2018 to Oct 10 2018

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 25th IEEE International Conference on Image Processing, ICIP 2018
Country: Greece
City: Athens
Period: 10/7/18 to 10/10/18

Fingerprint

Network architecture
Pipelines
Semantics
Degradation
Processing
Deep neural networks

Keywords

  • Compressed sensing
  • Computer vision
  • Image reconstruction
  • Multi-layer neural network

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

Cite this

Huang, L. C., Kulkarni, K., Jha, A., Lohit, S., Jayasuriya, S., & Turaga, P. (2018). CS-VQA: Visual Question Answering with Compressively Sensed Images. In 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings (pp. 1283-1287). [8451445] (Proceedings - International Conference on Image Processing, ICIP). IEEE Computer Society. https://doi.org/10.1109/ICIP.2018.8451445

@inproceedings{8ac6075ddec74e4eb4b2cd2a2a0dd1ed,
title = "CS-VQA: Visual Question Answering with Compressively Sensed Images",
abstract = "Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition. In this paper, we explore whether VQA is solvable when images are captured in a sub-Nyquist compressive paradigm. We develop a series of deep-network architectures that exploit available compressive data to increasing degrees of accuracy, and show that VQA is indeed solvable in the compressed domain. Our results show that there is nominal degradation in VQA performance when using compressive measurements, but that accuracy can be recovered when VQA pipelines are used in conjunction with state-of-the-art deep neural networks for CS reconstruction. The results presented yield important implications for resource-constrained VQA applications.",
keywords = "Compressed sensing, Computer vision, Image reconstruction, Multi-layer neural network",
author = "Huang, {Li Chi} and Kuldeep Kulkarni and Anik Jha and Suhas Lohit and Suren Jayasuriya and Pavan Turaga",
year = "2018",
month = "8",
day = "29",
doi = "10.1109/ICIP.2018.8451445",
language = "English (US)",
series = "Proceedings - International Conference on Image Processing, ICIP",
publisher = "IEEE Computer Society",
pages = "1283--1287",
booktitle = "2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings",

}

TY - GEN

T1 - CS-VQA

T2 - Visual Question Answering with Compressively Sensed Images

AU - Huang, Li Chi

AU - Kulkarni, Kuldeep

AU - Jha, Anik

AU - Lohit, Suhas

AU - Jayasuriya, Suren

AU - Turaga, Pavan

PY - 2018/8/29

Y1 - 2018/8/29

AB - Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition. In this paper, we explore whether VQA is solvable when images are captured in a sub-Nyquist compressive paradigm. We develop a series of deep-network architectures that exploit available compressive data to increasing degrees of accuracy, and show that VQA is indeed solvable in the compressed domain. Our results show that there is nominal degradation in VQA performance when using compressive measurements, but that accuracy can be recovered when VQA pipelines are used in conjunction with state-of-the-art deep neural networks for CS reconstruction. The results presented yield important implications for resource-constrained VQA applications.

KW - Compressed sensing

KW - Computer vision

KW - Image reconstruction

KW - Multi-layer neural network

UR - http://www.scopus.com/inward/record.url?scp=85062913838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062913838&partnerID=8YFLogxK

U2 - 10.1109/ICIP.2018.8451445

DO - 10.1109/ICIP.2018.8451445

M3 - Conference contribution

T3 - Proceedings - International Conference on Image Processing, ICIP

SP - 1283

EP - 1287

BT - 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings

PB - IEEE Computer Society

ER -