CS-VQA: Visual Question Answering with Compressively Sensed Images

Li Chi Huang; Kuldeep Kulkarni; Anik Jha; Suhas Lohit; Suren Jayasuriya; Pavan Turaga

doi:10.1109/ICIP.2018.8451445

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition. In this paper, we explore whether VQA is solvable when images are captured in a sub-Nyquist compressive paradigm. We develop a series of deep-network architectures that exploit available compressive data to increasing degrees of accuracy, and show that VQA is indeed solvable in the compressed domain. Our results show that there is nominal degradation in VQA performance when using compressive measurements, but that accuracy can be recovered when VQA pipelines are used in conjunction with state-of-the-art deep neural networks for CS reconstruction. The results presented yield important implications for resource-constrained VQA applications.

Original language	English (US)
Title of host publication	2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings
Publisher	IEEE Computer Society
Pages	1283-1287
Number of pages	5
ISBN (Electronic)	9781479970612
DOIs	https://doi.org/10.1109/ICIP.2018.8451445
State	Published - Aug 29 2018
Event	25th IEEE International Conference on Image Processing, ICIP 2018 - Athens, Greece; Duration: Oct 7 2018 → Oct 10 2018

Publication series

Name	Proceedings - International Conference on Image Processing, ICIP
ISSN (Print)	1522-4880

Conference

Conference	25th IEEE International Conference on Image Processing, ICIP 2018
Country/Territory	Greece
City	Athens
Period	Oct 7 2018 → Oct 10 2018

Keywords

Compressed sensing
Computer vision
Image reconstruction
Multi-layer neural network

ASJC Scopus subject areas

Software
Computer Vision and Pattern Recognition
Signal Processing

Cite this

Huang, L. C., Kulkarni, K., Jha, A., Lohit, S., Jayasuriya, S., & Turaga, P. (2018). CS-VQA: Visual Question Answering with Compressively Sensed Images. In 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings (pp. 1283-1287). Article 8451445 (Proceedings - International Conference on Image Processing, ICIP). IEEE Computer Society. https://doi.org/10.1109/ICIP.2018.8451445

CS-VQA: Visual Question Answering with Compressively Sensed Images. / Huang, Li Chi; Kulkarni, Kuldeep; Jha, Anik et al.
2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings. IEEE Computer Society, 2018. p. 1283-1287 8451445 (Proceedings - International Conference on Image Processing, ICIP).

Huang, LC, Kulkarni, K, Jha, A, Lohit, S, Jayasuriya, S & Turaga, P 2018, CS-VQA: Visual Question Answering with Compressively Sensed Images. in 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings., 8451445, Proceedings - International Conference on Image Processing, ICIP, IEEE Computer Society, pp. 1283-1287, 25th IEEE International Conference on Image Processing, ICIP 2018, Athens, Greece, 10/7/18. https://doi.org/10.1109/ICIP.2018.8451445

@inproceedings{8ac6075ddec74e4eb4b2cd2a2a0dd1ed,
  title = "CS-VQA: Visual Question Answering with Compressively Sensed Images",
  abstract = "Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition. In this paper, we explore whether VQA is solvable when images are captured in a sub-Nyquist compressive paradigm. We develop a series of deep-network architectures that exploit available compressive data to increasing degrees of accuracy, and show that VQA is indeed solvable in the compressed domain. Our results show that there is nominal degradation in VQA performance when using compressive measurements, but that accuracy can be recovered when VQA pipelines are used in conjunction with state-of-the-art deep neural networks for CS reconstruction. The results presented yield important implications for resource-constrained VQA applications.",
  keywords = "Compressed sensing, Computer vision, Image reconstruction, Multi-layer neural network",
  author = "Huang, {Li Chi} and Kuldeep Kulkarni and Anik Jha and Suhas Lohit and Suren Jayasuriya and Pavan Turaga",
  note = "Publisher Copyright: {\textcopyright} 2018 IEEE.; 25th IEEE International Conference on Image Processing, ICIP 2018 ; Conference date: 07-10-2018 Through 10-10-2018",
  year = "2018",
  month = aug,
  day = "29",
  doi = "10.1109/ICIP.2018.8451445",
  language = "English (US)",
  series = "Proceedings - International Conference on Image Processing, ICIP",
  publisher = "IEEE Computer Society",
  pages = "1283--1287",
  booktitle = "2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings",
}

TY - GEN

T1 - CS-VQA

T2 - 25th IEEE International Conference on Image Processing, ICIP 2018

AU - Huang, Li Chi

AU - Kulkarni, Kuldeep

AU - Jha, Anik

AU - Lohit, Suhas

AU - Jayasuriya, Suren

AU - Turaga, Pavan

PY - 2018/8/29

Y1 - 2018/8/29

N2 - Visual Question Answering (VQA) is a complex semantic task requiring both natural language processing and visual recognition. In this paper, we explore whether VQA is solvable when images are captured in a sub-Nyquist compressive paradigm. We develop a series of deep-network architectures that exploit available compressive data to increasing degrees of accuracy, and show that VQA is indeed solvable in the compressed domain. Our results show that there is nominal degradation in VQA performance when using compressive measurements, but that accuracy can be recovered when VQA pipelines are used in conjunction with state-of-the-art deep neural networks for CS reconstruction. The results presented yield important implications for resource-constrained VQA applications.

KW - Compressed sensing

KW - Computer vision

KW - Image reconstruction

KW - Multi-layer neural network

UR - http://www.scopus.com/inward/record.url?scp=85062913838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062913838&partnerID=8YFLogxK

U2 - 10.1109/ICIP.2018.8451445

DO - 10.1109/ICIP.2018.8451445

M3 - Conference contribution

AN - SCOPUS:85062913838

T3 - Proceedings - International Conference on Image Processing, ICIP

SP - 1283

EP - 1287

BT - 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings

PB - IEEE Computer Society

Y2 - 7 October 2018 through 10 October 2018

ER -
