TY - GEN
T1 - Designing Counterfactual Generators using Deep Model Inversion
AU - Thiagarajan, Jayaraman J.
AU - Narayanaswamy, Vivek
AU - Rajan, Deepta
AU - Liang, Jason
AU - Chaudhari, Akshay
AU - Spanias, Andreas
N1 - Funding Information:
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Publisher Copyright:
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Explanation techniques that synthesize small, interpretable changes to a given image while producing desired changes in the model prediction have become popular for introspecting black-box models. Commonly referred to as counterfactuals, the synthesized explanations are required to contain discernible changes (for easy interpretability) while also being realistic (consistency with the data manifold). In this paper, we focus on the case where we have access only to the trained deep classifier and not the actual training data. While the problem of inverting deep models to synthesize images from the training distribution has been explored, our goal is to develop a deep inversion approach to generate counterfactual explanations for a given query image. Despite their effectiveness in conditional image synthesis, we show that existing deep inversion methods are insufficient for producing meaningful counterfactuals. We propose DISC (Deep Inversion for Synthesizing Counterfactuals), which improves upon deep inversion by (a) utilizing stronger image priors, (b) incorporating a novel manifold consistency objective, and (c) adopting a progressive optimization strategy. We find that, in addition to producing visually meaningful explanations, the counterfactuals from DISC are effective at learning classifier decision boundaries and are robust to unknown test-time corruptions.
AB - Explanation techniques that synthesize small, interpretable changes to a given image while producing desired changes in the model prediction have become popular for introspecting black-box models. Commonly referred to as counterfactuals, the synthesized explanations are required to contain discernible changes (for easy interpretability) while also being realistic (consistency with the data manifold). In this paper, we focus on the case where we have access only to the trained deep classifier and not the actual training data. While the problem of inverting deep models to synthesize images from the training distribution has been explored, our goal is to develop a deep inversion approach to generate counterfactual explanations for a given query image. Despite their effectiveness in conditional image synthesis, we show that existing deep inversion methods are insufficient for producing meaningful counterfactuals. We propose DISC (Deep Inversion for Synthesizing Counterfactuals), which improves upon deep inversion by (a) utilizing stronger image priors, (b) incorporating a novel manifold consistency objective, and (c) adopting a progressive optimization strategy. We find that, in addition to producing visually meaningful explanations, the counterfactuals from DISC are effective at learning classifier decision boundaries and are robust to unknown test-time corruptions.
UR - http://www.scopus.com/inward/record.url?scp=85128087105&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128087105&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85128087105
T3 - Advances in Neural Information Processing Systems
SP - 16873
EP - 16884
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural Information Processing Systems Foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -