Hypothesis Testing under Mutual Information Privacy Constraints in the High Privacy Regime

Jiachun Liao, Lalitha Sankar, Vincent Y.F. Tan, Flavio du Pin Calmon

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Hypothesis testing is a statistical inference framework for determining the true distribution among a set of possible distributions for a given dataset. Privacy restrictions may require the curator of the data or the respondents themselves to share data with the test only after applying a randomizing privacy mechanism. This work considers mutual information (MI) as the privacy metric for measuring leakage. In addition, motivated by the Chernoff-Stein lemma, the relative entropy between pairs of distributions of the output (generated by the privacy mechanism) is chosen as the utility metric. For these metrics, the goal is to find the optimal privacy-utility trade-off (PUT) and the corresponding optimal privacy mechanism for both binary and m-ary hypothesis testing. Focusing on the high privacy regime, Euclidean information-theoretic approximations of the binary and m-ary PUT problems are developed. The solutions for the approximation problems clarify that an MI-based privacy metric preserves the privacy of the source symbols in inverse proportion to their likelihoods.
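The two metrics named in the abstract can be evaluated numerically for a toy binary test. The sketch below is an illustration only, not the paper's formulation: it assumes a finite alphabet, models the privacy mechanism as a row-stochastic matrix `W`, and takes the leakage to be the larger of the mutual information values under the two hypotheses. The function names, the `perturbed_mechanism` family, and the distributions `p1` and `p2` are all hypothetical.

```python
import math

def output_dist(p, W):
    """Distribution of the output Y when X ~ p and W[x][y] = P(Y=y | X=x)."""
    return [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(len(W[0]))]

def relative_entropy(q1, q2):
    """D(q1 || q2) in nats; assumes q2[y] > 0 wherever q1[y] > 0."""
    return sum(a * math.log(a / b) for a, b in zip(q1, q2) if a > 0)

def mutual_information(p, W):
    """I(X; Y) in nats for a source X ~ p passed through the mechanism W."""
    q = output_dist(p, W)
    return sum(
        p[x] * W[x][y] * math.log(W[x][y] / q[y])
        for x in range(len(p))
        for y in range(len(q))
        if p[x] > 0 and W[x][y] > 0
    )

def perturbed_mechanism(n, delta):
    """Hypothetical mechanism: uniform rows plus a small tilt toward Y = X.

    delta = 0 releases nothing about X; rows stay valid for delta <= (n - 1) / n.
    """
    base = 1.0 / n
    return [
        [base + (delta if x == y else -delta / (n - 1)) for y in range(n)]
        for x in range(n)
    ]

# Assumed source distributions under the two hypotheses.
p1 = [0.7, 0.2, 0.1]
p2 = [0.2, 0.3, 0.5]

for delta in (0.0, 0.05, 0.2):
    W = perturbed_mechanism(3, delta)
    # Privacy metric: MI leakage (worst case over the two hypotheses, by assumption).
    leakage = max(mutual_information(p1, W), mutual_information(p2, W))
    # Utility metric: relative entropy between the two output distributions.
    utility = relative_entropy(output_dist(p1, W), output_dist(p2, W))
    print(f"delta={delta:.2f}  leakage={leakage:.5f}  utility={utility:.5f}")
```

Small values of `delta` correspond to the high privacy regime: both the leakage and the utility shrink toward zero as the mechanism's rows become identical.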

Original language: English (US)
Journal: IEEE Transactions on Information Forensics and Security
DOI: 10.1109/TIFS.2017.2779108
State: Accepted/In press - Nov 29 2017

Fingerprint

Testing
Entropy

Keywords

  • Data privacy
  • Entropy
  • Euclidean information theory
  • Hypothesis testing
  • Measurement
  • Mutual information
  • mutual information
  • Privacy
  • privacy mechanism
  • privacy-guaranteed data publishing
  • Publishing
  • Rényi divergence
  • relative entropy
  • Testing

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications

Cite this

Hypothesis Testing under Mutual Information Privacy Constraints in the High Privacy Regime. / Liao, Jiachun; Sankar, Lalitha; Tan, Vincent Y.F.; Calmon, Flavio du Pin.

In: IEEE Transactions on Information Forensics and Security, 29.11.2017.

@article{320db78c05f04797a00f46d33de4dacf,
title = "Hypothesis Testing under Mutual Information Privacy Constraints in the High Privacy Regime",
abstract = "Hypothesis testing is a statistical inference framework for determining the true distribution among a set of possible distributions for a given dataset. Privacy restrictions may require the curator of the data or the respondents themselves to share data with the test only after applying a randomizing privacy mechanism. This work considers mutual information (MI) as the privacy metric for measuring leakage. In addition, motivated by the Chernoff-Stein lemma, the relative entropy between pairs of distributions of the output (generated by the privacy mechanism) is chosen as the utility metric. For these metrics, the goal is to find the optimal privacy-utility trade-off (PUT) and the corresponding optimal privacy mechanism for both binary and m-ary hypothesis testing. Focusing on the high privacy regime, Euclidean information-theoretic approximations of the binary and m-ary PUT problems are developed. The solutions for the approximation problems clarify that an MI-based privacy metric preserves the privacy of the source symbols in inverse proportion to their likelihoods.",
keywords = "Data privacy, Entropy, Euclidean information theory, Hypothesis testing, Measurement, Mutual information, mutual information, Privacy, privacy mechanism, privacy-guaranteed data publishing, Publishing, Rényi divergence, relative entropy, Testing",
author = "Liao, Jiachun and Sankar, Lalitha and Tan, {Vincent Y.F.} and Calmon, {Flavio du Pin}",
year = "2017",
month = "11",
day = "29",
doi = "10.1109/TIFS.2017.2779108",
language = "English (US)",
journal = "IEEE Transactions on Information Forensics and Security",
issn = "1556-6013",
publisher = "Institute of Electrical and Electronics Engineers Inc."
}

TY - JOUR
T1 - Hypothesis Testing under Mutual Information Privacy Constraints in the High Privacy Regime
AU - Liao, Jiachun
AU - Sankar, Lalitha
AU - Tan, Vincent Y.F.
AU - Calmon, Flavio du Pin
PY - 2017/11/29
Y1 - 2017/11/29
AB - Hypothesis testing is a statistical inference framework for determining the true distribution among a set of possible distributions for a given dataset. Privacy restrictions may require the curator of the data or the respondents themselves to share data with the test only after applying a randomizing privacy mechanism. This work considers mutual information (MI) as the privacy metric for measuring leakage. In addition, motivated by the Chernoff-Stein lemma, the relative entropy between pairs of distributions of the output (generated by the privacy mechanism) is chosen as the utility metric. For these metrics, the goal is to find the optimal privacy-utility trade-off (PUT) and the corresponding optimal privacy mechanism for both binary and m-ary hypothesis testing. Focusing on the high privacy regime, Euclidean information-theoretic approximations of the binary and m-ary PUT problems are developed. The solutions for the approximation problems clarify that an MI-based privacy metric preserves the privacy of the source symbols in inverse proportion to their likelihoods.

KW - Data privacy
KW - Entropy
KW - Euclidean information theory
KW - Hypothesis testing
KW - Measurement
KW - Mutual information
KW - mutual information
KW - Privacy
KW - privacy mechanism
KW - privacy-guaranteed data publishing
KW - Publishing
KW - Rényi divergence
KW - relative entropy
KW - Testing
UR - http://www.scopus.com/inward/record.url?scp=85037672022&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85037672022&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2017.2779108
DO - 10.1109/TIFS.2017.2779108
M3 - Article
AN - SCOPUS:85037672022
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
SN - 1556-6013
ER -