A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization

Tyler Sypherd, Mario Diaz, John Kevin Cava, Gautam Dasarathy, Peter Kairouz, Lalitha Sankar

Research output: Contribution to journal › Article › peer-review

Abstract

We introduce a tunable loss function called α-loss, parameterized by α ∈ (0,∞], which interpolates between the exponential loss (α = 1/2), the log-loss (α = 1), and the 0-1 loss (α = ∞), for the machine learning setting of classification. Theoretically, we illustrate a fundamental connection between α-loss and Arimoto conditional entropy, verify the classification-calibration of α-loss in order to demonstrate asymptotic optimality via Rademacher complexity generalization techniques, and build upon a notion called strictly local quasi-convexity in order to quantitatively characterize the optimization landscape of α-loss. Practically, we perform class imbalance, robustness, and classification experiments on benchmark image datasets using convolutional neural networks. Our main practical conclusion is that certain tasks may benefit from tuning α-loss away from log-loss (α = 1), and to this end we provide simple heuristics for the practitioner. In particular, navigating the α hyperparameter can readily provide superior model robustness to label flips (α > 1) and sensitivity to imbalanced classes (α < 1).
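
The interpolation described in the abstract can be made concrete with a small sketch. The abstract itself does not state the formula, so the closed form used here is an assumption drawn from the usual definition of α-loss: for α ∈ (0,1) ∪ (1,∞) the loss of assigning probability p to the true label is α/(α−1) · (1 − p^((α−1)/α)), with −log p at α = 1 and 1 − p at α = ∞. The function names `alpha_loss`, `sigmoid`, and `margin_alpha_loss` are illustrative and are not the authors' code.

```python
import math


def alpha_loss(p_true, alpha):
    """Assumed alpha-loss of assigning probability p_true to the true label.

    alpha = 1 gives log-loss and alpha = inf gives 1 - p_true (a soft 0-1 loss).
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha == 1.0:
        return -math.log(p_true)
    if math.isinf(alpha):
        return 1.0 - p_true
    exponent = (alpha - 1.0) / alpha
    return alpha / (alpha - 1.0) * (1.0 - p_true ** exponent)


def sigmoid(z):
    """Logistic function mapping a real-valued margin to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def margin_alpha_loss(margin, alpha):
    """Alpha-loss as a function of the margin y * f(x), using a sigmoid link."""
    return alpha_loss(sigmoid(margin), alpha)


if __name__ == "__main__":
    z = 0.7
    # alpha = 1/2 should coincide with the exponential loss exp(-z).
    print(margin_alpha_loss(z, 0.5), math.exp(-z))
    # alpha = 1 should coincide with the logistic loss log(1 + exp(-z)).
    print(margin_alpha_loss(z, 1.0), math.log1p(math.exp(-z)))
    # alpha = inf gives the sigmoid loss 1 - sigmoid(z).
    print(margin_alpha_loss(z, math.inf), 1.0 - sigmoid(z))
```

Under this assumed form, the loss stays below α/(α−1) as p approaches 0 when α > 1, so confidently misclassified (possibly mislabeled) points contribute a bounded penalty, whereas for α < 1 the penalty grows faster than log-loss. This is consistent with the abstract's remarks on label flips and imbalanced classes, though the sketch is not a substitute for the paper's analysis.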

Original language: English (US)
Journal: IEEE Transactions on Information Theory
State: Accepted/In press - 2022

Keywords

  • α-loss
  • Arimoto conditional entropy
  • Classification algorithms
  • classification-calibration
  • Entropy
  • generalization
  • Logistics
  • Noise measurement
  • Optimization
  • Privacy
  • robustness
  • strictly local quasi-convexity

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
