Continual Learning of Generative Models With Limited Data: From Wasserstein-1 Barycenter to Adaptive Coalescence

Mehmet Dedeoglu, Sen Lin, Zhaofeng Zhang, Junshan Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

Learning generative models is challenging for a network edge node with limited data and computing power. Since tasks in similar environments share model similarity, it is plausible to leverage pretrained generative models from other edge nodes. Appealing to optimal transport theory tailored toward Wasserstein-1 generative adversarial networks (WGANs), this study aims to develop a framework that systematically optimizes continual learning of generative models using local data at the edge node while exploiting adaptive coalescence of pretrained generative models. Specifically, by treating the knowledge transfer from other nodes as Wasserstein balls centered around their pretrained models, continual learning of generative models is cast as a constrained optimization problem, which is further reduced to a Wasserstein-1 barycenter problem. A two-stage approach is devised accordingly: 1) the barycenters among the pretrained models are computed offline, where displacement interpolation is used as the theoretical foundation for finding adaptive barycenters via a “recursive” WGAN configuration; and 2) the barycenter computed offline is used as metamodel initialization for continual learning, after which fast adaptation is carried out to find the generative model using the local samples at the target edge node. Finally, a weight ternarization method, based on joint optimization of weights and threshold for quantization, is developed to compress the generative model further. Extensive experimental studies corroborate the effectiveness of the proposed framework.
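
To make the weight ternarization step more concrete, the following is a minimal sketch of generic threshold-based ternary quantization. It is not the method developed in the article, whose details are not given in this abstract. It assumes a simple squared-error objective, a grid search over the threshold, and a closed-form scale for each threshold; the function name `ternarize` and the parameter `num_candidates` are hypothetical.

```python
import numpy as np


def ternarize(weights, num_candidates=50):
    """Quantize weights to {-alpha, 0, +alpha} (illustrative sketch only).

    For each candidate threshold delta, weights with |w| <= delta are set
    to zero and the rest to alpha * sign(w), where alpha is the mean
    magnitude of the surviving weights (the least-squares optimal scale
    for that mask). The (delta, alpha) pair with the smallest squared
    reconstruction error is returned.
    """
    w = np.asarray(weights, dtype=np.float64)
    abs_w = np.abs(w)
    max_abs = abs_w.max()
    if max_abs == 0.0:
        # All-zero input: nothing to quantize.
        return np.zeros_like(w), 0.0, 0.0

    best = None
    # Candidate thresholds in [0, max|w|); excluding the maximum guarantees
    # that at least one weight survives for every candidate.
    for delta in np.linspace(0.0, max_abs, num_candidates, endpoint=False):
        mask = abs_w > delta
        alpha = abs_w[mask].mean()
        quantized = alpha * np.sign(w) * mask
        error = np.sum((w - quantized) ** 2)
        if best is None or error < best[0]:
            best = (error, delta, alpha, quantized)

    _, delta, alpha, quantized = best
    return quantized, float(delta), float(alpha)


if __name__ == "__main__":
    example = np.array([0.9, -1.1, 0.05, -0.02, 1.0])
    q, delta, alpha = ternarize(example)
    print("quantized:", q)
    print("threshold:", delta, "scale:", alpha)
```

In this sketch the threshold and the scale are chosen together, which loosely mirrors the idea of optimizing the quantization threshold jointly with the weights; a real generative model would apply such a step per layer and typically fine-tune afterward.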

Original language: English (US)
Pages (from-to): 1-15
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
State: Accepted/In press - 2023
Externally published: Yes

Keywords

  • Adaptation models
  • Computational modeling
  • Continual learning
  • Data models
  • Generative adversarial networks (GANs)
  • Optimal transport theory
  • Optimization
  • Servers
  • Solid modeling
  • Task analysis
  • Wasserstein barycenters

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence
