Constrained online convex optimization with feedback delays

Xuanyu Cao, Junshan Zhang, H. Vincent Poor

Research output: Contribution to journalArticlepeer-review

14 Scopus citations

Abstract

In this article, we study constrained online convex optimization (OCO) in the presence of feedback delays, where a decision maker chooses sequential actions without knowing the loss functions and constraint functions a priori. The loss/constraint functions vary with time and their feedback information is revealed to the decision maker with delays, which arise in many applications. We first consider the scenario of delayed function feedback, in which the complete information of the loss/constraint functions is revealed to the decision maker with delays. We develop a modified online saddle point algorithm suitable for constrained OCO with feedback delays. Sublinear regret and sublinear constraint violation bounds are established for the algorithm in terms of the delays. In practice, the complete information (functional forms) of the loss/constraint functions may not be revealed to the decision maker. Thus, we further examine the scenario of delayed bandit feedback, where only the values of the loss/constraint functions at two random points close to the chosen action are revealed to the decision maker with delays. A delayed version of the bandit online saddle point algorithm is proposed by utilizing stochastic gradient estimates of the loss/constraint functions based on delayed bandit feedback. We also establish sublinear regret and sublinear constraint violation bounds for this bandit optimization algorithm in terms of the delays. Finally, numerical results for online quadratically constrained quadratic programs are presented to corroborate the efficacy of the proposed algorithms.

Original languageEnglish (US)
Pages (from-to)5049-5064
Number of pages16
JournalIEEE Transactions on Automatic Control
Volume66
Issue number11
DOIs
StatePublished - Nov 1 2021

Keywords

  • Bandit feedback
  • constrained optimization
  • feedback delay
  • function feedback
  • online convex optimization (OCO)

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Constrained online convex optimization with feedback delays'. Together they form a unique fingerprint.

Cite this