Distributed asynchronous policy iteration in dynamic programming

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

20 Scopus citations

Abstract

We consider the distributed solution of dynamic programming (DP) problems by policy iteration. We envision a network of processors, each updating asynchronously a local policy and a local cost function, defined on a portion of the state space. The computed values are communicated asynchronously between processors and are used to perform the local policy and cost updates. The natural algorithm of this type can fail even under favorable circumstances, as shown by Williams and Baird [WiB93]. We propose an alternative and almost as simple algorithm, which converges to the optimum under the most general conditions, including asynchronous updating by multiple processors using outdated local cost functions of other processors.
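The setup described in the abstract can be illustrated with a minimal sketch. It simulates the "natural" scheme (local policy improvement and evaluation on each processor's block of states, using possibly outdated costs from the other processor), not the alternative algorithm the paper proposes. The toy MDP, the discount factor, the two-processor partition, the refresh schedule, and all names (`run`, `q_value`, `BLOCKS`, and so on) are assumptions made for illustration. The starting costs are deliberately pessimistic, which keeps this toy run well-behaved; as the abstract notes, the natural scheme can fail in general.

```python
# Toy simulation of the "natural" distributed asynchronous policy iteration.
# Everything here (MDP data, names, schedule) is hypothetical and illustrative.

ALPHA = 0.9  # discount factor (assumed)

# Deterministic toy MDP with 4 states and 2 actions per state.
# NEXT[i][u] is the successor of state i under action u; COST[i][u] is the stage cost.
NEXT = [[1, 2], [0, 3], [3, 0], [3, 2]]
COST = [[2.0, 1.0], [1.0, 4.0], [0.5, 3.0], [0.0, 1.0]]

# Partition of the state space: processor 0 owns states 0-1, processor 1 owns 2-3.
BLOCKS = [[0, 1], [2, 3]]


def q_value(i, u, view):
    """Cost of taking action u in state i, then following the costs in `view`."""
    return COST[i][u] + ALPHA * view[NEXT[i][u]]


def run(rounds=400, refresh_every=2):
    n = len(NEXT)
    J = [100.0] * n   # pessimistic initial costs (assumption of this sketch)
    mu = [0] * n      # initial policy: action 0 everywhere
    # Each processor keeps its own, possibly outdated, copy of all costs.
    views = [list(J) for _ in BLOCKS]

    for t in range(rounds):
        for p, block in enumerate(BLOCKS):
            view = views[p]
            # Communication happens only occasionally, so costs of states
            # owned by the other processor may be stale in between.
            if t % refresh_every == 0:
                for j in range(n):
                    if j not in block:
                        view[j] = J[j]
            for i in block:
                # Local policy improvement at state i.
                mu[i] = min(range(len(COST[i])), key=lambda u: q_value(i, u, view))
                # Local policy evaluation (one backup under the current policy).
                J[i] = q_value(i, mu[i], view)
                view[i] = J[i]
    return J, mu


if __name__ == "__main__":
    costs, policy = run()
    print(costs, policy)
```

In this toy instance the costs settle near [1.45, 2.305, 0.5, 0.0] with policy [1, 0, 0, 0], which can be checked by hand from the Bellman equation. Increasing `refresh_every` makes the exchanged costs more outdated and slows the run down.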

Original language: English (US)
Title of host publication: 2010 48th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2010
Pages: 1368-1375
Number of pages: 8
State: Published - 2010
Externally published: Yes
Event: 48th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2010 - Monticello, IL, United States
Duration: Sep 29 2010 to Oct 1 2010

Publication series

Name: 2010 48th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2010

Other

Other: 48th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2010
Country: United States
City: Monticello, IL
Period: 9/29/10 to 10/1/10

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Control and Systems Engineering
