UNIVERSALLY MEASURABLE POLICIES IN DYNAMIC PROGRAMMING.

Steven E. Shreve, Dimitri P. Bertsekas

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Dynamic programming results concerning the existence and characterization of optimal or nearly optimal policies, the convergence of algorithms, and characterizations of the optimal cost function have been available for some time, but rigorous proofs of these results have required quite restrictive hypotheses, such as countability of the state space, in order to circumvent the inherent measurability difficulties. The authors show that the use of universally measurable policies in the Borel space framework resolves the measurability issues, so that all the basic results of dynamic programming can be obtained in the strongest possible form. In particular, ε-optimal policies are shown to exist, the dynamic programming algorithm is defined, and conditions and bounds for its convergence to the optimal cost are given. The optimality equation is shown to hold and is used to characterize the optimal cost function and optimal policies.
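
For orientation only, the objects named in the abstract (the dynamic programming algorithm, its convergence to the optimal cost, and the optimality equation) can be sketched in the simplest possible setting. The sketch below assumes a finite state space, a finite control set, and a discounted cost, which is exactly the kind of restrictive case where no measurability questions arise; it is not the Borel space construction of the paper. All names (`dp_step`, `value_iteration`, `greedy_policy`) and the two-state example data are hypothetical.

```python
# Illustrative finite-state sketch only. The paper treats Borel state and
# control spaces; a finite, discounted model is assumed here for simplicity.

def dp_step(J, states, controls, cost, trans, alpha):
    """Apply the DP operator once:
    (TJ)(x) = min_u [ g(x, u) + alpha * sum_y p(y | x, u) * J(y) ]."""
    return {
        x: min(
            cost[x, u] + alpha * sum(p * J[y] for y, p in trans[x, u].items())
            for u in controls
        )
        for x in states
    }


def value_iteration(states, controls, cost, trans, alpha, eps):
    """Iterate J_{k+1} = T J_k from J_0 = 0 until the sup-norm change is below eps."""
    J = {x: 0.0 for x in states}
    while True:
        J_next = dp_step(J, states, controls, cost, trans, alpha)
        gap = max(abs(J_next[x] - J[x]) for x in states)
        J = J_next
        if gap < eps:
            return J


def greedy_policy(J, states, controls, cost, trans, alpha):
    """Pick, in each state, a control attaining the minimum in the optimality equation."""
    def q(x, u):
        return cost[x, u] + alpha * sum(p * J[y] for y, p in trans[x, u].items())
    return {x: min(controls, key=lambda u: q(x, u)) for x in states}


# Hypothetical two-state example: state 1 is free to stay in, state 0 is not.
states = [0, 1]
controls = ["stay", "move"]
cost = {(0, "stay"): 1.0, (0, "move"): 1.5, (1, "stay"): 0.0, (1, "move"): 2.0}
trans = {
    (0, "stay"): {0: 1.0}, (0, "move"): {1: 1.0},
    (1, "stay"): {1: 1.0}, (1, "move"): {0: 1.0},
}
alpha = 0.5

J = value_iteration(states, controls, cost, trans, alpha, eps=1e-9)
policy = greedy_policy(J, states, controls, cost, trans, alpha)
```

In this toy model the limit `J` is a fixed point of `dp_step`, which is the finite-state form of the optimality equation, and `policy` selects a minimizing control in each state. In the general Borel setting such a minimizing selection need not be Borel measurable, which is the difficulty the abstract says universally measurable policies resolve.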

Original language: English (US)
Pages (from-to): 15-30
Number of pages: 16
Journal: Mathematics of Operations Research
Volume: 4
Issue number: 1
State: Published - 1979
Externally published: Yes

ASJC Scopus subject areas

  • General Mathematics
  • Computer Science Applications
  • Management Science and Operations Research
