UNIVERSALLY MEASURABLE POLICIES IN DYNAMIC PROGRAMMING.

Steven E. Shreve; Dimitri P. Bertsekas

doi:10.1287/moor.4.1.15

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Dynamic programming results concerning existence and characterizations of optimal or nearly optimal policies, convergence of algorithms, and characterizations of the optimal cost function have been available for some time, but rigorous proofs of these results have required quite restrictive hypotheses, such as countability of the state space, in order to circumvent the inherent measurability difficulties. The authors show that the use of universally measurable policies in the Borel space framework resolves the measurability issues, so that all the basic results of dynamic programming can be obtained in the strongest possible form. In particular, ε-optimal policies are shown to exist, the dynamic programming algorithm is defined, and conditions and bounds for its convergence to the optimal cost are given. The optimality equation is shown to hold and is used to characterize the optimal cost function and optimal policies.

Original language	English (US)
Pages (from-to)	15-30
Number of pages	16
Journal	Mathematics of Operations Research
Volume	4
Issue number	1
DOIs	https://doi.org/10.1287/moor.4.1.15
State	Published - 1979
Externally published	Yes

ASJC Scopus subject areas

General Mathematics
Computer Science Applications
Management Science and Operations Research

Cite this

@article{730556f9ce3f4cb19188a52c93e0ab9a,
title = "UNIVERSALLY MEASURABLE POLICIES IN DYNAMIC PROGRAMMING.",
abstract = "Dynamic programming results concerning existence and characterizations of optimal or nearly optimal policies, convergence of algorithms, and characterizations of the optimal cost function have been available for some time, but rigorous proofs of these results have required quite restrictive hypotheses, such as countability of the state space, in order to circumvent the inherent measurability difficulties. The authors show that the use of universally measurable policies in the Borel space framework resolves the measurability issues, so that all the basic results of dynamic programming can be obtained in the strongest possible form. In particular, epsilon-optimal policies are shown to exist, the dynamic programming algorithm is defined, and conditions and bounds for its convergence to the optimal cost are given. The optimality equation is shown to hold and is used to characterize the optimal cost function and optimal policies.",
author = "Shreve, {Steven E.} and Bertsekas, {Dimitri P.}",
year = "1979",
doi = "10.1287/moor.4.1.15",
language = "English (US)",
volume = "4",
pages = "15--30",
journal = "Mathematics of Operations Research",
issn = "0364-765X",
publisher = "INFORMS Inst. for Operations Res. and the Management Sciences",
number = "1",
}

TY - JOUR
T1 - UNIVERSALLY MEASURABLE POLICIES IN DYNAMIC PROGRAMMING.
AU - Shreve, Steven E.
AU - Bertsekas, Dimitri P.
PY - 1979
Y1 - 1979
N2 - Dynamic programming results concerning existence and characterizations of optimal or nearly optimal policies, convergence of algorithms, and characterizations of the optimal cost function have been available for some time, but rigorous proofs of these results have required quite restrictive hypotheses, such as countability of the state space, in order to circumvent the inherent measurability difficulties. The authors show that the use of universally measurable policies in the Borel space framework resolves the measurability issues, so that all the basic results of dynamic programming can be obtained in the strongest possible form. In particular, epsilon-optimal policies are shown to exist, the dynamic programming algorithm is defined, and conditions and bounds for its convergence to the optimal cost are given. The optimality equation is shown to hold and is used to characterize the optimal cost function and optimal policies.
AB - Dynamic programming results concerning existence and characterizations of optimal or nearly optimal policies, convergence of algorithms, and characterizations of the optimal cost function have been available for some time, but rigorous proofs of these results have required quite restrictive hypotheses, such as countability of the state space, in order to circumvent the inherent measurability difficulties. The authors show that the use of universally measurable policies in the Borel space framework resolves the measurability issues, so that all the basic results of dynamic programming can be obtained in the strongest possible form. In particular, epsilon-optimal policies are shown to exist, the dynamic programming algorithm is defined, and conditions and bounds for its convergence to the optimal cost are given. The optimality equation is shown to hold and is used to characterize the optimal cost function and optimal policies.
UR - http://www.scopus.com/inward/record.url?scp=0018430154&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0018430154&partnerID=8YFLogxK
U2 - 10.1287/moor.4.1.15
DO - 10.1287/moor.4.1.15
M3 - Article
AN - SCOPUS:0018430154
SN - 0364-765X
VL - 4
SP - 15
EP - 30
JO - Mathematics of Operations Research
JF - Mathematics of Operations Research
IS - 1
ER -
