Abstract
We consider challenging dynamic programming models in which the associated Bellman equation and the value and policy iteration algorithms commonly exhibit complex and even pathological behavior. Our analysis is based on the new notion of regular policies. These are policies that are well-behaved with respect to value and policy iteration, and are patterned after proper policies, which are central in the theory of stochastic shortest path problems. We show that the optimal cost function over regular policies may have favorable value and policy iteration properties that the optimal cost function over all policies need not have. We accordingly develop a unifying methodology to address long-standing analytical and algorithmic issues in broad classes of undiscounted models, including stochastic and minimax shortest path problems, as well as positive cost, negative cost, risk-sensitive, and multiplicative cost problems.
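To make the objects named in the abstract concrete, the following is a minimal sketch of value iteration (repeated application of the Bellman operator) on a small stochastic shortest path problem. The states, controls, costs, and transition probabilities are invented for illustration and are not taken from the paper; in this toy example every policy reaches the termination state with positive probability, so value iteration converges to the optimal cost-to-go.

```python
import numpy as np

# Hypothetical example: states 0 and 1 are non-terminal, state 2 is the
# cost-free termination state. p[u][i][j] is the probability of moving from
# state i to state j under control u; g[i][u] is the one-stage cost.
p = np.array([
    [[0.5, 0.3, 0.2],   # control 0
     [0.1, 0.4, 0.5],
     [0.0, 0.0, 1.0]],
    [[0.2, 0.2, 0.6],   # control 1
     [0.3, 0.1, 0.6],
     [0.0, 0.0, 1.0]],
])
g = np.array([
    [2.0, 3.0],   # costs at state 0 under controls 0, 1
    [1.0, 4.0],   # costs at state 1
    [0.0, 0.0],   # termination state: zero cost
])

J = np.zeros(3)  # initial cost-to-go estimate
for _ in range(200):
    # Bellman operator: (TJ)(i) = min_u [ g(i,u) + sum_j p_ij(u) J(j) ]
    Q = g + np.einsum('uij,j->iu', p, J)
    J_new = Q.min(axis=1)
    if np.max(np.abs(J_new - J)) < 1e-10:
        J = J_new
        break
    J = J_new

print("Approximate optimal cost-to-go:", J)
print("Greedy (policy-improvement) policy:", Q.argmin(axis=1))
```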
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 1694-1727 |
| Number of pages | 34 |
| Journal | SIAM Journal on Optimization |
| Volume | 27 |
| Issue number | 3 |
| DOIs | |
| State | Published - 2017 |
| Externally published | Yes |
Keywords
- Abstract dynamic programming
- Discrete-time optimal control
- Policy iteration
- Shortest path
- Value iteration
ASJC Scopus subject areas
- Software
- Theoretical Computer Science