Stable optimal control and semicontractive dynamic programming

Dimitri P. Bertsekas

doi:10.1137/17M1122815

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

We consider discrete-time infinite horizon deterministic optimal control problems with nonnegative cost per stage, and a destination that is cost free and absorbing. The classical linear-quadratic regulator problem is a special case. Our assumptions are very general, and allow the possibility that the optimal policy may not be stabilizing the system, e.g., may not reach the destination either asymptotically or in a finite number of steps. We introduce a new unifying notion of stable feedback policy, based on perturbation of the cost per stage, which in addition to implying convergence of the generated states to the destination, quantifies the speed of convergence. We consider the properties of two distinct cost functions: J^∗, the overall optimal, and Ĵ, the restricted optimal over just the stable policies. Different classes of stable policies (with different speeds of convergence) may yield different values of Ĵ. We show that for any class of stable policies, Ĵ is a solution of Bellman’s equation, and we characterize the smallest and the largest solutions: they are J^∗, and J⁺, the restricted optimal cost function over the class of (finitely) terminating policies. We also characterize the regions of convergence of various modified versions of value and policy iteration algorithms, as substitutes for the standard algorithms, which may not work in general.
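The claim that Bellman's equation can have several solutions, with J^∗ the smallest and J⁺ the largest, can be illustrated with a minimal sketch. The example below is hypothetical and is not taken from the paper: there is a single non-destination state 1, and the controller can either stay at state 1 at cost 0 or move to the destination t at an assumed cost b > 0. The names `bellman_step` and `value_iteration` are illustrative only.

```python
# Hypothetical one-state example (an assumption, not from the paper):
# at state 1 we may "stay" (cost 0, remain at 1) or "go" (cost b > 0, reach t).
# The destination t is cost free and absorbing, so J(t) = 0.

def bellman_step(j1: float, b: float) -> float:
    """Apply Bellman's equation once at state 1: J(1) = min{0 + J(1), b + J(t)}."""
    stay = 0.0 + j1  # cost 0, next state is 1 again
    go = b + 0.0     # cost b, next state is t with J(t) = 0
    return min(stay, go)

def value_iteration(j0: float, b: float, steps: int = 50) -> float:
    """Run standard value iteration at state 1 from the initial guess j0."""
    j = j0
    for _ in range(steps):
        j = bellman_step(j, b)
    return j

b = 1.0

# Every value in [0, b] is a fixed point, so Bellman's equation has many solutions.
fixed_points = [j for j in (0.0, 0.25, 0.5, 1.0) if bellman_step(j, b) == j]

# Starting at 0, value iteration stays at 0: the cost of never leaving state 1.
from_zero = value_iteration(0.0, b)

# Starting above b, value iteration settles at b: the cost of the terminating policy.
from_above = value_iteration(5.0, b)
```

In this toy problem the overall optimal cost at state 1 is 0 and is attained by a policy that never reaches the destination, while the only terminating policy costs b. Standard value iteration therefore converges to different solutions depending on where it starts, which is the kind of behavior the abstract's "regions of convergence" refers to.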

Original language	English (US)
Pages (from-to)	231-252
Number of pages	22
Journal	SIAM Journal on Control and Optimization
Volume	56
Issue number	1
DOIs	https://doi.org/10.1137/17M1122815
State	Published - 2018
Externally published	Yes

Keywords

Discrete-time optimal control
Dynamic programming
Policy iteration
Shortest path
Stable policy
Value iteration

ASJC Scopus subject areas

Control and Optimization
Applied Mathematics

Cite this

@article{a468e7b18dc84b44add8946f883412e1,

title = "Stable optimal control and semicontractive dynamic programming",

abstract = "We consider discrete-time infinite horizon deterministic optimal control problems with nonnegative cost per stage, and a destination that is cost free and absorbing. The classical linear-quadratic regulator problem is a special case. Our assumptions are very general, and allow the possibility that the optimal policy may not be stabilizing the system, e.g., may not reach the destination either asymptotically or in a finite number of steps. We introduce a new unifying notion of stable feedback policy, based on perturbation of the cost per stage, which in addition to implying convergence of the generated states to the destination, quantifies the speed of convergence. We consider the properties of two distinct cost functions: J∗, the overall optimal, and Ĵ, the restricted optimal over just the stable policies. Different classes of stable policies (with different speeds of convergence) may yield different values of Ĵ. We show that for any class of stable policies, Ĵ is a solution of Bellman{\textquoteright}s equation, and we characterize the smallest and the largest solutions: they are J∗, and J+, the restricted optimal cost function over the class of (finitely) terminating policies. We also characterize the regions of convergence of various modified versions of value and policy iteration algorithms, as substitutes for the standard algorithms, which may not work in general.",

keywords = "Discrete-time optimal control, Dynamic programming, Policy iteration, Shortest path, Stable policy, Value iteration",

author = "Bertsekas, {Dimitri P.}",

year = "2018",

doi = "10.1137/17M1122815",

language = "English (US)",

volume = "56",

pages = "231--252",

journal = "SIAM Journal on Control and Optimization",

issn = "0363-0129",

publisher = "Society for Industrial and Applied Mathematics Publications",

number = "1",

}

TY - JOUR

T1 - Stable optimal control and semicontractive dynamic programming

AU - Bertsekas, Dimitri P.

PY - 2018

Y1 - 2018

N2 - We consider discrete-time infinite horizon deterministic optimal control problems with nonnegative cost per stage, and a destination that is cost free and absorbing. The classical linear-quadratic regulator problem is a special case. Our assumptions are very general, and allow the possibility that the optimal policy may not be stabilizing the system, e.g., may not reach the destination either asymptotically or in a finite number of steps. We introduce a new unifying notion of stable feedback policy, based on perturbation of the cost per stage, which in addition to implying convergence of the generated states to the destination, quantifies the speed of convergence. We consider the properties of two distinct cost functions: J∗, the overall optimal, and Ĵ, the restricted optimal over just the stable policies. Different classes of stable policies (with different speeds of convergence) may yield different values of Ĵ. We show that for any class of stable policies, Ĵ is a solution of Bellman’s equation, and we characterize the smallest and the largest solutions: they are J∗, and J+, the restricted optimal cost function over the class of (finitely) terminating policies. We also characterize the regions of convergence of various modified versions of value and policy iteration algorithms, as substitutes for the standard algorithms, which may not work in general.

KW - Discrete-time optimal control

KW - Dynamic programming

KW - Policy iteration

KW - Shortest path

KW - Stable policy

KW - Value iteration

UR - http://www.scopus.com/inward/record.url?scp=85043520801&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85043520801&partnerID=8YFLogxK

U2 - 10.1137/17M1122815

DO - 10.1137/17M1122815

M3 - Article

AN - SCOPUS:85043520801

SN - 0363-0129

VL - 56

SP - 231

EP - 252

JO - SIAM Journal on Control and Optimization

JF - SIAM Journal on Control and Optimization

IS - 1

ER -
