Abstract
We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. This method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. Unlike the standard relative value iteration, our method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems.
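For background on the method the abstract contrasts against, here is a minimal sketch of the standard (Jacobi-style) relative value iteration for an average-cost MDP. This is not the paper's proposed weighted sup-norm method, whose details are not given in the abstract; the problem data `P`, `g` and the choice of reference state are illustrative assumptions.

```python
import numpy as np

def relative_value_iteration(P, g, ref=0, tol=1e-10, max_iter=10_000):
    """Standard relative value iteration for an average-cost MDP.

    P[a, i, j]: probability of moving from state i to state j under action a.
    g[a, i]:    one-stage cost of taking action a in state i.
    ref:        reference state whose differential value is pinned to 0.
    Returns (average-cost estimate, differential cost vector h).
    """
    n_states = P.shape[1]
    h = np.zeros(n_states)
    lam = 0.0
    for _ in range(max_iter):
        Th = (g + P @ h).min(axis=0)   # one Bellman (Jacobi) sweep over all states
        lam = Th[ref]                  # current average-cost estimate
        h_new = Th - lam               # normalize so that h(ref) = 0
        if np.max(np.abs(h_new - h)) < tol:
            return lam, h_new
        h = h_new
    return lam, h

# Illustrative 2-state, single-action chain: the stationary distribution is
# uniform, so the average cost is (1 + 3) / 2 = 2.
P = np.array([[[0.5, 0.5],
               [0.5, 0.5]]])
g = np.array([[1.0, 3.0]])
lam, h = relative_value_iteration(P, g)
```

Per the abstract, this standard iteration is not a sup-norm contraction, which is why a Gauss-Seidel sweep (updating each state with freshly computed values) cannot simply be substituted for the Jacobi sweep above; the paper's contribution is a variant whose weighted sup-norm contraction property makes such an implementation possible.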
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 742-759 |
| Number of pages | 18 |
| Journal | SIAM Journal on Control and Optimization |
| Volume | 36 |
| Issue number | 2 |
| DOIs | |
| State | Published - 1998 |
| Externally published | Yes |
Keywords
- Average cost
- Dynamic programming
- Value iteration
ASJC Scopus subject areas
- Control and Optimization
- Applied Mathematics