Generic rank-one corrections for value iteration in Markovian decision problems

Research output: Contribution to journal › Article › peer-review

1 Scopus citations

Abstract

Given a linear iteration of the form x := F(x), we consider modified versions of the form x := F(x + γd), where d is a fixed direction, and γ is chosen to minimize the norm of the residual ‖x + γd - F(x + γd)‖. We propose ways to choose d so that the convergence rate of the modified iteration is governed by the subdominant eigenvalue of the original. In the special case where F relates to a Markovian decision problem, we obtain a new extrapolation method for value iteration. In particular, our method accelerates the Gauss-Seidel version of the value iteration method for discounted problems in the same way that MacQueen's error bounds accelerate the standard version. Furthermore, our method applies equally well to Markov Renewal and undiscounted problems.
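
The modified iteration described in the abstract can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the article: F is written as F(x) = Ax + b with a small symmetric matrix A whose eigenvalues are 0.95, 0.5, and 0.2, the norm is Euclidean (so γ has a closed-form least-squares value), and d is taken to be the dominant eigenvector of A. The function names `plain_step` and `rank_one_step` are hypothetical.

```python
import numpy as np


def plain_step(A, b, x):
    """One step of the original linear iteration x := F(x) = A x + b."""
    return A @ x + b


def rank_one_step(A, b, x, d):
    """One step of the modified iteration x := F(x + gamma * d).

    gamma minimizes the Euclidean norm of the residual
    (x + gamma*d) - F(x + gamma*d) = r + gamma * w,
    where r = x - F(x) and w = d - A d.
    """
    r = x - (A @ x + b)
    w = d - A @ d
    gamma = -(w @ r) / (w @ w)  # closed-form least-squares minimizer
    return A @ (x + gamma * d) + b


# Assumed test problem: symmetric A with eigenvalues 0.95, 0.5, 0.2.
Q, _ = np.linalg.qr(np.array([[1.0, 2.0, 0.0],
                              [0.0, 1.0, 3.0],
                              [4.0, 0.0, 1.0]]))
A = Q @ np.diag([0.95, 0.5, 0.2]) @ Q.T
A = (A + A.T) / 2.0  # remove rounding asymmetry
b = np.array([1.0, 2.0, 3.0])

# Fixed point of F, used only to measure the error.
x_star = np.linalg.solve(np.eye(3) - A, b)

# Assumed choice of direction: eigenvector of the largest eigenvalue
# (np.linalg.eigh returns eigenvalues in ascending order).
_, eigvecs = np.linalg.eigh(A)
d = eigvecs[:, -1]

x_plain = np.zeros(3)
x_mod = np.zeros(3)
for _ in range(20):
    x_plain = plain_step(A, b, x_plain)
    x_mod = rank_one_step(A, b, x_mod, d)

err_plain = np.linalg.norm(x_plain - x_star)
err_mod = np.linalg.norm(x_mod - x_star)
print("plain error:   ", err_plain)
print("modified error:", err_mod)
```

Under these assumptions the plain iteration contracts at roughly the dominant rate 0.95 per step, while the modified iteration removes the error component along d and then contracts at roughly the subdominant rate 0.5, which is the behavior the abstract describes.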

Original language: English (US)
Pages (from-to): 111-119
Number of pages: 9
Journal: Operations Research Letters
Volume: 17
Issue number: 3
State: Published - Apr 1995
Externally published: Yes

Keywords

  • Dynamic programming
  • Gauss-Seidel method
  • Jacobi method
  • Markovian decision problem
  • Stochastic shortest path
  • Value iteration

ASJC Scopus subject areas

  • Software
  • Management Science and Operations Research
  • Industrial and Manufacturing Engineering
  • Applied Mathematics
