A unified framework for temporal difference methods

Dimitri P. Bertsekas

doi:10.1109/ADPRL.2009.4927518

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
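
The projected equation described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a linear fixed point problem x = Ax + b, a single basis vector `phi` spanning S, and a projection taken under a weighted Euclidean norm with weights `xi`; the function name, the two-state example, and all numerical values are hypothetical. Under these assumptions, the orthogonality condition for the projection reduces to a scalar equation C r = d.

```python
def solve_projected_equation(A, b, phi, xi):
    """Solve phi*r = Pi(A(phi*r) + b) for a scalar weight r.

    Pi is the projection onto span{phi} under the xi-weighted Euclidean
    norm. Orthogonality of the residual to phi gives C * r = d, with
    C = sum_i xi_i * phi_i * (phi_i - (A phi)_i) and
    d = sum_i xi_i * phi_i * b_i.
    """
    n = len(phi)
    # (A phi)_i = sum_j A[i][j] * phi[j]
    A_phi = [sum(A[i][j] * phi[j] for j in range(n)) for i in range(n)]
    C = sum(xi[i] * phi[i] * (phi[i] - A_phi[i]) for i in range(n))
    d = sum(xi[i] * phi[i] * b[i] for i in range(n))
    return d / C


# Hypothetical two-state discounted DP example: A = alpha * P, b = cost vector.
alpha = 0.9
P = [[0.5, 0.5], [0.5, 0.5]]
A = [[alpha * p for p in row] for row in P]
b = [1.0, 0.0]
xi = [0.5, 0.5]    # projection weights (here the stationary distribution of P)
phi = [1.0, 2.0]   # single basis function (assumed for illustration)

r = solve_projected_equation(A, b, phi, xi)
approx = [p * r for p in phi]  # approximate solution within the subspace S
```

Rescaling the basis vector (for example, using `[3.0, 6.0]` in place of `phi`) changes the weight `r` but leaves the approximation `approx` unchanged, because the solution of the projected equation depends on the subspace S rather than on the particular basis chosen to represent it.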

Original language	English (US)
Title of host publication	2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
Pages	1-7
Number of pages	7
DOIs	https://doi.org/10.1109/ADPRL.2009.4927518
State	Published - 2009
Externally published	Yes
Event	2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Nashville, TN, United States
Duration	Mar 30 2009 → Apr 2 2009

Publication series

Name	2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings

Conference

Conference	2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009
Country/Territory	United States
City	Nashville, TN
Period	3/30/09 → 4/2/09

ASJC Scopus subject areas

Computational Theory and Mathematics
Software

Access to Document

10.1109/ADPRL.2009.4927518

Cite this

A unified framework for temporal difference methods. / Bertsekas, Dimitri P.
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings. 2009. p. 1-7 4927518 (2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings).

Bertsekas, DP 2009, A unified framework for temporal difference methods. in 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings., 4927518, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings, pp. 1-7, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009, Nashville, TN, United States, 3/30/09. https://doi.org/10.1109/ADPRL.2009.4927518

@inproceedings{40d1bfbecbec4153991080c28bcff1da,
  title = "A unified framework for temporal difference methods",
  abstract = "We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.",
  author = "Bertsekas, {Dimitri P.}",
  year = "2009",
  doi = "10.1109/ADPRL.2009.4927518",
  language = "English (US)",
  isbn = "9781424427611",
  series = "2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings",
  pages = "1--7",
  booktitle = "2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings",
  note = "2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 ; Conference date: 30-03-2009 Through 02-04-2009",
}

TY  - GEN
T1  - A unified framework for temporal difference methods
AU  - Bertsekas, Dimitri P.
PY  - 2009
Y1  - 2009
N2  - We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
AB  - We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
UR  - http://www.scopus.com/inward/record.url?scp=67650502136&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=67650502136&partnerID=8YFLogxK
U2  - 10.1109/ADPRL.2009.4927518
DO  - 10.1109/ADPRL.2009.4927518
M3  - Conference contribution
AN  - SCOPUS:67650502136
SN  - 9781424427611
T3  - 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
SP  - 1
EP  - 7
BT  - 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
T2  - 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009
Y2  - 30 March 2009 through 2 April 2009
ER  -
