TY - GEN
T1 - A unified framework for temporal difference methods
AU - Bertsekas, Dimitri P.
PY - 2009
Y1 - 2009
AB - We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
UR - http://www.scopus.com/inward/record.url?scp=67650502136&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67650502136&partnerID=8YFLogxK
U2 - 10.1109/ADPRL.2009.4927518
DO - 10.1109/ADPRL.2009.4927518
M3 - Conference contribution
AN - SCOPUS:67650502136
SN - 9781424427611
T3 - 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
SP - 1
EP - 7
BT - 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
T2 - 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009
Y2 - 30 March 2009 through 2 April 2009
ER -