A unified framework for temporal difference methods

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
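
For orientation, the projected equation mentioned in the abstract can be written out in the common linear case. This is a sketch under assumptions: the abstract fixes none of the notation below, so the symbols $A$, $b$, $\Phi$, $r$, $\Xi$, $C$, and $d$ are illustrative, and the paper's framework is broader than this special case.

```latex
% Assumed setting: a linear fixed point problem x = T(x) on R^n with
%   T(x) = A x + b,
% and a subspace S spanned by the s << n columns of an n-by-s matrix \Phi.
\[
  S = \{\, \Phi r \mid r \in \mathbb{R}^{s} \,\}
\]
% Projected equation: \Pi is the projection onto S with respect to a
% weighted Euclidean norm with positive weights \xi_1, ..., \xi_n.
\[
  \Phi r = \Pi\, T(\Phi r), \qquad
  \|x\|_{\xi}^{2} = \sum_{i=1}^{n} \xi_i x_i^{2}
\]
% Since \Pi projects orthogonally in the \xi-weighted inner product, the
% residual \Phi r - T(\Phi r) must be orthogonal to the columns of \Phi.
% With \Xi = diag(\xi_1, ..., \xi_n) this is an s-by-s linear system:
\[
  \Phi^{\top} \Xi \,\bigl( \Phi r - A \Phi r - b \bigr) = 0
  \quad\Longleftrightarrow\quad
  C r = d,
\]
\[
  C = \Phi^{\top} \Xi\, (I - A)\, \Phi, \qquad d = \Phi^{\top} \Xi\, b .
\]
```

In discounted policy evaluation, one would typically take $A = \alpha P$ for a transition matrix $P$ and discount factor $\alpha$, with $b$ the expected one-stage cost vector; simulation-based methods then estimate $C$ and $d$ from sampled transitions instead of forming them exactly.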

Original language: English (US)
Title of host publication: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
Pages: 1-7
Number of pages: 7
State: Published - 2009
Externally published: Yes
Event: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Nashville, TN, United States
Duration: Mar 30 2009 to Apr 2 2009

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
