Basis Function Adaptation Methods for Cost Approximation in MDP

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Scopus citations

Abstract

We generalize a basis adaptation method for cost approximation in Markov decision processes (MDP), extending earlier work of Menache, Mannor, and Shimkin. In our context, basis functions are parametrized and their parameters are tuned by minimizing an objective function involving the cost function approximation obtained when a temporal differences (TD) or other method is used. The adaptation scheme involves only low-order calculations and can be implemented in a way analogous to policy gradient methods. In the generalized basis adaptation framework we provide extensions to TD methods for nonlinear optimal stopping problems and to alternative cost approximations beyond those based on TD.
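As a rough illustration of the scheme the abstract describes, here is a minimal NumPy sketch on a toy Markov chain: basis functions are parametrized, a TD-style (Galerkin) fixed point gives the cost approximation for the current basis, and the basis parameters are tuned by gradient descent on an objective built from that approximation. All specifics below (Gaussian bases, a Bellman-error objective, finite-difference gradients, the backtracking step rule) are assumptions made for this example, not the paper's exact algorithm.

```python
import numpy as np

# Toy discounted Markov chain (all quantities here are illustrative assumptions)
rng = np.random.default_rng(0)
n, alpha = 20, 0.9
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)   # transition matrix of the chain
g = rng.random(n)                   # one-stage cost vector
x = np.arange(n) / n                # embed the states on [0, 1]

def features(theta):
    # Parametrized basis: Gaussian bumps whose centers theta are the tunable parameters
    return np.exp(-((x[:, None] - theta[None, :]) ** 2) / 0.02)

def td_approximation(theta):
    # TD(0)/Galerkin fixed point: solve Phi^T (Phi - alpha P Phi) r = Phi^T g
    Phi = features(theta)
    A = Phi.T @ (Phi - alpha * P @ Phi) + 1e-9 * np.eye(Phi.shape[1])
    return Phi @ np.linalg.solve(A, Phi.T @ g)

def objective(theta):
    # Squared Bellman error of the cost approximation produced by this basis
    J = td_approximation(theta)
    return float(np.sum((J - (g + alpha * P @ J)) ** 2))

def fd_grad(f, theta, eps=1e-5):
    # Low-order finite-difference gradient, one parameter at a time
    return np.array([(f(theta + eps * e) - f(theta - eps * e)) / (2 * eps)
                     for e in np.eye(len(theta))])

theta = np.linspace(0.1, 0.9, 5)    # initial basis centers
before = cur = objective(theta)
for _ in range(50):                 # descent with a simple backtracking step size
    grad = fd_grad(objective, theta)
    t = 1e-2
    while t > 1e-10 and objective(theta - t * grad) >= cur:
        t *= 0.5                    # shrink the step until the objective decreases
    if t <= 1e-10:
        break                       # no improving step found; stop
    theta = theta - t * grad
    cur = objective(theta)
after = cur
```

By construction the loop only accepts steps that decrease the objective, so the Bellman error of the tuned basis is no worse than that of the initial one; each iteration costs only a handful of small linear solves, consistent with the "low order calculations" the abstract emphasizes.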

Original language: English (US)
Title of host publication: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
Pages: 74-81
Number of pages: 8
DOIs
State: Published - 2009
Externally published: Yes
Event: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Nashville, TN, United States
Duration: Mar 30 2009 - Apr 2 2009

Publication series

Name: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings

Conference

Conference: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009
Country/Territory: United States
City: Nashville, TN
Period: 3/30/09 - 4/2/09

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software

