Markovian state and action abstractions for MDPs via hierarchical MCTS

Aijun Bai, Siddharth Srivastava, Stuart Russell

Research output: Contribution to journal › Conference article › peer-review


Abstract

State abstraction is an important technique for scaling MDP algorithms. As is well known, however, it introduces difficulties due to the non-Markovian nature of state-abstracted models. Whereas prior approaches rely upon ad hoc fixes for this issue, we propose instead to view the state-abstracted model as a POMDP and show that we can thereby take advantage of state abstraction without sacrificing the Markov property. We further exploit the hierarchical structure introduced by state abstraction by extending the theory of options to a POMDP setting. In this context we propose a hierarchical Monte Carlo tree search algorithm and show that it converges to a recursively optimal hierarchical policy. Both theoretical and empirical results suggest that abstracting an MDP into a POMDP yields a scalable solution approach.
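To make the core idea concrete, below is a minimal illustrative sketch, not the authors' implementation: a POMCP-style Monte Carlo tree search applied to a state-abstracted MDP. The toy corridor MDP, the abstraction phi(), and all names and parameters are assumptions introduced for illustration. The abstraction merges ground states so that a single abstract observation is non-Markovian on its own, but keying the search tree on histories of abstract observations restores the Markov property of the induced POMDP. For simplicity, the root ground state is assumed known (a point-mass belief), and the paper's hierarchical option machinery is omitted.

import math
import random
from collections import defaultdict

# Toy ground MDP (an illustrative assumption, not from the paper): a
# corridor of ground states 0..9; actions move left/right, the goal is
# state 9, and each step incurs a small penalty.
N_STATES, GOAL, ACTIONS = 10, 9, (-1, +1)

def step(s, a):
    """Ground-MDP simulator: returns (next_state, reward, done)."""
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

def phi(s):
    """State abstraction: merges pairs of ground states, so one
    abstract observation alone is non-Markovian."""
    return s // 2

class Node:
    """Statistics for one observation history in the search tree."""
    def __init__(self):
        self.n = 0                    # visits to this history
        self.na = defaultdict(int)    # per-action visit counts
        self.q = defaultdict(float)   # per-action value estimates

def ucb_action(node, c=1.4):
    """UCB1 action selection at an abstract-history node."""
    return max(ACTIONS, key=lambda a: node.q[a]
               + c * math.sqrt(math.log(node.n + 1) / (node.na[a] + 1)))

def simulate(s, history, tree, depth, gamma=0.95):
    """POMCP-style recursion: the tree is keyed by histories of
    abstract observations, keeping planning Markovian."""
    if depth == 0:
        return 0.0
    node = tree[history]
    a = ucb_action(node) if node.n > 0 else random.choice(ACTIONS)
    s2, r, done = step(s, a)
    if not done:
        r += gamma * simulate(s2, history + (a, phi(s2)), tree, depth - 1, gamma)
    node.n += 1
    node.na[a] += 1
    node.q[a] += (r - node.q[a]) / node.na[a]   # incremental mean update
    return r

def plan(s, n_sims=2000, depth=30):
    """Search from a known ground state (point-mass root belief)."""
    tree = defaultdict(Node)
    root = (phi(s),)
    for _ in range(n_sims):
        simulate(s, root, tree, depth)
    return max(ACTIONS, key=lambda a: tree[root].q[a])

if __name__ == "__main__":
    s, done = 0, False
    while not done:
        a = plan(s)
        s, _, done = step(s, a)
        print("action", a, "-> ground state", s, "observation", phi(s))

Keying the tree on observation histories rather than on abstract states is what sidesteps the non-Markovian difficulty the abstract describes; the paper's hierarchical options would sit on top of a loop like this one.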

Original language: English (US)
Pages (from-to): 3029-3037
Number of pages: 9
Journal: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2016-January
State: Published - 2016
Externally published: Yes
Event: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, United States
Duration: Jul 9, 2016 - Jul 15, 2016

ASJC Scopus subject areas

  • Artificial Intelligence
