Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph

Xin Ye, Yezhou Yang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

We present a novel two-layer hierarchical reinforcement learning approach equipped with a Goals Relational Graph (GRG) for tackling the partially observable goal-driven task, such as goal-driven visual navigation. Our GRG captures the underlying relations of all goals in the goal space through a Dirichlet-categorical process that facilitates: 1) the high-level network raising a sub-goal towards achieving a designated final goal; 2) the low-level network towards an optimal policy; and 3) the overall system generalizing unseen environments and goals. We evaluate our approach with two settings of partially observable goal-driven tasks - a grid-world domain and a robotic object search task. Our experimental results show that our approach exhibits superior generalization performance on both unseen environments and new goals.

Original languageEnglish (US)
Title of host publicationProceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
PublisherIEEE Computer Society
Pages14096-14105
Number of pages10
ISBN (Electronic)9781665445092
DOIs
StatePublished - 2021
Event2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Virtual, Online, United States
Duration: Jun 19 2021Jun 25 2021

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)1063-6919

Conference

Conference2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Country/TerritoryUnited States
CityVirtual, Online
Period6/19/216/25/21

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Fingerprint

Dive into the research topics of 'Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph'. Together they form a unique fingerprint.

Cite this