TLdR: Policy summarization for factored SSP problems using temporal abstractions

Sarath Sreedharan; Siddharth Srivastava; Subbarao Kambhampati

Research output: Contribution to journal › Conference article › peer-review

Abstract

As more and more people are expected to work with complex AI systems, it becomes more important than ever that such systems provide intuitive explanations for their decisions. A prerequisite for holding such an explanatory dialogue is the ability of the system to present its proposed decisions to the user in an easy-to-understand form. Unfortunately, such dialogues can become hard to facilitate in real-world problems where the system may be planning for multiple eventualities in stochastic environments. To be effective, the system therefore needs to be able to present the policy at a high level of abstraction and delve into details as required. Towards this end, we investigate the utility of temporal abstractions derived through analytically computed landmarks and their relative ordering to build a summarization of policies for Stochastic Shortest Path (SSP) problems. We formalize the concept of policy landmarks and show how they can be used to provide a high-level overview of a given policy. Additionally, we establish the connections between the type of hierarchy we generate and previous work on temporal abstractions, specifically MaxQ hierarchies. Our approach is evaluated through user studies as well as empirical metrics, which establish that people tend to choose landmark facts as subgoals to summarize policies and demonstrate the performance of our approach on standard benchmarks.

Original language	English (US)
Pages (from-to)	272-280
Number of pages	9
Journal	Proceedings International Conference on Automated Planning and Scheduling, ICAPS
Volume	30
State	Published - May 29 2020
Event	30th International Conference on Automated Planning and Scheduling, ICAPS 2020 - Nancy, France
Duration	Oct 26 2020 → Oct 30 2020

ASJC Scopus subject areas

Artificial Intelligence
Computer Science Applications
Information Systems and Management

Cite this

@article{cd02374bd46048918e08d0a4452bd0fe,

title = "TLdR: Policy summarization for factored SSP problems using temporal abstractions",

abstract = "As more and more people are expected to work with complex AI-systems, it becomes more important than ever that such systems provide intuitive explanations for their decisions. A prerequisite for holding such explanatory dialogue is the ability of the systems to present their proposed decisions to the user in an easy-to-understand form. Unfortunately, such dialogues could become hard to facilitate in real-world problems where the system may be planning for multiple eventualities in stochastic environments. This means for the system to be effective, it needs to be able to present the policy at a high-level of abstraction and delve into details as required. Towards this end, we investigate the utility of temporal abstractions derived through analytically computed landmarks and their relative ordering to build a summarization of policies for Stochastic Shortest Path Problems. We formalize the concept of policy landmarks and show how it can be used to provide a high level overview of a given policy. Additionally, we establish the connections between the type of hierarchy we generate and previous works in temporal abstractions, specifically MaxQ hierarchies. Our approach is evaluated through user studies as well as empirical metrics that establish that people tend to choose landmarks facts as subgoals to summarize policies and demonstrates the performance of our approach on standard benchmarks.",

author = "Sarath Sreedharan and Siddharth Srivastava and Subbarao Kambhampati",

note = "Funding Information: This research is supported in part by ONR grants N00014-16-1-2892, N00014-18-1-2442, N00014-18-1-2840, N00014-9-1-2119, AFOSR grant FA9550-18-1-0067, DARPA SAIL-ON grant W911NF-19-2-0006, NSF grants 1936997 (C-ACCEL), 1844325 and 1909370, NASA grant NNX17AD06G, and a JP Morgan AI Faculty Research grant. Publisher Copyright: Copyright {\textcopyright} 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 30th International Conference on Automated Planning and Scheduling, ICAPS 2020 ; Conference date: 26-10-2020 Through 30-10-2020",

year = "2020",

month = may,

day = "29",

language = "English (US)",

volume = "30",

pages = "272--280",

journal = "Proceedings International Conference on Automated Planning and Scheduling, ICAPS",

issn = "2334-0835",

}

TY - JOUR

T1 - TLdR

T2 - 30th International Conference on Automated Planning and Scheduling, ICAPS 2020

AU - Sreedharan, Sarath

AU - Srivastava, Siddharth

AU - Kambhampati, Subbarao

N1 - Funding Information: This research is supported in part by ONR grants N00014-16-1-2892, N00014-18-1-2442, N00014-18-1-2840, N00014-9-1-2119, AFOSR grant FA9550-18-1-0067, DARPA SAIL-ON grant W911NF-19-2-0006, NSF grants 1936997 (C-ACCEL), 1844325 and 1909370, NASA grant NNX17AD06G, and a JP Morgan AI Faculty Research grant. Publisher Copyright: Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

PY - 2020/5/29

Y1 - 2020/5/29

N2 - As more and more people are expected to work with complex AI-systems, it becomes more important than ever that such systems provide intuitive explanations for their decisions. A prerequisite for holding such explanatory dialogue is the ability of the systems to present their proposed decisions to the user in an easy-to-understand form. Unfortunately, such dialogues could become hard to facilitate in real-world problems where the system may be planning for multiple eventualities in stochastic environments. This means for the system to be effective, it needs to be able to present the policy at a high-level of abstraction and delve into details as required. Towards this end, we investigate the utility of temporal abstractions derived through analytically computed landmarks and their relative ordering to build a summarization of policies for Stochastic Shortest Path Problems. We formalize the concept of policy landmarks and show how it can be used to provide a high level overview of a given policy. Additionally, we establish the connections between the type of hierarchy we generate and previous works in temporal abstractions, specifically MaxQ hierarchies. Our approach is evaluated through user studies as well as empirical metrics that establish that people tend to choose landmarks facts as subgoals to summarize policies and demonstrates the performance of our approach on standard benchmarks.

AB - As more and more people are expected to work with complex AI-systems, it becomes more important than ever that such systems provide intuitive explanations for their decisions. A prerequisite for holding such explanatory dialogue is the ability of the systems to present their proposed decisions to the user in an easy-to-understand form. Unfortunately, such dialogues could become hard to facilitate in real-world problems where the system may be planning for multiple eventualities in stochastic environments. This means for the system to be effective, it needs to be able to present the policy at a high-level of abstraction and delve into details as required. Towards this end, we investigate the utility of temporal abstractions derived through analytically computed landmarks and their relative ordering to build a summarization of policies for Stochastic Shortest Path Problems. We formalize the concept of policy landmarks and show how it can be used to provide a high level overview of a given policy. Additionally, we establish the connections between the type of hierarchy we generate and previous works in temporal abstractions, specifically MaxQ hierarchies. Our approach is evaluated through user studies as well as empirical metrics that establish that people tend to choose landmarks facts as subgoals to summarize policies and demonstrates the performance of our approach on standard benchmarks.

UR - http://www.scopus.com/inward/record.url?scp=85088499000&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85088499000&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85088499000

SN - 2334-0835

VL - 30

SP - 272

EP - 280

JO - Proceedings International Conference on Automated Planning and Scheduling, ICAPS

JF - Proceedings International Conference on Automated Planning and Scheduling, ICAPS

Y2 - 26 October 2020 through 30 October 2020

ER -
