Policy adjustment in a dynamic economic game

Jian Li, Samuel McClure, Brooks King-Casas, P. Read Montague

Research output: Contribution to journal › Article

47 Citations (Scopus)

Abstract

Making sequential decisions to harvest rewards is a notoriously difficult problem. One difficulty is that the real world is not stationary and the reward expected from a contemplated action may depend in complex ways on the history of an animal's choices. Previous functional neuroimaging work combined with principled models has detected brain responses that correlate with computations thought to guide simple learning and action choice. Those works generally employed instrumental conditioning tasks with fixed action-reward contingencies. For real-world learning problems, the history of reward-harvesting choices can change the likelihood of rewards collected by the same choices in the near-term future. We used functional MRI to probe brain and behavioral responses in a continuous decision-making task where reward contingency is a function of both a subject's immediate choice and his choice history. In these more complex tasks, we demonstrated that a simple actor-critic model can account for both the subjects' behavioral and brain responses, and identified a reward prediction error signal in ventral striatal structures active during these non-stationary decision tasks. However, a sudden introduction of new reward structures engages more complex control circuitry in the prefrontal cortex (inferior frontal gyrus and anterior insula) and is not captured by a simple actor-critic model. Taken together, these results extend our knowledge of reward-learning signals into more complex, history-dependent choice tasks. They also highlight the important interplay between striatum and prefrontal cortex as decision-makers respond to the strategic demands imposed by non-stationary reward environments more reminiscent of real-world tasks.
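
The "simple actor-critic model" referenced in the abstract can be sketched as follows. This is a generic illustration of the technique, not the paper's actual model: the learning rates, discount factor, and the dictionary-based state/action representation are arbitrary assumptions chosen for clarity. The central quantity is the temporal-difference error `delta`, which plays the role of the reward prediction error signal discussed in the abstract.

```python
import math
import random

def softmax_choice(prefs, actions, beta=1.0):
    """Pick an action with probability proportional to exp(beta * preference)."""
    weights = [math.exp(beta * prefs[a]) for a in actions]
    total = sum(weights)
    r = random.random() * total
    for a, w in zip(actions, weights):
        r -= w
        if r <= 0:
            return a
    return actions[-1]

def actor_critic_step(V, prefs, state, action, reward, next_state,
                      alpha_critic=0.1, alpha_actor=0.1, gamma=0.9):
    """One temporal-difference update. The critic computes a reward
    prediction error (delta), which trains both the state-value estimate
    (critic) and the action preference (actor)."""
    delta = reward + gamma * V[next_state] - V[state]   # reward prediction error
    V[state] += alpha_critic * delta                    # critic update
    prefs[(state, action)] += alpha_actor * delta       # actor update
    return delta
```

In a non-stationary task like the one described above, the same update rule keeps tracking the environment as reward contingencies drift, but, as the abstract notes, an abrupt change in reward structure is exactly the case such a model fails to capture.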

Original language: English (US)
Article number: e103
Journal: PLoS One
Volume: 1
Issue number: 1
DOI: 10.1371/journal.pone.0000103
State: Published - Dec 20 2006
Externally published: Yes

Fingerprint

  • Social Adjustment
  • Reward
  • Brain
  • History
  • Economics
  • Learning
  • Functional neuroimaging
  • Animal preferences
  • Conditioned behavior
  • Animals
  • Decision Making
  • Prediction
  • Prefrontal Cortex
  • Corpus Striatum

ASJC Scopus subject areas

  • Agricultural and Biological Sciences (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Medicine (all)

Cite this

Policy adjustment in a dynamic economic game. / Li, Jian; McClure, Samuel; King-Casas, Brooks; Montague, P. Read.

In: PLoS One, Vol. 1, No. 1, e103, 20.12.2006.

@article{0bdd06d1fd26445284af1da2ab50e88c,
title = "Policy adjustment in a dynamic economic game",
author = "Jian Li and Samuel McClure and Brooks King-Casas and Montague, {P. Read}",
year = "2006",
month = "12",
day = "20",
doi = "10.1371/journal.pone.0000103",
language = "English (US)",
volume = "1",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "1",
}

TY - JOUR

T1 - Policy adjustment in a dynamic economic game

AU - Li, Jian

AU - McClure, Samuel

AU - King-Casas, Brooks

AU - Montague, P. Read

PY - 2006/12/20

Y1 - 2006/12/20


UR - http://www.scopus.com/inward/record.url?scp=54949094339&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=54949094339&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0000103

DO - 10.1371/journal.pone.0000103

M3 - Article

C2 - 17183636

AN - SCOPUS:54949094339

VL - 1

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 1

M1 - e103

ER -