Learning by pigeons playing against tit-for-tat in an operant prisoner's dilemma

Federico Sanabria, Forest Baker, Howard Rachlin

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Each of four pigeons was exposed to a single random-ratio schedule of reinforcement in which the probability of reinforcement for a peck on either of two keys was 1/25. Reinforcer amounts were determined by an iterated prisoner's dilemma (IPD) matrix in which the "other player" (a computer) played tit-for-tat. One key served as the cooperation (C) key; the other served as the defection (D) key. If a peck was scheduled to be reinforced and the D-key was pecked, the immediate reinforcer for that peck was always higher than it would have been had the C-key been pecked. However, if the C-key was pecked and the following peck was scheduled to be reinforced, reinforcement amounts for pecks on either key were higher than they would have been if the previous peck had been on the D-key. Although immediate reinforcement was always higher for D-pecks, the overall reinforcement rate increased linearly with the proportion of C-pecks. C-pecks thus constituted a form of self-control. All the pigeons initially defected with this procedure. However, when feedback signals were introduced that indicated which key had last been pecked, cooperation (the relative rate of C-pecks), and hence self-control, increased for all the pigeons.
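The contingency described in the abstract can be illustrated with a minimal sketch. The abstract does not report the payoff matrix, so the reinforcer amounts in `PAYOFF` are hypothetical values chosen only to satisfy the stated properties: a D-peck always earns more than a C-peck on the same trial, and a preceding C-peck raises the amount for either key. The function names, the assumption that C- and D-pecks are emitted independently with a fixed probability, and the assumption that the computer opens with C are also illustrative and not taken from the article. With these particular amounts, the expected amount per reinforced peck works out to p + 2, where p is the proportion of C-pecks, which is linear as the abstract describes.

```python
import random

# Hypothetical reinforcer amounts (arbitrary units), keyed by
# (pigeon's current peck, computer's move). Under tit-for-tat the
# computer's move equals the pigeon's PREVIOUS peck.
# These values are assumptions, not the matrix used in the study.
PAYOFF = {
    ("C", "C"): 3.0,
    ("D", "C"): 4.0,
    ("C", "D"): 1.0,
    ("D", "D"): 2.0,
}

P_REINFORCE = 1 / 25  # random-ratio schedule stated in the abstract


def expected_amount(p_c, payoff=PAYOFF):
    """Expected amount per reinforced peck when each peck is
    independently a C-peck with probability p_c (an assumption)."""
    after_c = p_c * payoff[("C", "C")] + (1 - p_c) * payoff[("D", "C")]
    after_d = p_c * payoff[("C", "D")] + (1 - p_c) * payoff[("D", "D")]
    return p_c * after_c + (1 - p_c) * after_d


def simulate(p_c, n_pecks=100_000, seed=0, payoff=PAYOFF):
    """Monte Carlo estimate of the mean reinforcer amount per peck."""
    rng = random.Random(seed)
    other = "C"  # assumption: tit-for-tat opens by cooperating
    total = 0.0
    for _ in range(n_pecks):
        peck = "C" if rng.random() < p_c else "D"
        if rng.random() < P_REINFORCE:
            total += payoff[(peck, other)]
        other = peck  # tit-for-tat copies the pigeon's last peck
    return total / n_pecks


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, expected_amount(p), simulate(p))
```

The linearity here depends on the hypothetical amounts satisfying `PAYOFF[("D","C")] - PAYOFF[("C","C")] == PAYOFF[("D","D")] - PAYOFF[("C","D")]`; a matrix without that property would give a quadratic relation under the same independence assumption.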

Original language: English (US)
Pages (from-to): 318-331
Number of pages: 14
Journal: Learning and Behavior
Volume: 31
Issue number: 4
State: Published - Nov 2003
Externally published: Yes

ASJC Scopus subject areas

  • Experimental and Cognitive Psychology
  • Cognitive Neuroscience
  • Behavioral Neuroscience
