Learning by pigeons playing against tit-for-tat in an operant prisoner's dilemma

Federico Sanabria; Forest Baker; Howard Rachlin

doi:10.3758/bf03195994

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Each of four pigeons was exposed to a single random-ratio schedule of reinforcement in which the probability of reinforcement for a peck on either of two keys was 1/25. Reinforcer amounts were determined by an iterated prisoner's dilemma (IPD) matrix in which the "other player" (a computer) played tit-for-tat. One key served as the cooperation (C) key; the other served as the defection (D) key. If a peck was scheduled to be reinforced and the D-key was pecked, the immediate reinforcer of that peck was always higher than it would have been had the C-key been pecked. However, if the C-key was pecked and the following peck was scheduled to be reinforced, reinforcement amounts for pecks on either key were higher than they would have been if the previous peck had been on the D-key. Although immediate reinforcement was always higher for D-pecks, the overall reinforcement rate increased linearly with the proportion of C-pecks. C-pecks thus constituted a form of self-control. All the pigeons initially defected with this procedure. However, when feedback signals were introduced that indicated which key had last been pecked, cooperation (relative rate of C-pecks), and hence self-control, increased for all the pigeons.
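
The contingencies described in the abstract can be illustrated with a small calculation. The following Python sketch is not taken from the article: the payoff amounts (5, 6, 1, 2) and the names PAYOFF, P_REINFORCE, expected_amount_per_reinforcer, and expected_amount_per_peck are hypothetical, chosen only so that a D-peck always pays more immediately, a preceding C-peck raises the amount on either key, and the expected amount is linear in the proportion of C-pecks. It also assumes that each peck is a C-peck with a fixed, independent probability, which real pigeons need not satisfy.

```python
from fractions import Fraction

# Hypothetical reinforcer amounts, NOT the values used in the article.
# Key: (previous peck, current peck). Because the computer plays
# tit-for-tat, the previous peck stands in for the computer's current move.
PAYOFF = {
    ("C", "C"): 5,  # computer cooperates, pigeon cooperates
    ("C", "D"): 6,  # computer cooperates, pigeon defects
    ("D", "C"): 1,  # computer defects, pigeon cooperates
    ("D", "D"): 2,  # computer defects, pigeon defects
}

# Random-ratio schedule from the abstract: each peck is reinforced
# with probability 1/25, regardless of key.
P_REINFORCE = Fraction(1, 25)


def expected_amount_per_reinforcer(p_c):
    """Expected reinforcer amount when each peck is C with probability p_c.

    Assumes successive pecks are independent (an illustrative assumption).
    """
    p_c = Fraction(p_c)
    prob = {"C": p_c, "D": 1 - p_c}
    return sum(
        prob[prev] * prob[cur] * amount
        for (prev, cur), amount in PAYOFF.items()
    )


def expected_amount_per_peck(p_c):
    """Expected reinforcer amount per peck under the 1/25 schedule."""
    return P_REINFORCE * expected_amount_per_reinforcer(p_c)


if __name__ == "__main__":
    for p in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
        print(p, expected_amount_per_reinforcer(p), expected_amount_per_peck(p))
```

Under these assumed amounts the expected amount per reinforcer works out to 2 + 3p for a C-peck proportion p, so it rises linearly from 2 (always D) to 5 (always C), even though the D-key pays one unit more on any single reinforced peck.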

Original language	English (US)
Pages (from-to)	318-331
Number of pages	14
Journal	Learning and Behavior
Volume	31
Issue number	4
DOIs	https://doi.org/10.3758/bf03195994
State	Published - Nov 2003
Externally published	Yes

ASJC Scopus subject areas

Experimental and Cognitive Psychology
Cognitive Neuroscience
Behavioral Neuroscience

Cite this

@article{ea4cb655ea8a41bda31af7e2f2753ac5,
title = "Learning by pigeons playing against tit-for-tat in an operant prisoner's dilemma",
abstract = "Each of four pigeons was exposed to a single random-ratio schedule of reinforcement in which the probability of reinforcement for a peck on either of two keys was 1/25. Reinforcer amounts were determined by an iterated prisoner's dilemma (IPD) matrix in which the {"}other player{"} (a computer) played tit-for-tat. One key served as the cooperation (C) key; the other served as the defection (D) key. If a peck was scheduled to be reinforced and the D-key was pecked, the immediate reinforcer of that peck was always higher than it would have been had the C-key been pecked. However, if the C-key was pecked and the following peck was scheduled to be reinforced, reinforcement amounts for pecks on either key were higher than they would have been if the previous peck had been on the D-key. Although immediate reinforcement was always higher for D-pecks, the overall reinforcement rate increased linearly with the proportion of C-pecks. C-pecks thus constituted a form of self-control. All the pigeons initially defected with this procedure. However, when feedback signals were introduced that indicated which key had last been pecked, cooperation (relative rate of C-pecks), and hence self-control, increased for all the pigeons.",
author = "Federico Sanabria and Forest Baker and Howard Rachlin",
year = "2003",
month = nov,
doi = "10.3758/bf03195994",
language = "English (US)",
volume = "31",
pages = "318--331",
journal = "Learning and Behavior",
issn = "1543-4494",
publisher = "Springer New York",
number = "4",
}

TY - JOUR
T1 - Learning by pigeons playing against tit-for-tat in an operant prisoner's dilemma
AU - Sanabria, Federico
AU - Baker, Forest
AU - Rachlin, Howard
PY - 2003/11
Y1 - 2003/11
AB - Each of four pigeons was exposed to a single random-ratio schedule of reinforcement in which the probability of reinforcement for a peck on either of two keys was 1/25. Reinforcer amounts were determined by an iterated prisoner's dilemma (IPD) matrix in which the "other player" (a computer) played tit-for-tat. One key served as the cooperation (C) key; the other served as the defection (D) key. If a peck was scheduled to be reinforced and the D-key was pecked, the immediate reinforcer of that peck was always higher than it would have been had the C-key been pecked. However, if the C-key was pecked and the following peck was scheduled to be reinforced, reinforcement amounts for pecks on either key were higher than they would have been if the previous peck had been on the D-key. Although immediate reinforcement was always higher for D-pecks, the overall reinforcement rate increased linearly with the proportion of C-pecks. C-pecks thus constituted a form of self-control. All the pigeons initially defected with this procedure. However, when feedback signals were introduced that indicated which key had last been pecked, cooperation (relative rate of C-pecks), and hence self-control, increased for all the pigeons.
UR - http://www.scopus.com/inward/record.url?scp=1442280641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=1442280641&partnerID=8YFLogxK
U2 - 10.3758/bf03195994
DO - 10.3758/bf03195994
M3 - Article
C2 - 14733481
AN - SCOPUS:1442280641
SN - 1543-4494
VL - 31
SP - 318
EP - 331
JO - Learning and Behavior
JF - Learning and Behavior
IS - 4
ER -
