A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each

Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, Westley Weimer

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

265 Citations (Scopus)

Abstract

There are more bugs in real-world programs than human programmers can realistically address. This paper evaluates two research questions: "What fraction of bugs can be repaired automatically?" and "How much does it cost to repair a bug automatically?" In previous work, we presented GenProg, which uses genetic programming to repair defects in off-the-shelf C programs. To answer these questions, we: (1) propose novel algorithmic improvements to GenProg that allow it to scale to large programs and find repairs 68% more often, (2) exploit GenProg's inherent parallelism using cloud computing resources to provide grounded, human-competitive cost measurements, and (3) generate a large, indicative benchmark set to use for systematic evaluations. We evaluate GenProg on 105 defects from 8 open-source programs totaling 5.1 million lines of code and involving 10,193 test cases. GenProg automatically repairs 55 of those 105 defects. To our knowledge, this evaluation is the largest available of its kind, and is often two orders of magnitude larger than previous work in terms of code or test suite size or defect count. Public cloud computing prices allow our 105 runs to be reproduced for $403; a successful repair completes in 96 minutes and costs $7.32, on average.
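The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted above; the function and constant names are hypothetical, introduced for illustration. Dividing the $403 total by the 55 repaired defects gives a value close to the reported $7.32 average, which suggests (as an inference, not something the abstract states) that the per-repair figure may spread the cost of unsuccessful runs across the successful ones.

```python
# Figures quoted in the abstract; the names are illustrative, not from the paper.
DEFECTS_EVALUATED = 105
DEFECTS_REPAIRED = 55
TOTAL_COST_USD = 403.0            # cost to reproduce all 105 runs
REPORTED_COST_PER_REPAIR = 7.32   # reported average cost of a successful repair


def repair_rate(repaired: int, evaluated: int) -> float:
    """Fraction of evaluated defects that were repaired."""
    return repaired / evaluated


def cost_per_unit(total_cost: float, count: int) -> float:
    """Total cost spread evenly over `count` items."""
    return total_cost / count


if __name__ == "__main__":
    rate = repair_rate(DEFECTS_REPAIRED, DEFECTS_EVALUATED)
    per_run = cost_per_unit(TOTAL_COST_USD, DEFECTS_EVALUATED)
    # Assumption: failed runs are charged to the successful repairs.
    per_repair = cost_per_unit(TOTAL_COST_USD, DEFECTS_REPAIRED)

    print(f"Repair rate:           {rate:.1%}")         # 52.4%
    print(f"Cost per run:          ${per_run:.2f}")     # $3.84
    print(f"Cost per repair (est): ${per_repair:.2f}")  # $7.33
    print(f"Reported per repair:   ${REPORTED_COST_PER_REPAIR:.2f}")
```

The one-cent gap between the estimate and the reported $7.32 is consistent with the $403 total itself being rounded. The "$8 each" in the title appears to be the same figure rounded up.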

Original language: English (US)
Title of host publication: Proceedings - 34th International Conference on Software Engineering, ICSE 2012
Pages: 3-13
Number of pages: 11
DOIs: https://doi.org/10.1109/ICSE.2012.6227211
State: Published - Jul 30 2012
Externally published: Yes
Event: 34th International Conference on Software Engineering, ICSE 2012 - Zurich, Switzerland
Duration: Jun 2 2012 to Jun 9 2012

Other

Other: 34th International Conference on Software Engineering, ICSE 2012
Country: Switzerland
City: Zurich
Period: 6/2/12 to 6/9/12

Fingerprint

Repair
Defects
Cloud computing
Costs
Genetic programming

Keywords

  • automated program repair
  • cloud computing
  • genetic programming

ASJC Scopus subject areas

  • Software

Cite this

Le Goues, C., Dewey-Vogt, M., Forrest, S., & Weimer, W. (2012). A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In Proceedings - 34th International Conference on Software Engineering, ICSE 2012 (pp. 3-13). [6227211] https://doi.org/10.1109/ICSE.2012.6227211

@inproceedings{409a455acf1d4658ae839bcb581e0515,
title = "A systematic study of automated program repair: Fixing 55 out of 105 bugs for \$8 each",
abstract = "There are more bugs in real-world programs than human programmers can realistically address. This paper evaluates two research questions: {"}What fraction of bugs can be repaired automatically?{"} and {"}How much does it cost to repair a bug automatically?{"} In previous work, we presented GenProg, which uses genetic programming to repair defects in off-the-shelf C programs. To answer these questions, we: (1) propose novel algorithmic improvements to GenProg that allow it to scale to large programs and find repairs 68{\%} more often, (2) exploit GenProg's inherent parallelism using cloud computing resources to provide grounded, human-competitive cost measurements, and (3) generate a large, indicative benchmark set to use for systematic evaluations. We evaluate GenProg on 105 defects from 8 open-source programs totaling 5.1 million lines of code and involving 10,193 test cases. GenProg automatically repairs 55 of those 105 defects. To our knowledge, this evaluation is the largest available of its kind, and is often two orders of magnitude larger than previous work in terms of code or test suite size or defect count. Public cloud computing prices allow our 105 runs to be reproduced for \$403; a successful repair completes in 96 minutes and costs \$7.32, on average.",
keywords = "automated program repair, cloud computing, genetic programming",
author = "{Le Goues}, Claire and Dewey-Vogt, Michael and Forrest, Stephanie and Weimer, Westley",
year = "2012",
month = "7",
day = "30",
doi = "10.1109/ICSE.2012.6227211",
language = "English (US)",
isbn = "9781467310673",
pages = "3--13",
booktitle = "Proceedings - 34th International Conference on Software Engineering, ICSE 2012",

}

TY - GEN

T1 - A systematic study of automated program repair

T2 - Fixing 55 out of 105 bugs for $8 each

AU - Le Goues, Claire

AU - Dewey-Vogt, Michael

AU - Forrest, Stephanie

AU - Weimer, Westley

PY - 2012/7/30

Y1 - 2012/7/30

AB - There are more bugs in real-world programs than human programmers can realistically address. This paper evaluates two research questions: "What fraction of bugs can be repaired automatically?" and "How much does it cost to repair a bug automatically?" In previous work, we presented GenProg, which uses genetic programming to repair defects in off-the-shelf C programs. To answer these questions, we: (1) propose novel algorithmic improvements to GenProg that allow it to scale to large programs and find repairs 68% more often, (2) exploit GenProg's inherent parallelism using cloud computing resources to provide grounded, human-competitive cost measurements, and (3) generate a large, indicative benchmark set to use for systematic evaluations. We evaluate GenProg on 105 defects from 8 open-source programs totaling 5.1 million lines of code and involving 10,193 test cases. GenProg automatically repairs 55 of those 105 defects. To our knowledge, this evaluation is the largest available of its kind, and is often two orders of magnitude larger than previous work in terms of code or test suite size or defect count. Public cloud computing prices allow our 105 runs to be reproduced for $403; a successful repair completes in 96 minutes and costs $7.32, on average.

KW - automated program repair

KW - cloud computing

KW - genetic programming

UR - http://www.scopus.com/inward/record.url?scp=84864264923&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864264923&partnerID=8YFLogxK

U2 - 10.1109/ICSE.2012.6227211

DO - 10.1109/ICSE.2012.6227211

M3 - Conference contribution

AN - SCOPUS:84864264923

SN - 9781467310673

SP - 3

EP - 13

BT - Proceedings - 34th International Conference on Software Engineering, ICSE 2012

ER -