A genetic programming approach to automated software repair

Stephanie Forrest, Thanhvu Nguyen, Westley Weimer, Claire Le Goues

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

119 Citations (Scopus)

Abstract

Genetic programming is combined with program analysis methods to repair bugs in off-the-shelf legacy C programs. Fitness is defined using negative test cases that exercise the bug to be repaired and positive test cases that encode program requirements. Once a successful repair is discovered, structural differencing algorithms and delta debugging methods are used to minimize its size. Several modifications to the GP technique contribute to its success: (1) genetic operations are localized to the nodes along the execution path of the negative test case; (2) high-level statements are represented as single nodes in the program tree; (3) genetic operators use existing code in other parts of the program, so new code does not need to be invented. The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.
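
The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the list-of-statements program model, the names `fitness`, `mutate`, and `minimize`, the test weights, and the uniform choice along the path are all assumptions made for illustration, and `minimize` is a greedy one-at-a-time reduction standing in for the structural differencing and delta debugging step.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical model: a program is a list of statements, and each
# high-level statement is treated as one node.
Program = List[str]
Test = Callable[[Program], bool]


def fitness(program: Program,
            positive_tests: Sequence[Test],
            negative_tests: Sequence[Test],
            w_pos: float = 1.0,
            w_neg: float = 10.0) -> float:
    """Weighted count of passed tests (the weights are illustrative assumptions)."""
    passed_pos = sum(1 for t in positive_tests if t(program))
    passed_neg = sum(1 for t in negative_tests if t(program))
    return w_pos * passed_pos + w_neg * passed_neg


def mutate(program: Program, path: Sequence[int], rng: random.Random) -> Program:
    """Apply one statement-level edit at an index on the negative-test path.

    Inserted or swapped-in statements are copied from the program itself,
    so no new code is invented. The input program is left unchanged.
    """
    child = list(program)
    if not path:
        return child
    i = rng.choice(list(path))
    op = rng.choice(("delete", "insert", "swap"))
    if op == "delete":
        child[i] = ""  # empty statement keeps the other indices stable
    elif op == "insert":
        child.insert(i + 1, rng.choice(program))
    else:
        j = rng.randrange(len(program))
        child[i], child[j] = child[j], child[i]
    return child


def minimize(edits: List[str],
             still_repairs: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop every edit whose removal keeps the repair valid.

    A simplification of delta debugging, not the full ddmin algorithm.
    """
    kept = list(edits)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            if still_repairs(candidate):
                kept = candidate
                changed = True
                break
    return kept
```

In a full search loop, `mutate` would generate variants, `fitness` would rank them, and `minimize` would be applied once a variant passes every positive and negative test. Because an insert shifts later statement indices, the path would have to be recomputed for each new variant under this model.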

Original language: English (US)
Title of host publication: Proceedings of the 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009
Pages: 947-954
Number of pages: 8
DOI: https://doi.org/10.1145/1569901.1570031
State: Published - Dec 31 2009
Externally published: Yes
Event: 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009 - Montreal, QC, Canada
Duration: Jul 8 2009 to Jul 12 2009

Other

Other: 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009
Country: Canada
City: Montreal, QC
Period: 7/8/09 to 7/12/09

Keywords

  • Genetic programming
  • Software engineering
  • Software repair

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Theoretical Computer Science

Cite this

Forrest, S., Nguyen, T., Weimer, W., & Le Goues, C. (2009). A genetic programming approach to automated software repair. In Proceedings of the 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009 (pp. 947-954) https://doi.org/10.1145/1569901.1570031

@inproceedings{f14acd6c10184fdbbcded5da58e599e6,
title = "A genetic programming approach to automated software repair",
abstract = "Genetic programming is combined with program analysis methods to repair bugs in off-the-shelf legacy C programs. Fitness is defined using negative test cases that exercise the bug to be repaired and positive test cases that encode program requirements. Once a successful repair is discovered, structural differencing algorithms and delta debugging methods are used to minimize its size. Several modifications to the GP technique contribute to its success: (1) genetic operations are localized to the nodes along the execution path of the negative test case; (2) high-level statements are represented as single nodes in the program tree; (3) genetic operators use existing code in other parts of the program, so new code does not need to be invented. The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.",
keywords = "Genetic programming, Software engineering, Software repair",
author = "Stephanie Forrest and Thanhvu Nguyen and Westley Weimer and {Le Goues}, Claire",
year = "2009",
month = "12",
day = "31",
doi = "10.1145/1569901.1570031",
language = "English (US)",
isbn = "9781605583259",
pages = "947--954",
booktitle = "Proceedings of the 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009",

}

TY - GEN
T1 - A genetic programming approach to automated software repair
AU - Forrest, Stephanie
AU - Nguyen, Thanhvu
AU - Weimer, Westley
AU - Le Goues, Claire
PY - 2009/12/31
Y1 - 2009/12/31
AB - Genetic programming is combined with program analysis methods to repair bugs in off-the-shelf legacy C programs. Fitness is defined using negative test cases that exercise the bug to be repaired and positive test cases that encode program requirements. Once a successful repair is discovered, structural differencing algorithms and delta debugging methods are used to minimize its size. Several modifications to the GP technique contribute to its success: (1) genetic operations are localized to the nodes along the execution path of the negative test case; (2) high-level statements are represented as single nodes in the program tree; (3) genetic operators use existing code in other parts of the program, so new code does not need to be invented. The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.

KW - Genetic programming
KW - Software engineering
KW - Software repair
UR - http://www.scopus.com/inward/record.url?scp=72749113538&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72749113538&partnerID=8YFLogxK
U2 - 10.1145/1569901.1570031
DO - 10.1145/1569901.1570031
M3 - Conference contribution
AN - SCOPUS:72749113538
SN - 9781605583259
SP - 947
EP - 954
BT - Proceedings of the 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009
ER -