Understanding automatically-generated patches through symbolic invariant differences

Padraic Cashin; Carianne Martinez; Westley Weimer; Stephanie Forrest

doi:10.1109/ASE.2019.00046

Center for Biocomputing, Security and Society (BSS)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Scopus citations

Abstract

Developer trust is a major barrier to the deployment of automatically-generated patches. Understanding the effect of a patch is a key element of that trust. We find that differences in sets of formal invariants characterize patch differences and that implication-based distances in invariant space characterize patch similarities. When one patch is similar to another it often contains the same changes as well as additional behavior; this pattern is well-captured by logical implication. We can measure differences using a theorem prover to verify implications between invariants implied by separate programs. Although effective, theorem provers are computationally intensive; we find that string distance is an efficient heuristic for implication-based distance measurements. We propose to use distances between patches to construct a hierarchy highlighting patch similarities. We evaluated this approach on over 300 patches and found that it correctly categorizes programs into semantically similar clusters. Clustering programs reduces human effort by reducing the number of semantically distinct patches that must be considered by over 50%, thus reducing the time required to establish trust in automatically generated repairs.
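The pipeline the abstract describes (compare the invariants of two patches, approximate implication-based distance with string distance, then group patches by distance) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the invariant strings, the function names `string_distance`, `patch_distance`, and `cluster`, the use of `difflib.SequenceMatcher` as the string distance, and single-linkage merging with a fixed threshold. The abstract does not say which string metric or clustering algorithm the authors used.

```python
from difflib import SequenceMatcher
from itertools import combinations


def string_distance(a: str, b: str) -> float:
    """Normalized string distance in [0, 1]; 0.0 means identical text."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def patch_distance(inv_a: set, inv_b: set) -> float:
    """Heuristic distance between two patches, computed only over the
    invariants that differ (a cheap stand-in for implication checks)."""
    only_a = sorted(inv_a - inv_b)
    only_b = sorted(inv_b - inv_a)
    if not only_a and not only_b:
        return 0.0  # identical invariant sets
    return string_distance("\n".join(only_a), "\n".join(only_b))


def cluster(patches: dict, threshold: float) -> list:
    """Single-linkage agglomerative clustering (an assumed choice):
    repeatedly merge the two closest clusters until the closest pair
    is farther apart than `threshold`."""
    clusters = [{name} for name in sorted(patches)]

    def linkage(c1: set, c2: set) -> float:
        return min(patch_distance(patches[p], patches[q])
                   for p in c1 for q in c2)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    # Hypothetical invariants inferred for three candidate patches.
    example = {
        "p1": {"x > 0", "y == 1"},
        "p2": {"x > 0", "y == 1"},
        "p3": {"x > 0", "ZZZZ"},
    }
    for group in cluster(example, threshold=0.5):
        print(sorted(group))
```

In this sketch, swapping `patch_distance` for a function that queries a theorem prover about implications between the differing invariants would give the more expensive measure that the string heuristic approximates. A reviewer would then inspect one representative per cluster rather than every patch.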

Original language	English (US)
Title of host publication	Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	411-414
Number of pages	4
ISBN (Electronic)	9781728125084
DOIs	https://doi.org/10.1109/ASE.2019.00046
State	Published - Nov 2019
Event	34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019 - San Diego, United States
Duration	Nov 10 2019 → Nov 15 2019

Publication series

Name	Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019

Conference

Conference	34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Country/Territory	United States
City	San Diego
Period	11/10/19 → 11/15/19

Keywords

Automated Program Repair
Dynamic Invariants
Program Measurement

ASJC Scopus subject areas

Computer Networks and Communications
Software
Control and Optimization

Cite this

Cashin, P., Martinez, C., Weimer, W., & Forrest, S. (2019). Understanding automatically-generated patches through symbolic invariant differences. In Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019 (pp. 411-414). Article 8952219 (Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASE.2019.00046

@inproceedings{1d2cce092ffa4d68adc6297be8e8e4aa,
title = "Understanding automatically-generated patches through symbolic invariant differences",
abstract = "Developer trust is a major barrier to the deployment of automatically-generated patches. Understanding the effect of a patch is a key element of that trust. We find that differences in sets of formal invariants characterize patch differences and that implication-based distances in invariant space characterize patch similarities. When one patch is similar to another it often contains the same changes as well as additional behavior; this pattern is well-captured by logical implication. We can measure differences using a theorem prover to verify implications between invariants implied by separate programs. Although effective, theorem provers are computationally intensive; we find that string distance is an efficient heuristic for implication-based distance measurements. We propose to use distances between patches to construct a hierarchy highlighting patch similarities. We evaluated this approach on over 300 patches and found that it correctly categorizes programs into semantically similar clusters. Clustering programs reduces human effort by reducing the number of semantically distinct patches that must be considered by over 50%, thus reducing the time required to establish trust in automatically generated repairs.",
keywords = "Automated Program Repair, Dynamic Invariants, Program Measurement",
author = "Padraic Cashin and Carianne Martinez and Westley Weimer and Stephanie Forrest",
year = "2019",
month = nov,
doi = "10.1109/ASE.2019.00046",
language = "English (US)",
series = "Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "411--414",
booktitle = "Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019",
note = "34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019 ; Conference date: 10-11-2019 Through 15-11-2019",
}

TY - GEN
T1 - Understanding automatically-generated patches through symbolic invariant differences
AU - Cashin, Padraic
AU - Martinez, Carianne
AU - Weimer, Westley
AU - Forrest, Stephanie
PY - 2019/11
Y1 - 2019/11
N2 - Developer trust is a major barrier to the deployment of automatically-generated patches. Understanding the effect of a patch is a key element of that trust. We find that differences in sets of formal invariants characterize patch differences and that implication-based distances in invariant space characterize patch similarities. When one patch is similar to another it often contains the same changes as well as additional behavior; this pattern is well-captured by logical implication. We can measure differences using a theorem prover to verify implications between invariants implied by separate programs. Although effective, theorem provers are computationally intensive; we find that string distance is an efficient heuristic for implication-based distance measurements. We propose to use distances between patches to construct a hierarchy highlighting patch similarities. We evaluated this approach on over 300 patches and found that it correctly categorizes programs into semantically similar clusters. Clustering programs reduces human effort by reducing the number of semantically distinct patches that must be considered by over 50%, thus reducing the time required to establish trust in automatically generated repairs.
AB - Developer trust is a major barrier to the deployment of automatically-generated patches. Understanding the effect of a patch is a key element of that trust. We find that differences in sets of formal invariants characterize patch differences and that implication-based distances in invariant space characterize patch similarities. When one patch is similar to another it often contains the same changes as well as additional behavior; this pattern is well-captured by logical implication. We can measure differences using a theorem prover to verify implications between invariants implied by separate programs. Although effective, theorem provers are computationally intensive; we find that string distance is an efficient heuristic for implication-based distance measurements. We propose to use distances between patches to construct a hierarchy highlighting patch similarities. We evaluated this approach on over 300 patches and found that it correctly categorizes programs into semantically similar clusters. Clustering programs reduces human effort by reducing the number of semantically distinct patches that must be considered by over 50%, thus reducing the time required to establish trust in automatically generated repairs.
KW - Automated Program Repair
KW - Dynamic Invariants
KW - Program Measurement
UR - http://www.scopus.com/inward/record.url?scp=85078876646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078876646&partnerID=8YFLogxK
U2 - 10.1109/ASE.2019.00046
DO - 10.1109/ASE.2019.00046
M3 - Conference contribution
T3 - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
SP - 411
EP - 414
BT - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Y2 - 10 November 2019 through 15 November 2019
ER -
