T-Fuzz: Fuzzing by Program Transformation

Hui Peng; Yan Shoshitaishvili; Mathias Payer

doi:10.1109/SP.2018.00056

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

225 Scopus citations

Abstract

Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a lightweight, dynamic-tracing-based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, the LAVA-M dataset, and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, T-Fuzz found 3 new bugs in previously-fuzzed programs and libraries.
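The loop described in the abstract (fuzz until stuck, find the checks the inputs fail, remove them, fuzz the transformed program, then confirm crashes against the original) can be illustrated with a toy model. The sketch below is not the T-Fuzz implementation: the target program, the check table, and every function name are hypothetical. A single-byte mutation fuzzer stands in for a coverage-guided fuzzer, and a hand-written "repair" function per check stands in for the symbolic execution-based step that the paper uses to reproduce bugs in the original program.

```python
# Toy sketch of the T-Fuzz loop; all names here are hypothetical.

MAGIC = b"TFUZ"

# Each named check pairs a predicate with a "repair" that rewrites an input
# so the predicate holds (a hand-written stand-in for constraint solving).
CHECKS = {
    "magic": (lambda d: d[:4] == MAGIC, lambda d: MAGIC + d[4:]),
}


def target(data, removed=frozenset(), trace=None):
    """Toy program. Checks listed in `removed` are skipped (the transformation)."""
    for name, (passes, _) in CHECKS.items():
        ok = passes(data)
        if trace is not None:
            trace.add((name, ok))  # record which way the check went
        if not ok and name not in removed:
            return "rejected"
    if data[4] == 0xFF:  # bug hidden behind the checks
        raise RuntimeError("crash")
    return "ok"


def fuzz(seed, removed=frozenset()):
    """Single-byte mutation fuzzer: returns (crashing inputs, check trace)."""
    crashes, trace = [], set()
    for pos in range(len(seed)):
        for val in range(256):
            data = seed[:pos] + bytes([val]) + seed[pos + 1:]
            try:
                target(data, removed, trace)
            except RuntimeError:
                crashes.append(data)
    return crashes, trace


def failed_checks(trace):
    """Checks that the fuzzer only ever failed: candidates for removal."""
    failed = {name for name, ok in trace if not ok}
    passed = {name for name, ok in trace if ok}
    return failed - passed


def reproduce(crash, removed):
    """Repair the input for each removed check, then re-run the original."""
    data = crash
    for name in removed:
        data = CHECKS[name][1](data)
    try:
        target(data)
    except RuntimeError:
        return data  # true bug: crashes the untransformed program
    return None      # false positive: crash depended on the removed check


seed = bytes(8)
stuck_crashes, trace = fuzz(seed)           # every input fails the magic check
removed = frozenset(failed_checks(trace))   # checks to remove
crashes, _ = fuzz(seed, removed)            # fuzz the transformed program
confirmed = [reproduce(c, removed) for c in crashes]
```

In this toy, no single-byte mutation of the all-zero seed can satisfy the four-byte magic check, so the first run finds nothing. Once the check is removed, the mutation that sets byte 4 to 0xFF reaches the guarded code and crashes, and the repaired input confirms the crash against the original program. A crash whose conditions conflicted with a removed check would fail to reproduce and be discarded as a false positive.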

Original language	English (US)
Title of host publication	Proceedings - 2018 IEEE Symposium on Security and Privacy, SP 2018
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	697-710
Number of pages	14
ISBN (Electronic)	9781538643525
DOIs	https://doi.org/10.1109/SP.2018.00056
State	Published - Jul 23 2018
Event	39th IEEE Symposium on Security and Privacy, SP 2018 - San Francisco, United States Duration: May 21 2018 → May 23 2018

Publication series

Name	Proceedings - IEEE Symposium on Security and Privacy
Volume	2018-May
ISSN (Print)	1081-6011

Other

Other	39th IEEE Symposium on Security and Privacy, SP 2018
Country/Territory	United States
City	San Francisco
Period	5/21/18 → 5/23/18

Keywords

Bug finding
Fuzz
Program Analysis

ASJC Scopus subject areas

Safety, Risk, Reliability and Quality
Software
Computer Networks and Communications

Access to Document

10.1109/SP.2018.00056

Cite this

T-Fuzz: Fuzzing by Program Transformation. / Peng, Hui; Shoshitaishvili, Yan; Payer, Mathias.
Proceedings - 2018 IEEE Symposium on Security and Privacy, SP 2018. Institute of Electrical and Electronics Engineers Inc., 2018. p. 697-710 8418632 (Proceedings - IEEE Symposium on Security and Privacy; Vol. 2018-May).

Peng, H, Shoshitaishvili, Y & Payer, M 2018, T-Fuzz: Fuzzing by Program Transformation. in Proceedings - 2018 IEEE Symposium on Security and Privacy, SP 2018., 8418632, Proceedings - IEEE Symposium on Security and Privacy, vol. 2018-May, Institute of Electrical and Electronics Engineers Inc., pp. 697-710, 39th IEEE Symposium on Security and Privacy, SP 2018, San Francisco, United States, 5/21/18. https://doi.org/10.1109/SP.2018.00056

@inproceedings{1d4739b63f964f0d9d3ec9882892a997,
title = "T-Fuzz: Fuzzing by Program Transformation",
abstract = "Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a lightweight, dynamic-tracing-based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, the LAVA-M dataset, and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, T-Fuzz found 3 new bugs in previously-fuzzed programs and libraries.",
keywords = "Bug finding, Fuzz, Program Analysis",
author = "Hui Peng and Yan Shoshitaishvili and Mathias Payer",
note = "Publisher Copyright: {\textcopyright} 2018 IEEE.; 39th IEEE Symposium on Security and Privacy, SP 2018 ; Conference date: 21-05-2018 Through 23-05-2018",
year = "2018",
month = jul,
day = "23",
doi = "10.1109/SP.2018.00056",
language = "English (US)",
series = "Proceedings - IEEE Symposium on Security and Privacy",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "697--710",
booktitle = "Proceedings - 2018 IEEE Symposium on Security and Privacy, SP 2018",
}

TY  - GEN
T1  - T-Fuzz
T2  - 39th IEEE Symposium on Security and Privacy, SP 2018
AU  - Peng, Hui
AU  - Shoshitaishvili, Yan
AU  - Payer, Mathias
PY  - 2018/7/23
Y1  - 2018/7/23
AB  - Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a lightweight, dynamic-tracing-based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, the LAVA-M dataset, and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, T-Fuzz found 3 new bugs in previously-fuzzed programs and libraries.
KW  - Bug finding
KW  - Fuzz
KW  - Program Analysis
UR  - http://www.scopus.com/inward/record.url?scp=85051011382&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85051011382&partnerID=8YFLogxK
U2  - 10.1109/SP.2018.00056
DO  - 10.1109/SP.2018.00056
M3  - Conference contribution
AN  - SCOPUS:85051011382
T3  - Proceedings - IEEE Symposium on Security and Privacy
SP  - 697
EP  - 710
BT  - Proceedings - 2018 IEEE Symposium on Security and Privacy, SP 2018
PB  - Institute of Electrical and Electronics Engineers Inc.
Y2  - 21 May 2018 through 23 May 2018
ER  -
