TY - GEN
T1 - The Convergence of Source Code and Binary Vulnerability Discovery - A Case Study
AU - Mantovani, Alessandro
AU - Compagna, Luca
AU - Shoshitaishvili, Yan
AU - Balzarotti, Davide
N1 - Funding Information:
This research was partially supported by the Defense Advanced Research Projects Agency (DARPA) under grant agreements FA875019C0003 and N6600120C4020.
Publisher Copyright:
© 2022 ACM.
PY - 2022/5/30
Y1 - 2022/5/30
AB - Decompilers are tools designed to recover a high-level language representation (typically in C code) from program binaries. Over the past five years, decompilers have improved enormously, not only in terms of the readability of the produced pseudocode, but also in terms of similarity of the recovered representation to the original source code. Although decompilers are routinely used by reverse engineers in different disciplines (e.g., to support vulnerability discovery or malware analysis), they are not yet adopted to produce input for source-code static analysis tools. In particular, source-code vulnerability discovery and binary vulnerability discovery remain today two very different areas of research, despite the fact that decompilers could potentially bridge this gap and enable source-code analysis on binary files. In this paper, we conducted a number of experiments on real-world vulnerabilities to evaluate the feasibility of this approach. In particular, our measurements are intended to show how the differences between original and decompiled code impact the accuracy of static analysis tools. Remarkably, our results show that in 71% of the cases, the same vulnerabilities can be detected by running the static analyzers on the decompiled code, even though for several cases we observe a steep increase in the number of false positives. To understand the reasons behind these differences, we manually investigated all cases and we identified a number of root causes that affected the ability of static tools to 'understand' the generated code.
KW - decompiler
KW - reversing
KW - sast
KW - vulnerability
UR - http://www.scopus.com/inward/record.url?scp=85133167123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133167123&partnerID=8YFLogxK
U2 - 10.1145/3488932.3497764
DO - 10.1145/3488932.3497764
M3 - Conference contribution
AN - SCOPUS:85133167123
T3 - ASIA CCS 2022 - Proceedings of the 2022 ACM Asia Conference on Computer and Communications Security
SP - 602
EP - 615
BT - ASIA CCS 2022 - Proceedings of the 2022 ACM Asia Conference on Computer and Communications Security
PB - Association for Computing Machinery, Inc
T2 - 17th ACM ASIA Conference on Computer and Communications Security 2022, ASIA CCS 2022
Y2 - 30 May 2022 through 3 June 2022
ER -