REPT: Reverse debugging of failures in deployed software

Weidong Cui, Xinyang Ge, Baris Kasikci, Ben Niu, Upamanyu Sharma, Ruoyu Wang, Insu Yun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

52 Scopus citations

Abstract

Debugging software failures in deployed systems is important because they impact real users and customers. However, debugging such failures is notoriously hard in practice because developers have to rely on limited information such as memory dumps. The execution history is usually unavailable because high-fidelity program tracing is not affordable in deployed systems. In this paper, we present REPT, a practical system that enables reverse debugging of software failures in deployed systems. REPT reconstructs the execution history with high fidelity by combining online lightweight hardware tracing of a program's control flow with offline binary analysis that recovers its data flow. It is seemingly impossible to recover data values thousands of instructions before the failure due to information loss and concurrent execution. REPT tackles these challenges by constructing a partial execution order based on timestamps logged by hardware and iteratively performing forward and backward execution with error correction. We design and implement REPT, deploy it on Microsoft Windows, and integrate it into WinDbg. We evaluate REPT on 16 real-world bugs and show that it can recover data values accurately (92% on average) and efficiently (in less than 20 seconds) for these bugs. We also show that it enables effective reverse debugging for 14 bugs.
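
The iterative forward and backward execution mentioned in the abstract can be illustrated with a toy sketch. It is not REPT's implementation: the three-opcode instruction set, the register names, and the helpers `backward`, `forward`, `merge`, and `recover` are all hypothetical, the trace is single-threaded (so no timestamp-based partial ordering is needed), memory is not modeled, and the error-correction step is omitted. The sketch only shows why alternating the two directions recovers more values than a single backward pass from the memory dump.

```python
from typing import Dict, List, Optional, Tuple, Union

State = Dict[str, Optional[int]]            # register -> value, None = unknown
Instr = Tuple[str, str, Union[str, int]]    # (opcode, dst, src register or immediate)
MASK = 0xFFFFFFFF                           # toy 32-bit registers (assumption)


def backward(ins: Instr, after: State) -> State:
    """Infer what can be known before `ins`, given the state after it."""
    op, dst, src = ins
    before = dict(after)
    before[dst] = None                      # dst was overwritten or modified
    if op == "mov" and after[dst] is not None:
        before[src] = after[dst]            # src held the copied value
    elif op == "add" and dst != src:
        if after[dst] is not None and after[src] is not None:
            before[dst] = (after[dst] - after[src]) & MASK
    return before


def forward(ins: Instr, before: State) -> State:
    """Infer what can be known after `ins`, given the state before it."""
    op, dst, src = ins
    after = dict(before)
    if op == "mov_imm":
        after[dst] = int(src) & MASK
    elif op == "mov":
        after[dst] = before[src]
    elif op == "add":
        a, b = before[dst], before[src]
        after[dst] = None if a is None or b is None else (a + b) & MASK
    return after


def merge(known: State, inferred: State) -> bool:
    """Fill unknown slots in `known`; a full system would also detect conflicts."""
    changed = False
    for reg, val in inferred.items():
        if known[reg] is None and val is not None:
            known[reg] = val
            changed = True
    return changed


def recover(trace: List[Instr], dump: State) -> List[State]:
    """Return states[i] = register state before trace[i]; states[-1] is the dump."""
    n = len(trace)
    states: List[State] = [{r: None for r in dump} for _ in range(n)] + [dict(dump)]
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for i in range(n - 1, -1, -1):      # backward pass from the dump
            changed |= merge(states[i], backward(trace[i], states[i + 1]))
        for i in range(n):                  # forward pass reuses recovered values
            changed |= merge(states[i + 1], forward(trace[i], states[i]))
    return states


# Control flow as a hardware trace would give it, plus the final register dump.
trace: List[Instr] = [
    ("mov_imm", "rax", 5),
    ("mov", "rbx", "rax"),
    ("add", "rbx", "rcx"),
    ("mov_imm", "rax", 0),
]
dump: State = {"rax": 0, "rbx": 12, "rcx": 7}
states = recover(trace, dump)
```

In this example the backward pass alone cannot tell what `rax` held before the final `mov_imm` overwrote it. It does recover `rax = 5` before the `mov`, and the following forward pass propagates that value to the two later points, which is the kind of gain that iterating in both directions is meant to provide.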

Original language: English (US)
Title of host publication: Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
Publisher: USENIX Association
Pages: 17-32
Number of pages: 16
ISBN (Electronic): 9781939133083
State: Published - 2018
Event: 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018 - Carlsbad, United States
Duration: Oct 8 2018 → Oct 10 2018

Publication series

Name: Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018

Conference

Conference: 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
Country/Territory: United States
City: Carlsbad
Period: 10/8/18 → 10/10/18

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Hardware and Architecture
