NEMESIS: A software approach for computing in presence of soft errors

Moslem Didehban, Aviral Shrivastava, Sai Ram Dheeraj Lokam

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Scopus citations

Abstract

Soft errors are considered the main reliability challenge for sub-nanoscale microprocessors. Software-level soft error resilience schemes are desirable because they require no hardware modifications and their protection can be tuned to the application's requirements. However, existing software-level error-tolerance schemes do not provide a high level of protection. In this work, we present NEMESIS, a compiler-level, fine-grained soft error detection, diagnosis, and recovery technique that can provide a high degree of error resilience. NEMESIS runs three versions of the computation and detects soft errors by checking the results of all memory write and branch operations. On a mismatch, the NEMESIS recovery routine reverts the effect of the error on the architectural state of the program, and the program resumes normal execution. Our extensive μ-architectural-level fault injection experiments show that the NEMESIS transformation is able to detect all soft errors and recover from 97% of detected errors.
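The detect-then-recover idea described in the abstract can be illustrated with a small sketch. This is not the NEMESIS compiler transformation, whose details the abstract does not give. It only shows the general pattern of comparing three redundant results before a memory write, and it assumes a simple majority vote as the recovery step. The names `vote`, `checked_store`, and `UnrecoverableError` are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


class UnrecoverableError(Exception):
    """Raised when no two of the three redundant results agree."""


def vote(a: int, b: int, c: int) -> int:
    """Return the value that at least two of the three copies agree on."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise UnrecoverableError(f"no majority among {a}, {b}, {c}")
    return value


def checked_store(memory: Dict[int, int], addr: int, copies: Sequence[int]) -> bool:
    """Compare three redundant results before committing a store.

    Returns True if a mismatch was detected (and repaired by majority vote),
    False if all three copies agreed.
    """
    a, b, c = copies
    mismatch = not (a == b == c)
    # Assumption: recovery is modeled as a majority vote, for illustration only.
    memory[addr] = vote(a, b, c) if mismatch else a
    return mismatch


def compute(x: int) -> int:
    """Stand-in for the protected computation."""
    return x * x + 1


if __name__ == "__main__":
    memory: Dict[int, int] = {}
    results = [compute(6) for _ in range(3)]
    results[1] ^= 1 << 3  # simulate a single bit flip in one redundant copy
    detected = checked_store(memory, 0, results)
    print(detected, memory[0])
```

In this sketch a single corrupted copy is both detected and outvoted before the value reaches memory, while three mutually different copies raise `UnrecoverableError`, loosely mirroring the distinction the abstract draws between detected and recovered errors.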

Original language: English (US)
Title of host publication: 2017 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 297-304
Number of pages: 8
ISBN (Electronic): 9781538630938
State: Published - Dec 13 2017
Event: 36th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2017 - Irvine, United States
Duration: Nov 13 2017 to Nov 16 2017

Publication series

Name: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
Volume: 2017-November
ISSN (Print): 1092-3152

Other

Other: 36th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2017
Country/Territory: United States
City: Irvine
Period: 11/13/17 to 11/16/17

Keywords

  • Compiler Optimization
  • Reliability
  • Silent Data Corruption
  • Soft Errors

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
