A Compiler Technique for Processor-Wide Protection From Soft Errors in Multithreaded Environments

Moslem Didehban, Aviral Shrivastava

    Research output: Contribution to journal › Article

    Abstract

    Aggressive transistor scaling and near-threshold computing have rendered modern microprocessors susceptible to soft errors. Software approaches that protect computations against soft errors are desirable because they offer flexible protection and are suitable for mixed-criticality systems. In particular, fine-grained instruction-duplication techniques are deemed to be the most effective; however, many existing instruction-duplication techniques either suffer from many vulnerable intervals or are not suitable for multithreaded environments. In this paper, we present multithreaded near zero silent data corruption (MZDC), a software scheme that provides high-level processor-wide error coverage in multithreaded environments. MZDC duplicates all program instructions and uses a diagnosis block after replicated memory operations to overcome the inconsistency issue in a multithreaded environment. Statistical fault injection experiments on a dual-core ARM Cortex-A53 microprocessor simulated at the $\mu$-architectural level show that, on average, MZDC can achieve more than 37$\times$ better fault coverage than the state of the art.
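The abstract names two ingredients: duplicating instructions and running a diagnosis step after replicated memory operations. The sketch below is only a generic illustration of that idea in Python, not the MZDC implementation. The names `duplicated_add`, `checked_load`, `SoftErrorDetected`, the `flip_bit` fault-injection parameter, and the retry-based diagnosis policy are all assumptions made for illustration; the record does not describe how the paper's diagnosis block actually works.

```python
class SoftErrorDetected(Exception):
    """Raised when the two redundant copies of a value disagree (hypothetical name)."""


def duplicated_add(a, b, flip_bit=None):
    """Compute a + b twice and compare the two results.

    flip_bit is an illustrative fault-injection knob: if set, one bit of
    the primary result is flipped to emulate a soft error.
    """
    primary = a + b
    shadow = a + b  # redundant copy of the same computation
    if flip_bit is not None:
        primary ^= 1 << flip_bit  # emulate a single-bit upset
    if primary != shadow:
        raise SoftErrorDetected("redundant results disagree")
    return primary


def checked_load(read, max_retries=3):
    """Read a memory location twice and compare the two values.

    `read` is a zero-argument callable returning the current value.
    In a multithreaded program the two reads can legitimately differ
    because another thread stored to the location in between, so a
    mismatch is not immediately treated as an error. The assumed
    diagnosis policy here is simply to retry the pair of reads; only a
    mismatch that persists across all attempts is reported.
    """
    for _ in range(max_retries + 1):
        primary = read()
        shadow = read()
        if primary == shadow:
            return primary
    raise SoftErrorDetected("loads still disagree after diagnosis retries")


if __name__ == "__main__":
    print(duplicated_add(2, 3))
    # Simulate a concurrent store landing between the first pair of reads.
    values = iter([1, 2, 2, 2])
    print(checked_load(lambda: next(values)))
```

The point of the retry in `checked_load` is to show why plain duplicate-and-compare is not enough with multiple threads: a naive check would flag the concurrent store as a fault, whereas a diagnosis step can separate a benign inconsistency from a persistent mismatch.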

    Original language: English (US)
    Journal: IEEE Transactions on Reliability
    DOI: 10.1109/TR.2018.2793098
    State: Accepted/In press - Feb 9 2018

    Fingerprint

    Microprocessor chips
    Transistors
    Data storage equipment
    Experiments

    Keywords

    • Compiler transformation
    • Hardware
    • Instruction sets
    • Microprocessors
    • multithreading
    • Registers
    • reliability
    • soft errors
    • Transient analysis
    • transient faults

    ASJC Scopus subject areas

    • Safety, Risk, Reliability and Quality
    • Electrical and Electronic Engineering

    Cite this

    @article{6442a67289834867b87e151bfa77cf16,
    title = "A Compiler Technique for Processor-Wide Protection From Soft Errors in Multithreaded Environments",
    abstract = "Aggressive transistor scaling and near-threshold computing have rendered modern microprocessors susceptible to soft errors. Software approaches that protect computations against soft errors are desirable because they offer flexible protection and are suitable for mixed-criticality systems. In particular, fine-grained instruction-duplication techniques are deemed to be the most effective; however, many existing instruction-duplication techniques either suffer from many vulnerable intervals or are not suitable for multithreaded environments. In this paper, we present multithreaded near zero silent data corruption (MZDC), a software scheme that provides high-level processor-wide error coverage in multithreaded environments. MZDC duplicates all program instructions and uses a diagnosis block after replicated memory operations to overcome the inconsistency issue in a multithreaded environment. Statistical fault injection experiments on a dual-core ARM Cortex-A53 microprocessor simulated at the $\mu$-architectural level show that, on average, MZDC can achieve more than 37$\times$ better fault coverage than the state of the art.",
    keywords = "Compiler transformation, Hardware, Instruction sets, Microprocessors, multithreading, Registers, reliability, soft errors, Transient analysis, transient faults",
    author = "Moslem Didehban and Aviral Shrivastava",
    year = "2018",
    month = "2",
    day = "9",
    doi = "10.1109/TR.2018.2793098",
    language = "English (US)",
    journal = "IEEE Transactions on Reliability",
    issn = "0018-9529",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",

    }

    TY - JOUR

    T1 - A Compiler Technique for Processor-Wide Protection From Soft Errors in Multithreaded Environments

    AU - Didehban, Moslem

    AU - Shrivastava, Aviral

    PY - 2018/2/9

    Y1 - 2018/2/9

    AB - Aggressive transistor scaling and near-threshold computing have rendered modern microprocessors susceptible to soft errors. Software approaches that protect computations against soft errors are desirable because they offer flexible protection and are suitable for mixed-criticality systems. In particular, fine-grained instruction-duplication techniques are deemed to be the most effective; however, many existing instruction-duplication techniques either suffer from many vulnerable intervals or are not suitable for multithreaded environments. In this paper, we present multithreaded near zero silent data corruption (MZDC), a software scheme that provides high-level processor-wide error coverage in multithreaded environments. MZDC duplicates all program instructions and uses a diagnosis block after replicated memory operations to overcome the inconsistency issue in a multithreaded environment. Statistical fault injection experiments on a dual-core ARM Cortex-A53 microprocessor simulated at the $\mu$-architectural level show that, on average, MZDC can achieve more than 37$\times$ better fault coverage than the state of the art.

    KW - Compiler transformation

    KW - Hardware

    KW - Instruction sets

    KW - Microprocessors

    KW - multithreading

    KW - Registers

    KW - reliability

    KW - soft errors

    KW - Transient analysis

    KW - transient faults

    UR - http://www.scopus.com/inward/record.url?scp=85041823284&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85041823284&partnerID=8YFLogxK

    U2 - 10.1109/TR.2018.2793098

    DO - 10.1109/TR.2018.2793098

    M3 - Article

    JO - IEEE Transactions on Reliability

    JF - IEEE Transactions on Reliability

    SN - 0018-9529

    ER -