UnSync-CMP

Multicore CMP architecture for energy-efficient soft-error reliability

Reiley Jeyapaul, Fei Hong, Abhishek Rhisheekesan, Aviral Shrivastava, Kyoungwoo Lee

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Shrinking device dimensions, increasing transistor densities, and smaller timing windows expose processors to soft errors induced by charge-carrying particles. Since these factors are consequences of the inevitable advancement in processor technology, the industry has been forced to improve reliability in general-purpose chip multiprocessors (CMPs). With the availability of increased hardware resources, redundancy-based techniques are the most promising methods to eradicate soft-error failures in CMP systems. In this work, we propose a novel customizable and redundant CMP architecture (UnSync) that utilizes hardware-based detection mechanisms (most of which are readily available in the processor) to reduce overheads during error-free execution. In the presence of errors (which are infrequent), a recovery mechanism based on always-forward execution provides resilience. The architecture framework inherently supports customization of the redundancy, and thereby provides a means to achieve performance-reliability tradeoffs in many-core systems. We provide a redundancy-based soft-error-resilient CMP architecture for both write-through and write-back cache configurations. We design a detailed RTL model of our UnSync architecture and perform hardware synthesis to compare its hardware (power/area) overheads with those of the Reunion technique, a state-of-the-art redundant multicore architecture. We also perform cycle-accurate simulations over a wide range of SPEC2000 and MiBench benchmarks to evaluate the performance efficiency achieved over that of the Reunion architecture. Experimental results show that our UnSync architecture reduces power consumption by 34.5 percent and improves performance by up to 20 percent with 13.3 percent less area overhead, when compared to the Reunion architecture for the same level of reliability achieved.
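The abstract names the two ingredients, hardware-based detection and forward recovery, without giving their mechanics. The following is only a toy sketch of that general idea, not the authors' RTL design: it assumes a parity bit as the detection mechanism, a single accumulator as the entire core state, and recovery by copying state from the error-free core. All names (`Core`, `run_redundant`, `flip_at`, `flip_bit`) are hypothetical.

```python
from dataclasses import dataclass


def parity(value: int) -> int:
    """Even-parity bit of a non-negative integer."""
    return bin(value).count("1") % 2


@dataclass
class Core:
    """Toy core (assumption): one 16-bit accumulator guarded by a parity bit."""
    acc: int = 0
    acc_parity: int = 0
    pc: int = 0  # index of the next instruction

    def step(self, operand: int) -> None:
        # Execute one "add" instruction and refresh the stored parity.
        self.acc = (self.acc + operand) & 0xFFFF
        self.acc_parity = parity(self.acc)
        self.pc += 1

    def error_detected(self) -> bool:
        # Hardware-style check: stored parity no longer matches the data.
        return parity(self.acc) != self.acc_parity

    def copy_state_from(self, other: "Core") -> None:
        self.acc, self.acc_parity, self.pc = other.acc, other.acc_parity, other.pc


def run_redundant(program, flip_at=None, flip_bit=0):
    """Run `program` on two redundant cores.

    If `flip_at` is given, one bit of core B's accumulator is flipped after
    that instruction index to emulate a soft error.
    """
    a, b = Core(), Core()
    recoveries = 0
    for i, operand in enumerate(program):
        a.step(operand)
        b.step(operand)
        if flip_at == i:
            b.acc ^= 1 << flip_bit  # injected single-bit upset
        # Forward recovery: overwrite the faulty core with the clean one and
        # keep going; no instruction is rolled back or re-executed.
        if b.error_detected():
            b.copy_state_from(a)
            recoveries += 1
        elif a.error_detected():
            a.copy_state_from(b)
            recoveries += 1
    return a.acc, b.acc, recoveries
```

In this sketch the error-free path costs only the parity check, which mirrors the abstract's point that detection overhead is kept low when no error occurs; the state copy is paid only on the infrequent error.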

Original language: English (US)
Article number: 6410312
Pages (from-to): 254-263
Number of pages: 10
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 25
Issue number: 1
DOI: 10.1109/TPDS.2013.14
State: Published - Jan 1 2014

Fingerprint

Redundancy
Hardware
Transistors
Electric power utilization
Availability
Recovery
Industry

Keywords

  • CMP
  • Multicore architecture
  • power efficiency
  • reliability
  • soft error

ASJC Scopus subject areas

  • Hardware and Architecture
  • Signal Processing
  • Computational Theory and Mathematics

Cite this

UnSync-CMP: Multicore CMP architecture for energy-efficient soft-error reliability. / Jeyapaul, Reiley; Hong, Fei; Rhisheekesan, Abhishek; Shrivastava, Aviral; Lee, Kyoungwoo.

In: IEEE Transactions on Parallel and Distributed Systems, Vol. 25, No. 1, 6410312, 01.01.2014, p. 254-263.

@article{614d780c865445538f0e77fbd5c335bf,
title = "UnSync-CMP: Multicore CMP architecture for energy-efficient soft-error reliability",
abstract = "Reducing device dimensions, increasing transistor densities, and smaller timing windows, expose the vulnerability of processors to soft errors induced by charge carrying particles. Since these factors are only consequences of the inevitable advancement in processor technology, the industry has been forced to improve reliability on general purpose chip multiprocessors (CMPs). With the availability of increased hardware resources, redundancy-based techniques are the most promising methods to eradicate soft-error failures in CMP systems. In this work, we propose a novel customizable and redundant CMP architecture (UnSync) that utilizes hardware-based detection mechanisms (most of which are readily available in the processor), to reduce overheads during error-free executions. In the presence of errors (which are infrequent), the always forward execution enabled recovery mechanism provides for resilience in the system. The inherent nature of our architecture framework supports customization of the redundancy, and thereby provides means to achieve possible performance-reliability tradeoffs in many-core systems. We provide a redundancy-based soft-error resilient CMP architecture for both write-through and write-back cache configurations. We design a detailed RTL model of our UnSync architecture and perform hardware synthesis to compare the hardware (power/area) overheads incurred. We compare the same with those of the Reunion technique, a state-of-the-art redundant multicore architecture. We also perform cycle-accurate simulations over a wide range of SPEC2000, and MiBench benchmarks to evaluate the performance efficiency achieved over that of the Reunion architecture. Experimental results show that, our UnSync architecture reduces power consumption by 34.5 percent and improves performance by up to 20 percent with 13.3 percent less area overhead, when compared to the Reunion architecture for the same level of reliability achieved.",
keywords = "CMP, Multicore architecture, power efficiency, reliability, soft error",
author = "Reiley Jeyapaul and Fei Hong and Abhishek Rhisheekesan and Aviral Shrivastava and Kyoungwoo Lee",
year = "2014",
month = "1",
day = "1",
doi = "10.1109/TPDS.2013.14",
language = "English (US)",
volume = "25",
pages = "254--263",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "IEEE Computer Society",
number = "1",

}

TY - JOUR

T1 - UnSync-CMP

T2 - Multicore CMP architecture for energy-efficient soft-error reliability

AU - Jeyapaul, Reiley

AU - Hong, Fei

AU - Rhisheekesan, Abhishek

AU - Shrivastava, Aviral

AU - Lee, Kyoungwoo

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Reducing device dimensions, increasing transistor densities, and smaller timing windows, expose the vulnerability of processors to soft errors induced by charge carrying particles. Since these factors are only consequences of the inevitable advancement in processor technology, the industry has been forced to improve reliability on general purpose chip multiprocessors (CMPs). With the availability of increased hardware resources, redundancy-based techniques are the most promising methods to eradicate soft-error failures in CMP systems. In this work, we propose a novel customizable and redundant CMP architecture (UnSync) that utilizes hardware-based detection mechanisms (most of which are readily available in the processor), to reduce overheads during error-free executions. In the presence of errors (which are infrequent), the always forward execution enabled recovery mechanism provides for resilience in the system. The inherent nature of our architecture framework supports customization of the redundancy, and thereby provides means to achieve possible performance-reliability tradeoffs in many-core systems. We provide a redundancy-based soft-error resilient CMP architecture for both write-through and write-back cache configurations. We design a detailed RTL model of our UnSync architecture and perform hardware synthesis to compare the hardware (power/area) overheads incurred. We compare the same with those of the Reunion technique, a state-of-the-art redundant multicore architecture. We also perform cycle-accurate simulations over a wide range of SPEC2000, and MiBench benchmarks to evaluate the performance efficiency achieved over that of the Reunion architecture. Experimental results show that, our UnSync architecture reduces power consumption by 34.5 percent and improves performance by up to 20 percent with 13.3 percent less area overhead, when compared to the Reunion architecture for the same level of reliability achieved.

KW - CMP

KW - Multicore architecture

KW - power efficiency

KW - reliability

KW - soft error

UR - http://www.scopus.com/inward/record.url?scp=84930653090&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84930653090&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2013.14

DO - 10.1109/TPDS.2013.14

M3 - Article

VL - 25

SP - 254

EP - 263

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 1

M1 - 6410312

ER -