Accurate Inference with Inaccurate RRAM Devices: A Joint Algorithm-Design Solution

Gouranga Charan, Abinash Mohanty, Xiaocong Du, Gokul Krishnan, Rajiv V. Joshi, Yu Cao

Research output: Contribution to journal › Article


Resistive random-access memory (RRAM) is a promising technology for energy-efficient neuromorphic accelerators. However, when a pre-trained deep neural network (DNN) model is programmed to an RRAM array for inference, the model suffers from accuracy degradation due to RRAM non-idealities such as device variations, quantization error, and stuck-at faults. Previous solutions involving multiple read-verify-write (R-V-W) passes over the RRAM cells require cell-by-cell compensation and, thus, an excessive amount of processing time. In this paper, we propose a joint algorithm-design solution to mitigate the accuracy degradation: 1) We first leverage knowledge distillation (KD), where the model is trained with the RRAM non-idealities to increase its robustness under device variations. 2) Furthermore, we propose random sparse adaptation (RSA), which integrates a small on-chip memory with the main RRAM array for post-mapping adaptation; only the on-chip memory is updated to recover the inference accuracy. The joint algorithm-design solution achieves state-of-the-art accuracy of 99.41% for MNIST (LeNet-5) and 91.86% for CIFAR-10 (VGG-16) with up to 5% of the parameters as overhead, while providing a 15-150X speedup compared to R-V-W.
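The two ingredients described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the multiplicative lognormal noise model for device variation, the `sigma`, `T`, `alpha`, and `ratio` values, and the helper names (`add_rram_variation`, `kd_loss`, `apply_rsa`) are all illustrative assumptions; the paper's exact noise model, distillation setup, and adaptation procedure may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T gives softer distributions.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def add_rram_variation(w, sigma=0.1):
    # Assumed noise model: device-to-device conductance variation as
    # multiplicative lognormal noise on the mapped weights.
    return w * rng.lognormal(mean=0.0, sigma=sigma, size=w.shape)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Standard KD objective: KL to the teacher's soft targets plus a
    # hard-label cross-entropy term. In the abstract's setting, the
    # student logits would come from a forward pass with the RRAM
    # non-idealities (e.g. add_rram_variation) injected.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]
                 + 1e-12)
    return np.mean(alpha * (T ** 2) * kl + (1 - alpha) * ce)

def apply_rsa(ideal_weights, ratio=0.05):
    # Random sparse adaptation, sketched: a random ~5% of weights are
    # served from accurate on-chip memory (and are the only ones updated
    # post-mapping); the rest come from the noisy RRAM array.
    mask = rng.random(ideal_weights.shape) < ratio
    noisy = add_rram_variation(ideal_weights)
    return np.where(mask, ideal_weights, noisy), mask
```

Under these assumptions, only the masked entries need write access after mapping, which is what avoids the cell-by-cell R-V-W compensation over the whole array.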


Keywords
  • convolution neural networks (CNN)
  • device non-idealities
  • model robustness
  • Neuromorphic computing
  • random sparse adaptation
  • Resistive random access memory (RRAM)

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Hardware and Architecture
  • Electrical and Electronic Engineering
