Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays

Abinash Mohanty; Xiaocong Du; Pai Yu Chen; Jae-sun Seo; Shimeng Yu; Yu Cao

doi:10.1109/IEDM.2017.8268339

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Scopus citations

Abstract

An array of multi-level resistive memory devices (RRAMs) can speed up the computation of deep learning algorithms. However, when a pre-trained model is programmed to a real RRAM array for inference, its accuracy degrades due to many non-idealities, such as variations, quantization error, and stuck-at faults. A conventional solution involves multiple read-verify-write (R-V-W) for each RRAM cell, costing a long time because of the slow Write speed and cell-by-cell compensation. In this work, we propose a fundamentally new approach to overcome this issue: random sparse adaptation (RSA) after the model is transferred to the RRAM array. By randomly selecting a small portion of model parameters and mapping them to on-chip memory for further training, we demonstrate an efficient and fast method to recover the accuracy: in CNNs for MNIST and CIFAR-10, ~5% of model parameters is sufficient for RSA even under excessive RRAM variations. As the backpropagation in training is only applied to RSA cells and there is no need for any Write operation on RRAM, the proposed RSA achieves 10-100X acceleration compared to R-V-W. Therefore, this hybrid solution with a large, inaccurate RRAM array and a small, accurate on-chip memory array promises both area efficiency and inference accuracy.
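
The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear model instead of a CNN, Gaussian noise as a stand-in for RRAM variation, and plain gradient descent. All names (`rsa_mask`, `adapt`, `rram`, `onchip`) are hypothetical. It only shows the mechanism: a random ~5% of the weights are shadowed by accurate on-chip values and trained, while the inaccurate RRAM values are read but never rewritten.

```python
import random

def rsa_mask(n_params, fraction, rng):
    """Randomly pick the parameter indices that move to on-chip memory."""
    k = max(1, round(fraction * n_params))
    return sorted(rng.sample(range(n_params), k))

def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def mse(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def adapt(rram_weights, selected, data, lr=0.1, epochs=50):
    """Train only the selected entries; the RRAM list is never modified."""
    onchip = {i: rram_weights[i] for i in selected}  # small, accurate memory
    for _ in range(epochs):
        # Effective weight: on-chip value if selected, RRAM value otherwise.
        effective = [onchip.get(i, w) for i, w in enumerate(rram_weights)]
        grads = {i: 0.0 for i in selected}
        for x, y in data:
            err = predict(effective, x) - y
            for i in selected:  # gradients only for the RSA cells
                grads[i] += 2.0 * err * x[i] / len(data)
        for i in selected:
            onchip[i] -= lr * grads[i]
    return [onchip.get(i, w) for i, w in enumerate(rram_weights)]

rng = random.Random(0)
n = 100
ideal = [rng.uniform(-1, 1) for _ in range(n)]   # pre-trained weights
rram = [w + rng.gauss(0, 0.3) for w in ideal]    # assumed variation model
data = []
for _ in range(200):
    x = [rng.uniform(-1, 1) for _ in range(n)]
    data.append((x, predict(ideal, x)))

selected = rsa_mask(n, 0.05, rng)                # 5 of 100 parameters
adapted = adapt(rram, selected, data)
loss_before = mse(rram, data)
loss_after = mse(adapted, data)
print(f"MSE before RSA: {loss_before:.4f}, after RSA: {loss_after:.4f}")
```

In this toy setting the error drops but is not eliminated, since five free parameters cannot cancel the noise on the other 95; the accuracy recovery reported in the abstract refers to the authors' CNN experiments, which this sketch does not reproduce.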

Original language	English (US)
Title of host publication	2017 IEEE International Electron Devices Meeting, IEDM 2017
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	6.3.1-6.3.4
Volume	Part F134366
ISBN (Electronic)	9781538635599
DOIs	https://doi.org/10.1109/IEDM.2017.8268339
State	Published - Jan 23 2018
Event	63rd IEEE International Electron Devices Meeting, IEDM 2017 - San Francisco, United States
Duration	Dec 2 2017 → Dec 6 2017

Other

Other	63rd IEEE International Electron Devices Meeting, IEDM 2017
Country/Territory	United States
City	San Francisco
Period	12/2/17 → 12/6/17

ASJC Scopus subject areas

Electronic, Optical and Magnetic Materials
Condensed Matter Physics
Electrical and Electronic Engineering
Materials Chemistry

Cite this

Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays. / Mohanty, Abinash; Du, Xiaocong; Chen, Pai Yu et al.
2017 IEEE International Electron Devices Meeting, IEDM 2017. Vol. Part F134366 Institute of Electrical and Electronics Engineers Inc., 2018. p. 6.3.1-6.3.4.

Mohanty, A, Du, X, Chen, PY, Seo, J, Yu, S & Cao, Y 2018, Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays. in 2017 IEEE International Electron Devices Meeting, IEDM 2017. vol. Part F134366, Institute of Electrical and Electronics Engineers Inc., pp. 6.3.1-6.3.4, 63rd IEEE International Electron Devices Meeting, IEDM 2017, San Francisco, United States, 12/2/17. https://doi.org/10.1109/IEDM.2017.8268339

@inproceedings{4d354f76284445fa9a506f6e5016b0b5,

title = "Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays",

abstract = "An array of multi-level resistive memory devices (RRAMs) can speed up the computation of deep learning algorithms. However, when a pre-trained model is programmed to a real RRAM array for inference, its accuracy degrades due to many non-idealities, such as variations, quantization error, and stuck-at faults. A conventional solution involves multiple read-verify-write (R-V-W) for each RRAM cell, costing a long time because of the slow Write speed and cell-by-cell compensation. In this work, we propose a fundamentally new approach to overcome this issue: random sparse adaptation (RSA) after the model is transferred to the RRAM array. By randomly selecting a small portion of model parameters and mapping them to on-chip memory for further training, we demonstrate an efficient and fast method to recover the accuracy: in CNNs for MNIST and CIFAR-10, $\sim$5\% of model parameters is sufficient for RSA even under excessive RRAM variations. As the backpropagation in training is only applied to RSA cells and there is no need for any Write operation on RRAM, the proposed RSA achieves 10-100X acceleration compared to R-V-W. Therefore, this hybrid solution with a large, inaccurate RRAM array and a small, accurate on-chip memory array promises both area efficiency and inference accuracy.",

author = "Abinash Mohanty and Xiaocong Du and Chen, {Pai Yu} and Jae-sun Seo and Shimeng Yu and Yu Cao",

year = "2018",

month = jan,

day = "23",

doi = "10.1109/IEDM.2017.8268339",

language = "English (US)",

volume = "Part F134366",

pages = "6.3.1--6.3.4",

booktitle = "2017 IEEE International Electron Devices Meeting, IEDM 2017",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

note = "63rd IEEE International Electron Devices Meeting, IEDM 2017 ; Conference date: 02-12-2017 Through 06-12-2017",

}

TY - GEN

T1 - Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays

AU - Mohanty, Abinash

AU - Du, Xiaocong

AU - Chen, Pai Yu

AU - Seo, Jae-sun

AU - Yu, Shimeng

AU - Cao, Yu

PY - 2018/1/23

Y1 - 2018/1/23

AB - An array of multi-level resistive memory devices (RRAMs) can speed up the computation of deep learning algorithms. However, when a pre-trained model is programmed to a real RRAM array for inference, its accuracy degrades due to many non-idealities, such as variations, quantization error, and stuck-at faults. A conventional solution involves multiple read-verify-write (R-V-W) for each RRAM cell, costing a long time because of the slow Write speed and cell-by-cell compensation. In this work, we propose a fundamentally new approach to overcome this issue: random sparse adaptation (RSA) after the model is transferred to the RRAM array. By randomly selecting a small portion of model parameters and mapping them to on-chip memory for further training, we demonstrate an efficient and fast method to recover the accuracy: in CNNs for MNIST and CIFAR-10, ~5% of model parameters is sufficient for RSA even under excessive RRAM variations. As the backpropagation in training is only applied to RSA cells and there is no need for any Write operation on RRAM, the proposed RSA achieves 10-100X acceleration compared to R-V-W. Therefore, this hybrid solution with a large, inaccurate RRAM array and a small, accurate on-chip memory array promises both area efficiency and inference accuracy.

UR - http://www.scopus.com/inward/record.url?scp=85045207131&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045207131&partnerID=8YFLogxK

U2 - 10.1109/IEDM.2017.8268339

DO - 10.1109/IEDM.2017.8268339

M3 - Conference contribution

AN - SCOPUS:85045207131

VL - Part F134366

SP - 6.3.1

EP - 6.3.4

BT - 2017 IEEE International Electron Devices Meeting, IEDM 2017

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 63rd IEEE International Electron Devices Meeting, IEDM 2017

Y2 - 2 December 2017 through 6 December 2017

ER -
