Hybrid RRAM/SRAM in-Memory Computing for Robust DNN Acceleration

Gokul Krishnan; Zhenyu Wang; Injune Yeo; Li Yang; Jian Meng; Maximilian Liehr; Rajiv V. Joshi; Nathaniel C. Cady; Deliang Fan; Jae Sun Seo; Yu Cao

doi:10.1109/TCAD.2022.3197516

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. On the other hand, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The nonideal output from the RRAM macro, due to device and circuit nonidealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for the training of DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65-nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multilevel RRAM cells (MLCs) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.
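The compensation step described in the abstract (adding a shifted, precise SRAM output to a nonideal RRAM output) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the Gaussian per-weight noise model for the RRAM macro, and the convention that a positive shift scales the SRAM output up by a power of two. How the SRAM weights are obtained is left out; the abstract only says that they come from the proposed training framework.

```python
import random


def mac(weights, activations):
    """Exact multiply-and-accumulate, standing in for the digital SRAM macro."""
    return sum(w * a for w, a in zip(weights, activations))


def rram_mac(weights, activations, sigma, rng):
    """Hypothetical RRAM macro: every stored weight is perturbed by
    Gaussian noise with standard deviation sigma (assumed noise model)."""
    return sum((w + rng.gauss(0.0, sigma)) * a
               for w, a in zip(weights, activations))


def hybrid_output(rram_out, sram_out, shift):
    """Add the SRAM output, scaled by 2**shift, to the RRAM output.
    A positive shift models a left shift, a negative one a right shift
    (assumed convention for the programmable shifter)."""
    return rram_out + sram_out * (2 ** shift)


if __name__ == "__main__":
    rng = random.Random(0)
    activations = [1, 2, 3, 4]
    rram_weights = [3, -1, 2, 0]   # hypothetical weights mapped to RRAM
    sram_weights = [1, 0, -1, 1]   # hypothetical compensation weights

    noisy = rram_mac(rram_weights, activations, sigma=0.2, rng=rng)
    precise = mac(sram_weights, activations)
    for shift in (-1, 0, 1):
        print(shift, hybrid_output(noisy, precise, shift))
```

Varying `shift` changes how strongly the SRAM term contributes relative to the RRAM term, which is the role the abstract assigns to the programmable shifter.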

Original language	English (US)
Pages (from-to)	4241-4252
Number of pages	12
Journal	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume	41
Issue number	11
DOIs	https://doi.org/10.1109/TCAD.2022.3197516
State	Published - Nov 1 2022

Keywords

In-memory compute
RRAM
SRAM
robust deep neural network (DNN) acceleration

ASJC Scopus subject areas

Software
Computer Graphics and Computer-Aided Design
Electrical and Electronic Engineering

Cite this

@article{d315308f3d564c78bc177dda1d5f8aa4,

title = "Hybrid RRAM/SRAM in-Memory Computing for Robust DNN Acceleration",

abstract = "RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. On the other hand, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The nonideal output from the RRAM macro, due to device and circuit nonidealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for the training of DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65-nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multilevel RRAM cells (MLCs) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.",

keywords = "In-memory compute, RRAM, SRAM, robust deep neural network (DNN) acceleration",

author = "Gokul Krishnan and Zhenyu Wang and Injune Yeo and Li Yang and Jian Meng and Maximilian Liehr and Joshi, {Rajiv V.} and Cady, {Nathaniel C.} and Deliang Fan and Seo, {Jae Sun} and Yu Cao",

note = "Publisher Copyright: {\textcopyright} 1982-2012 IEEE.",

year = "2022",

month = nov,

day = "1",

doi = "10.1109/TCAD.2022.3197516",

language = "English (US)",

volume = "41",

pages = "4241--4252",

journal = "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems",

issn = "0278-0070",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "11",

}

TY - JOUR

T1 - Hybrid RRAM/SRAM in-Memory Computing for Robust DNN Acceleration

AU - Krishnan, Gokul

AU - Wang, Zhenyu

AU - Yeo, Injune

AU - Yang, Li

AU - Meng, Jian

AU - Liehr, Maximilian

AU - Joshi, Rajiv V.

AU - Cady, Nathaniel C.

AU - Fan, Deliang

AU - Seo, Jae Sun

AU - Cao, Yu

PY - 2022/11/1

Y1 - 2022/11/1

N2 - RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. On the other hand, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The nonideal output from the RRAM macro, due to device and circuit nonidealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for the training of DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65-nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multilevel RRAM cells (MLCs) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.

AB - RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. On the other hand, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The nonideal output from the RRAM macro, due to device and circuit nonidealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for the training of DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65-nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multilevel RRAM cells (MLCs) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.

KW - In-memory compute

KW - RRAM

KW - SRAM

KW - robust deep neural network (DNN) acceleration

UR - http://www.scopus.com/inward/record.url?scp=85136058903&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85136058903&partnerID=8YFLogxK

U2 - 10.1109/TCAD.2022.3197516

DO - 10.1109/TCAD.2022.3197516

M3 - Article

AN - SCOPUS:85136058903

SN - 0278-0070

VL - 41

SP - 4241

EP - 4252

JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

IS - 11

ER -
