Structured Pruning of RRAM Crossbars for Efficient In-Memory Computing Acceleration of Deep Neural Networks

Jian Meng, Li Yang, Xiaochen Peng, Shimeng Yu, Deliang Fan, Jae Sun Seo

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

The high computational complexity and large number of parameters of deep neural networks (DNNs) impose a heavy burden on deep learning hardware design, limiting efficient storage and deployment. With the advantages of high-density storage, non-volatility, and low energy consumption, resistive RAM (RRAM) crossbar-based in-memory computing (IMC) has emerged as a promising technique for DNN acceleration. To fully exploit crossbar-based IMC efficiency, a systematic compression design that considers both the hardware and the algorithm is necessary. In this brief, we present a system-level design that jointly considers low-precision weights and activations, structured pruning, and RRAM crossbar mapping. The proposed multi-group Lasso algorithm and the corresponding hardware implementation are evaluated on ResNet/VGG models with the CIFAR-10/ImageNet datasets. With a fully quantized 4-bit ResNet-18 for CIFAR-10, we achieve up to 65.4× compression compared to the full-precision software baseline and 7× energy reduction compared to 4-bit unpruned RRAM IMC hardware, with 1.1% accuracy loss. For the fully quantized 4-bit ResNet-18 model on the ImageNet dataset, we achieve up to 10.9× structured compression with 1.9% accuracy degradation.
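The core algorithmic idea named in the abstract, group-Lasso regularization whose groups align with RRAM crossbar structure, can be illustrated with a short sketch. The following is a minimal illustration, not the authors' implementation: it assumes PyTorch, a hypothetical crossbar height of 64 rows, and a simple grouping of each output column within each crossbar tile; the paper's multi-group Lasso formulation and mapping details may differ.

```python
# A minimal sketch (not the authors' code) of crossbar-aware group-Lasso
# regularization. The crossbar height (64 rows) and the per-tile column
# grouping are illustrative assumptions.
import torch
import torch.nn as nn

def crossbar_group_lasso(conv: nn.Conv2d, xbar_rows: int = 64) -> torch.Tensor:
    """Sum of per-group L2 norms over the unrolled conv weight, grouped so
    that a pruned group removes a whole crossbar column segment rather
    than scattered weights."""
    # Unroll the 4-D conv weight into the 2-D matrix that would be mapped
    # onto crossbars: rows = in_channels * k * k, cols = out_channels.
    w = conv.weight.reshape(conv.out_channels, -1).t()      # (rows, cols)
    rows = w.shape[0]
    # Pad the row count to a multiple of the crossbar height, then split:
    # each group is one output column's slice inside one crossbar tile.
    pad = (-rows) % xbar_rows
    w = torch.cat([w, w.new_zeros(pad, w.shape[1])], dim=0)
    groups = w.reshape(-1, xbar_rows, w.shape[1])           # (tiles, xbar_rows, cols)
    return groups.norm(p=2, dim=1).sum()                    # group-Lasso penalty

# Usage: add the weighted penalty to the task loss during training,
# e.g. loss = task_loss + lam * crossbar_group_lasso(conv).
conv = nn.Conv2d(64, 128, kernel_size=3)
penalty = 1e-4 * crossbar_group_lasso(conv, xbar_rows=64)
penalty.backward()  # gradients push whole crossbar groups toward zero
```

Because the L2 norm is taken over an entire per-tile column, the penalty drives all weights in a group to zero together, so the corresponding crossbar column segment can be skipped in hardware, which is what enables the energy reduction reported above.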

Original language: English (US)
Article number: 9387391
Pages (from-to): 1576-1580
Number of pages: 5
Journal: IEEE Transactions on Circuits and Systems II: Express Briefs
Volume: 68
Issue number: 5
DOIs
State: Published - May 2021

Keywords

  • Convolutional neural networks
  • hardware accelerator
  • in-memory computing
  • resistive RAM
  • structured pruning

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
