Vesti: Energy-Efficient In-Memory Computing Accelerator for Deep Neural Networks

Shihui Yin, Zhewei Jiang, Minkyu Kim, Tushar Gupta, Mingoo Seok, Jae Sun Seo

Research output: Contribution to journal › Article

Abstract

To enable essential deep learning computation on energy-constrained hardware platforms, including mobile, wearable, and Internet of Things (IoT) devices, a number of digital ASIC designs have presented customized dataflow and enhanced parallelism. However, in conventional digital designs, the biggest bottleneck for energy-efficient deep neural networks (DNNs) has reportedly been the data access and movement. To eliminate the storage access bottleneck, new SRAM macros that support in-memory computing have been recently demonstrated. Several in-SRAM computing works have used the mix of analog and digital circuits to perform XNOR-and-ACcumulate (XAC) operation without row-by-row memory access and can map a subset of DNNs with binary weights and binary activations. In the single array level, large improvement in energy efficiency (e.g., two orders of magnitude improvement) has been reported in computing XAC over digital-only hardware performing the same operation. In this article, by integrating many instances of such in-memory computing SRAM macros with an ensemble of peripheral digital circuits, we architect a new DNN accelerator, titled Vesti. This new accelerator is designed to support configurable multibit activations and large-scale DNNs seamlessly while substantially improving the chip-level energy-efficiency with favorable accuracy tradeoff compared to conventional digital ASIC. Vesti also employs double-buffering with two groups of in-memory computing SRAMs, effectively hiding the row-by-row write latencies of in-memory computing SRAMs. The Vesti accelerator is fully designed and laid out in 65-nm CMOS, demonstrating ultralow energy consumption of <20 nJ for MNIST classification and <40 μJ for CIFAR-10 classification at 1.0-V supply.
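As an illustration of the XNOR-and-accumulate (XAC) operation named in the abstract, the following minimal Python sketch shows why XNOR plus a count reproduces a dot product when binary weights and activations stand for -1/+1. The 0/1 bit encoding and the function names `xnor_accumulate` and `signed_dot` are assumptions made for this example. They are not taken from the article, and the sketch models only the arithmetic, not the analog in-SRAM circuit.

```python
def xnor_accumulate(weights, activations):
    """XAC over two equal-length bit lists.

    Assumed encoding (not from the article): bit 1 means +1, bit 0 means -1.
    XNOR of two bits is 1 exactly when they match, so each match adds +1
    to the signed sum and each mismatch adds -1.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    n = len(weights)
    matches = sum(1 for w, a in zip(weights, activations) if w == a)
    # matches * (+1) + (n - matches) * (-1) == 2 * matches - n
    return 2 * matches - n


def signed_dot(weights, activations):
    """Reference dot product after mapping bits {0, 1} to {-1, +1}."""
    return sum((2 * w - 1) * (2 * a - 1) for w, a in zip(weights, activations))


if __name__ == "__main__":
    w = [1, 0, 1, 1]
    x = [1, 1, 0, 1]
    # Both formulations give the same signed sum.
    print(xnor_accumulate(w, x), signed_dot(w, x))
```

The two functions agree for any pair of equal-length bit vectors, which is the property that lets a binary-weight, binary-activation layer be evaluated with XNOR and a population count instead of multiplications.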

Original language: English (US)
Article number: 8867863
Pages (from-to): 48-61
Number of pages: 14
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume: 28
Issue number: 1
Publication status: Published - Jan 2020

Keywords

  • Deep learning accelerator
  • deep neural networks (DNNs)
  • double-buffering
  • in-memory computing
  • SRAM

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Electrical and Electronic Engineering
