Inference engine benchmarking across technological platforms from CMOS to RRAM

Xiaochen Peng, Minkyu Kim, Xiaoyu Sun, Shihui Yin, Titash Rakshit, Ryan M. Hatcher, Jorge A. Kittl, Jae-sun Seo, Shimeng Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

State-of-the-art deep convolutional neural networks (CNNs) are widely used in current AI systems and achieve remarkable success in image/speech recognition and classification. A number of recent efforts have attempted to design custom inference engines based on various approaches, including the systolic architecture, near-memory processing, and the processing-in-memory (PIM) approach with emerging technologies such as resistive random access memory (RRAM). However, a comprehensive comparison of these approaches in a unified framework is missing, and the benefits of new designs or emerging technologies are mostly based on qualitative projections. In this paper, we evaluate the energy efficiency and frame rate of a VGG-like CNN inference accelerator on the CIFAR-10 dataset across technological platforms from CMOS to post-CMOS, under a hardware resource constraint, i.e., comparable on-chip area. We also investigate the effects of off-chip DRAM access and interconnect during data movement, which are the bottlenecks of CMOS platforms. Our quantitative analysis shows that the peripheries (ADCs), rather than the memory array, dominate energy consumption and area in the digital RRAM-based parallel-readout PIM architecture. Despite the presence of ADCs, this architecture shows >2.5× improvement in energy efficiency (TOPS/W) over systolic arrays or near-memory processing, with a comparable frame rate due to reduced DRAM access, high throughput, and optimized parallel readout. A further >10× improvement can be achieved by implementing a bit-count-reduced XNOR network and pipelining.
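The two figures of merit compared in the abstract, energy efficiency (TOPS/W) and frame rate, follow directly from per-inference operation counts, energy, and latency. The sketch below shows how such metrics are computed; the accelerator profiles and all numbers are hypothetical placeholders for illustration, not measurements from the paper.

```python
def tops_per_watt(ops_per_inference, energy_per_inference_j):
    """Energy efficiency: (ops/s) / (J/s) reduces to ops per joule, scaled to tera-ops."""
    return ops_per_inference / energy_per_inference_j / 1e12

def frame_rate(latency_s):
    """Frames (inferences) per second for a single-image pipeline."""
    return 1.0 / latency_s

# Hypothetical profiles for one CIFAR-10 inference (placeholder values only):
# ops = multiply-accumulate operation count, energy in joules, latency in seconds.
systolic = {"ops": 6.0e8, "energy_j": 1.2e-4, "latency_s": 2.0e-3}
rram_pim = {"ops": 6.0e8, "energy_j": 4.0e-5, "latency_s": 2.0e-3}

eff_sys = tops_per_watt(systolic["ops"], systolic["energy_j"])   # 5.0 TOPS/W
eff_pim = tops_per_watt(rram_pim["ops"], rram_pim["energy_j"])   # 15.0 TOPS/W
print(f"systolic: {eff_sys:.1f} TOPS/W, RRAM PIM: {eff_pim:.1f} TOPS/W, "
      f"ratio {eff_pim / eff_sys:.1f}x, frame rate {frame_rate(2.0e-3):.0f} fps")
```

With equal operation counts and latency, the efficiency ratio is simply the inverse ratio of per-inference energies, which is why reduced DRAM access and parallel readout translate directly into the TOPS/W gains reported.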

Original language: English (US)
Title of host publication: MEMSYS 2019 - Proceedings of the International Symposium on Memory Systems
Publisher: Association for Computing Machinery
Pages: 471-479
Number of pages: 9
ISBN (Electronic): 9781450372060
DOIs: https://doi.org/10.1145/3357526.3357566
State: Published - Sep 30, 2019
Event: 2019 International Symposium on Memory Systems, MEMSYS 2019 - Washington, United States
Duration: Sep 30, 2019 – Oct 3, 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2019 International Symposium on Memory Systems, MEMSYS 2019
Country: United States
City: Washington
Period: 9/30/19 – 10/3/19


Keywords

  • Deep convolutional neural network
  • Hardware accelerator
  • Near memory processing
  • Processing in memory
  • Resistive random access memory
  • Systolic architecture

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Peng, X., Kim, M., Sun, X., Yin, S., Rakshit, T., Hatcher, R. M., Kittl, J. A., Seo, J. S., & Yu, S. (2019). Inference engine benchmarking across technological platforms from CMOS to RRAM. In MEMSYS 2019 - Proceedings of the International Symposium on Memory Systems (pp. 471-479). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3357526.3357566