Accelerating deep neural network computation on a low power reconfigurable architecture

Y. Xiong, J. Zhou, S. Pal, D. Blaauw, H. S. Kim, T. Mudge, R. Dreslinski, C. Chakrabarti

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

Abstract

Recent work on neural network architectures has focused on bridging the gap between performance/efficiency and programmability. We consider implementations of three popular neural networks, ResNet, AlexNet and ASGD weight-dropped Recurrent Neural Network (AWD RNN) on a low power programmable architecture, Transformer. The architecture consists of light-weight cores interconnected by caches and crossbars that support run-time reconfiguration between shared and private cache mode operations. We present efficient implementations of key neural network kernels and evaluate the performance of each kernel when operating in different cache modes. The best-performing cache modes are then used in the implementation of the end-to-end network. Simulation results show superior performance with ResNet, AlexNet and AWD RNN achieving 188.19 GOPS/W, 150.53 GOPS/W and 120.68 GOPS/W, respectively, in the 14 nm technology node.

Original languageEnglish (US)
Title of host publication2020 IEEE International Symposium on Circuits and Systems, ISCAS 2020 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781728133201
StatePublished - 2020
Externally publishedYes
Event52nd IEEE International Symposium on Circuits and Systems, ISCAS 2020 - Virtual, Online
Duration: Oct 10 2020Oct 21 2020

Publication series

NameProceedings - IEEE International Symposium on Circuits and Systems
Volume2020-October
ISSN (Print)0271-4310

Conference

Conference52nd IEEE International Symposium on Circuits and Systems, ISCAS 2020
CityVirtual, Online
Period10/10/2010/21/20

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Accelerating deep neural network computation on a low power reconfigurable architecture'. Together they form a unique fingerprint.

Cite this