Abstract

Long short-term memory (LSTM) networks are widely used for speech applications but are difficult to implement efficiently in hardware because of their large weight storage requirements. We present an energy-efficient LSTM recurrent neural network (RNN) accelerator, featuring an algorithm-hardware co-optimized memory compression technique called hierarchical coarse-grain sparsity (HCGS). Aided by HCGS-based block-wise recursive weight compression, we demonstrate LSTM networks with up to 16× fewer weights while achieving minimal accuracy loss. The prototype chip fabricated in 65-nm LP CMOS achieves 8.93/7.22 TOPS/W for 2-/3-layer LSTM RNNs trained with HCGS for TIMIT/TED-LIUM corpora.
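
The block-wise recursive compression mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state block sizes, the number of hierarchy levels, or how blocks are selected, so everything below is an assumption made for illustration: two levels, random selection, and one block kept out of every four at each level (4 × 4 = 16× fewer weights). The function name `hcgs_mask` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def hcgs_mask(rows, cols, big, small, keep_every, seed=0):
    """Two-level block-sparse mask (illustrative assumption, not the paper's exact scheme).

    Level 1: in each row of big blocks, keep 1 of every `keep_every` big blocks.
    Level 2: inside each kept big block, in each row of small blocks,
             keep 1 of every `keep_every` small blocks.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((rows, cols), dtype=bool)
    n_big = cols // big        # big blocks per block row
    n_small = big // small     # small blocks per row inside one big block
    for r0 in range(0, rows, big):
        # Level 1: choose the surviving big blocks for this block row.
        kept_big = rng.choice(n_big, size=n_big // keep_every, replace=False)
        for bc in kept_big:
            c0 = bc * big
            for sr in range(0, big, small):
                # Level 2: choose the surviving small blocks in this sub-row.
                kept_small = rng.choice(n_small, size=n_small // keep_every, replace=False)
                for sc in kept_small:
                    c1 = c0 + sc * small
                    mask[r0 + sr:r0 + sr + small, c1:c1 + small] = True
    return mask

# Hypothetical sizes: a 64x64 weight matrix, 16x16 big blocks, 4x4 small blocks.
mask = hcgs_mask(64, 64, big=16, small=4, keep_every=4)
compression = mask.size / mask.sum()   # 4096 / 256 = 16.0
```

Because the same number of blocks survives in every block row at both levels, each matrix row ends up with the same number of nonzero weights, which is the kind of regularity that makes structured sparsity easier to index in hardware than unstructured pruning.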

Original language: English (US)
Title of host publication: ESSCIRC 2019 - IEEE 45th European Solid State Circuits Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 119-122
Number of pages: 4
ISBN (Electronic): 9781728115504
State: Published - Sep 2019
Event: 45th IEEE European Solid State Circuits Conference, ESSCIRC 2019 - Cracow, Poland
Duration: Sep 23 2019 to Sep 26 2019

Publication series

Name: ESSCIRC 2019 - IEEE 45th European Solid State Circuits Conference

Conference

Conference: 45th IEEE European Solid State Circuits Conference, ESSCIRC 2019
Country: Poland
City: Cracow
Period: 9/23/19 to 9/26/19

Keywords

  • Hardware accelerator
  • long short-term memory (LSTM)
  • speech recognition
  • structured sparsity weight compression

ASJC Scopus subject areas

  • Instrumentation
  • Electronic, Optical and Magnetic Materials
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Title: A 8.93-TOPS/W LSTM Recurrent Neural Network Accelerator Featuring Hierarchical Coarse-Grain Sparsity with All Parameters Stored On-Chip
