Efficient memory compression in deep neural networks using coarse-grain sparsification for speech applications

Deepak Kadetotad, Sairam Arunachalam, Chaitali Chakrabarti, Jae-sun Seo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Scopus citations

Abstract

Recent breakthroughs in deep neural networks have led to the proliferation of their use in image and speech applications. Conventional deep neural networks (DNNs) are fully-connected multi-layer networks with hundreds or thousands of neurons in each layer. Such a network requires a very large weight memory to store the connectivity between neurons. In this paper, we propose a hardware-centric methodology to design low-power neural networks with a significantly smaller memory footprint and reduced computation resource requirements. We achieve this by judiciously dropping connections in large blocks of weights. The corresponding technique, termed coarse-grain sparsification (CGS), introduces hardware-aware sparsity during DNN training, which leads to efficient weight memory compression and a significant reduction in computation during classification without loss of accuracy. We apply the proposed approach to DNN design for keyword detection and speech recognition. When the two DNNs are trained with 75% of the weights dropped and classified with 5-6 bit weight precision, the weight memory requirement is reduced by 95% compared to their fully-connected counterparts with double-precision weights, while maintaining similar keyword detection accuracy, word error rate, and sentence error rate. To validate this technique in real hardware, a time-multiplexed architecture using a shared multiply-and-accumulate (MAC) engine was implemented in 65 nm and 40 nm low-power (LP) CMOS. In 40 nm at 0.6 V, the keyword detection network consumes 7 μW and the speech recognition network consumes 103 μW, making this technique highly suitable for mobile and wearable devices.
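
The abstract does not spell out how blocks are chosen, but the block-wise dropping idea can be illustrated. Below is a minimal NumPy sketch assuming magnitude-based block selection; the helper name cgs_mask, the 16x16 block shape, and the scoring rule are illustrative assumptions, not the paper's exact training procedure, which enforces the sparsity pattern during DNN training.

```python
import numpy as np

def cgs_mask(weights, block_shape=(16, 16), drop_ratio=0.75):
    """Return a binary mask that keeps only the highest-magnitude weight
    blocks, zeroing whole block_shape blocks at once so the surviving
    pattern maps cleanly onto block-addressed weight memory.
    (Magnitude-based block scoring is an assumption for illustration.)"""
    rows, cols = weights.shape
    bh, bw = block_shape
    assert rows % bh == 0 and cols % bw == 0, "matrix must tile into blocks"

    # View the matrix as a grid of bh-by-bw blocks and score each block
    # by the sum of the absolute weights inside it.
    grid = weights.reshape(rows // bh, bh, cols // bw, bw)
    block_scores = np.abs(grid).sum(axis=(1, 3))  # shape: (rows/bh, cols/bw)

    # Keep the top (1 - drop_ratio) fraction of blocks; drop the rest.
    n_keep = max(1, int(round(block_scores.size * (1.0 - drop_ratio))))
    threshold = np.sort(block_scores, axis=None)[-n_keep]
    block_mask = (block_scores >= threshold).astype(weights.dtype)

    # Expand the per-block decisions back to the full weight shape.
    return np.kron(block_mask, np.ones((bh, bw), dtype=weights.dtype))

# Example: drop 75% of a 128x256 layer in 16x16 blocks.
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 256)).astype(np.float32)
W *= cgs_mask(W)  # reapplied after each weight update during training
print(f"nonzero fraction: {np.count_nonzero(W) / W.size:.2f}")  # ~0.25
```

Because entire blocks are zeroed rather than individual weights, the hardware only needs to store the surviving blocks plus a small per-block index, which is what makes the 95% weight memory reduction and the MAC count savings straightforward to realize in the time-multiplexed architecture described above.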

Original language: English (US)
Title of host publication: 2016 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781450344661
DOIs
State: Published - Nov 7 2016
Event: 35th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2016 - Austin, United States
Duration: Nov 7 2016 - Nov 10 2016

Publication series

Name: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
Volume: 07-10-November-2016
ISSN (Print): 1092-3152

Other

Other: 35th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2016
Country/Territory: United States
City: Austin
Period: 11/7/16 - 11/10/16

Keywords

  • deep neural networks
  • keyword detection
  • low power design
  • memory compression
  • speech recognition

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
