A Flexible Processing-in-Memory Accelerator for Dynamic Channel-Adaptive Deep Neural Networks

Li Yang, Shaahin Angizi, Deliang Fan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the success of deep neural networks (DNNs), many recent works have focused on developing hardware accelerators for power- and resource-limited embedded systems via model compression techniques such as quantization, pruning, and low-rank approximation. However, in almost all existing work the DNN structure is fixed after deployment, so it cannot adapt at runtime to dynamic hardware resources, power budgets, throughput requirements, or workloads. Correspondingly, there is no runtime-adaptive hardware platform to support a dynamic DNN structure. To address this problem, we first propose a dynamic channel-adaptive deep neural network (CA-DNN) that can adjust the number of convolution channels involved (i.e., model size and computing load) at run-time (i.e., at the inference stage, without retraining) to dynamically trade off power, speed, computing load, and accuracy. Further, we use knowledge distillation to optimize the model and quantize it to 8 bits and 16 bits, respectively, for hardware-friendly mapping. We test the proposed model on the CIFAR-10 and ImageNet datasets using ResNet. Compared with an individual model of the same size, our CA-DNN achieves better accuracy. Moreover, to the best of our knowledge, we are the first to propose a Processing-in-Memory accelerator for such adaptive neural network structures, based on Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) computational adaptive sub-arrays. Finally, we comprehensively analyze the trade-off between accuracy and hardware parameters (e.g., energy, memory, and area overhead) for models with different channel widths.
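
The run-time channel adjustment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that a narrower sub-network reuses the leading output channels of a shared weight matrix, that the layer is a 1x1 convolution evaluated at a single pixel, and that quantization is uniform and symmetric. The names `slice_channels`, `pointwise_conv`, and `quantize_symmetric` are hypothetical.

```python
from typing import List, Tuple

# Rows are output channels, columns are input channels (a 1x1 convolution kernel).
Matrix = List[List[float]]


def slice_channels(weight: Matrix, width: float, in_width: float = 1.0) -> Matrix:
    """Keep the leading fraction of output (rows) and input (columns) channels.

    Assumption: sub-networks share the leading channels of the full model.
    """
    if not (0.0 < width <= 1.0 and 0.0 < in_width <= 1.0):
        raise ValueError("width fractions must be in (0, 1]")
    out_keep = max(1, int(len(weight) * width))
    in_keep = max(1, int(len(weight[0]) * in_width))
    return [row[:in_keep] for row in weight[:out_keep]]


def pointwise_conv(weight: Matrix, x: List[float]) -> List[float]:
    """1x1 convolution at one pixel: a matrix-vector product over channels."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def quantize_symmetric(weight: Matrix, bits: int = 8) -> Tuple[List[List[int]], float]:
    """Uniform symmetric quantization to signed `bits`-bit integers plus a scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for row in weight for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [[round(w / scale) for w in row] for row in weight], scale


# Toy 4x4 layer; the values are arbitrary and carry no meaning.
full = [
    [0.5, -1.0, 0.25, 0.0],
    [1.0, 0.5, -0.5, 0.25],
    [-0.25, 0.75, 1.0, -1.0],
    [0.0, 0.25, -0.75, 0.5],
]
x = [1.0, 2.0, 3.0, 4.0]

half = slice_channels(full, width=0.5)   # half of the output channels
y_half = pointwise_conv(half, x)         # half the multiply-accumulates
y_full = pointwise_conv(full, x)
q8, scale8 = quantize_symmetric(full, bits=8)
```

In this sketch the half-width output equals the first half of the full-width output, so switching widths needs no retraining of the shared weights; how CA-DNN actually selects and trains its channels is specified in the paper, not here.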

Original language: English (US)
Title of host publication: ASP-DAC 2020 - 25th Asia and South Pacific Design Automation Conference, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 313-318
Number of pages: 6
ISBN (Electronic): 9781728141237
State: Published - Jan 2020
Event: 25th Asia and South Pacific Design Automation Conference, ASP-DAC 2020 - Beijing, China
Duration: Jan 13 2020 to Jan 16 2020

Publication series

Name: Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
Volume: 2020-January

Conference

Conference: 25th Asia and South Pacific Design Automation Conference, ASP-DAC 2020
Country: China
City: Beijing
Period: 1/13/20 to 1/16/20

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
