Dynamic Neural Network to Enable Run-Time Trade-off between Accuracy and Latency

Li Yang, Deliang Fan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

To deploy powerful deep neural networks (DNNs) on smart but resource-limited IoT devices, many prior works compress DNNs to reduce network size and computational complexity with negligible accuracy degradation, using techniques such as weight quantization, network pruning, and convolution decomposition. However, conventional DNN compression methods generate a smaller but fixed network from a relatively large background model to suit resource-limited hardware acceleration. Such an optimization cannot adjust its structure in real time to adapt to dynamic hardware resource allocation and workloads. In this paper, we mainly review our two prior works [13, 15] that tackle this challenge, discussing how to construct a dynamic DNN by means of either uniform or non-uniform sub-net generation methods. Moreover, to generate multiple non-uniform sub-nets, [15] needs to fully retrain the background model for each sub-net individually, which we refer to as the multi-path method. To reduce the training cost, in this work we further propose a single-path sub-net generation method that can sample multiple sub-nets in different epochs within one training round. The constructed dynamic DNN, consisting of multiple sub-nets, can trade off inference accuracy against latency at run time according to hardware resources and environment requirements. Finally, we study dynamic DNNs with different sub-net generation methods on both the CIFAR-10 and ImageNet datasets. We also present run-time tuning of accuracy and latency on both GPU and CPU.
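The run-time trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a uniform sub-net is formed by keeping the first fraction of output rows of a shared weight matrix, and it uses a multiply-accumulate (MAC) count as a stand-in for measured latency. The names `subnet_forward`, `mac_count`, and `pick_width` are hypothetical.

```python
import math


def _rows_for(weights, width):
    # Number of output rows kept for a given width fraction (at least one).
    return max(1, math.ceil(len(weights) * width))


def subnet_forward(weights, x, width):
    """Apply only the first `width` fraction of rows of a shared weight matrix.

    Assumption: smaller sub-nets reuse the leading rows of the full model,
    so all sub-nets share one set of weights.
    """
    rows = _rows_for(weights, width)
    return [sum(w * v for w, v in zip(row, x)) for row in weights[:rows]]


def mac_count(weights, width):
    """Multiply-accumulate count of the sub-net, used here as a latency proxy."""
    return _rows_for(weights, width) * len(weights[0])


def pick_width(weights, widths, mac_budget):
    """Pick the largest width that fits the budget, else the smallest width."""
    fitting = [w for w in widths if mac_count(weights, w) <= mac_budget]
    return max(fitting) if fitting else min(widths)


# Toy shared layer: 4 outputs, 3 inputs (illustrative values only).
weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
widths = [0.25, 0.5, 1.0]

width = pick_width(weights, widths, mac_budget=6)
output = subnet_forward(weights, [1, 2, 3], width)
```

In this sketch a tighter budget selects a narrower sub-net with fewer outputs computed, while a looser budget selects the full layer. A real deployment would replace the MAC proxy with latency measured on the target GPU or CPU.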

Original language: English (US)
Title of host publication: Proceedings of the 26th Asia and South Pacific Design Automation Conference, ASP-DAC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 587-592
Number of pages: 6
ISBN (Electronic): 9781450379991
State: Published - Jan 18 2021
Event: 26th Asia and South Pacific Design Automation Conference, ASP-DAC 2021 - Virtual, Online, Japan
Duration: Jan 18 2021 to Jan 21 2021

Publication series

Name: Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

Conference

Conference: 26th Asia and South Pacific Design Automation Conference, ASP-DAC 2021
Country/Territory: Japan
City: Virtual, Online
Period: 1/18/21 to 1/21/21

Keywords

  • dynamic neural networks

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
