A Progressive Subnetwork Searching Framework for Dynamic Inference

Li Yang, Zhezhi He, Yu Cao, Deliang Fan

Research output: Contribution to journal › Article › peer-review


Abstract

Deep neural network (DNN) model compression is a popular and important optimization method for efficient and fast hardware acceleration. However, a compressed model is usually fixed, with no capability to tune its computing complexity (i.e., hardware latency) on-the-fly in response to dynamic latency requirements, workloads, and computing hardware resource allocation. To address this challenge, dynamic DNNs with run-time adaptation of their computing structures have been constructed by training with a cross-entropy objective function over multiple subnets sampled from a supernet. Our investigation in this work shows that the performance of dynamic inference relies heavily on the quality of subnet sampling. To construct a dynamic DNN with multiple high-quality subnets, we propose a progressive subnetwork searching framework that embeds several newly proposed techniques: trainable noise ranking, channel-group sampling, selective fine-tuning, and subnet filtering. The proposed framework empowers the target dynamic DNN with higher accuracy for all subnets than prior works on both the CIFAR-10 and ImageNet datasets. Specifically, compared with the universally slimmable network (US-Net), our method achieves average accuracy gains on ImageNet of 0.9% for AlexNet, 2.5% for ResNet18, 1.1% for VGG11, and 0.58% for MobileNetV1. Moreover, to demonstrate run-time tuning of the computing latency of a dynamic DNN in a real computing system, we have deployed the constructed dynamic networks on an Nvidia Titan graphics processing unit (GPU) and an Intel Xeon central processing unit (CPU), showing clear improvement over prior works. The code is available at https://github.com/ASU-ESIC-FAN-Lab/Dynamic-inference.
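The channel-group sampling and subnet filtering steps described in the abstract can be sketched roughly as follows. This is a hypothetical illustration only: the function names and the simple cost stand-in are our own assumptions, not the paper's actual implementation, which samples subnet widths at channel-group granularity from a supernet and discards candidates that miss a latency/complexity budget.

```python
import random

def channel_group_subnets(total_channels, group_size):
    """Enumerate candidate widths for one layer at channel-group granularity.

    Hypothetical sketch of channel-group sampling: instead of choosing an
    arbitrary per-channel width, subnet widths are restricted to multiples
    of a channel group, which shrinks the search space.
    """
    assert total_channels % group_size == 0
    return [group_size * g for g in range(1, total_channels // group_size + 1)]

def sample_subnet(widths_per_layer, rng=random):
    """Randomly pick one candidate width per layer to form a subnet config."""
    return [rng.choice(widths) for widths in widths_per_layer]

def filter_subnets(configs, cost_of, budget):
    """Keep only subnet configs whose estimated cost fits the budget.

    A stand-in for the paper's subnet-filtering step; `cost_of` would be a
    FLOPs or measured-latency estimator in a real system.
    """
    return [cfg for cfg in configs if cost_of(cfg) <= budget]

# Example: a two-layer supernet with 64 and 32 channels, groups of 16.
layer_widths = [channel_group_subnets(64, 16), channel_group_subnets(32, 16)]
candidates = [sample_subnet(layer_widths) for _ in range(8)]
# Toy cost model: total active channels must fit the "budget".
feasible = filter_subnets(candidates, cost_of=sum, budget=80)
```

In the actual framework, the surviving subnets would then be refined via selective fine-tuning; here the filter simply illustrates how the candidate pool is pruned before that stage.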

Original language: English (US)
Pages (from-to): 1-12
Number of pages: 12
Journal: IEEE Transactions on Neural Networks and Learning Systems
State: Accepted/In press - 2022
Externally published: Yes

Keywords

  • Computational modeling
  • Costs
  • Deep neural network (DNN)
  • dynamic inference
  • dynamic network
  • Dynamic scheduling
  • Graphics processing units
  • Hardware
  • Neural networks
  • subnet searching
  • Switches

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

