TY - GEN
T1 - Joint Optimization of Quantization and Structured Sparsity for Compressed Deep Neural Networks
AU - Srivastava, Gaurav
AU - Kadetotad, Deepak
AU - Yin, Shihui
AU - Berisha, Visar
AU - Chakrabarti, Chaitali
AU - Seo, Jae Sun
N1 - Funding Information:
This work is supported in part by NSF grant 1652866, Samsung, C-BRIC, one of six centers in JUMP, an SRC program sponsored by DARPA, and ONR grant N00014-17-1-2826.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - The use of Deep Neural Networks (DNNs) on resource-constrained edge devices has been limited by their high computation and large memory requirements. In this work, we propose an algorithm to compress DNNs by jointly optimizing structured sparsity and quantization constraints in a single DNN training framework. The proposed algorithm has been extensively validated on high/low capacity DNNs and wide/deep sparse DNNs. Further, we perform Pareto-optimal analysis to extract optimal DNN models from a large set of trained DNN models. The optimal structurally-compressed DNN model achieves ~50X weight memory reduction without test accuracy degradation, compared to a floating-point uncompressed DNN.
AB - The use of Deep Neural Networks (DNNs) on resource-constrained edge devices has been limited by their high computation and large memory requirements. In this work, we propose an algorithm to compress DNNs by jointly optimizing structured sparsity and quantization constraints in a single DNN training framework. The proposed algorithm has been extensively validated on high/low capacity DNNs and wide/deep sparse DNNs. Further, we perform Pareto-optimal analysis to extract optimal DNN models from a large set of trained DNN models. The optimal structurally-compressed DNN model achieves ~50X weight memory reduction without test accuracy degradation, compared to a floating-point uncompressed DNN.
UR - http://www.scopus.com/inward/record.url?scp=85068959788&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068959788&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682791
DO - 10.1109/ICASSP.2019.8682791
M3 - Conference contribution
AN - SCOPUS:85068959788
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 1393
EP - 1397
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -