TY - GEN
T1 - Optimize deep convolutional neural network with ternarized weights and high accuracy
AU - He, Zhezhi
AU - Gong, Boqing
AU - Fan, Deliang
N1 - Funding Information:
This work is supported in part by the National Science Foundation under Grant No. 1740126 and by Semiconductor Research Corporation nCORE.
Publisher Copyright:
© 2019 IEEE
PY - 2019/3/4
Y1 - 2019/3/4
AB - Deep convolutional neural networks have achieved great success in many artificial intelligence applications. However, their enormous model size and massive computation cost have become the main obstacles to deploying such powerful algorithms on low-power, resource-limited embedded systems. To address this problem, we propose statistical weight scaling and residual expansion methods that reduce the bit-width of the whole network's weight parameters to ternary values (i.e., -1, 0, +1), with the objective of greatly reducing model size and computation cost while minimizing the accuracy degradation caused by model compression. At about a 16× model compression rate, our ternarized ResNet-32/44/56 outperform their full-precision counterparts by 0.12%, 0.24%, and 0.18%, respectively, on the CIFAR-10 dataset. We also evaluate our ternarization method with AlexNet and ResNet-18 on the ImageNet dataset; both achieve the best top-1 accuracy among recent similar works at the same 16× compression rate. When our residual expansion method is further incorporated, our ternarized ResNet-18 even improves top-5 accuracy by 0.61% and degrades top-1 accuracy by only 0.42% relative to its full-precision counterpart on ImageNet, at an 8× model compression rate. It outperforms the recent ABC-Net by 1.03% in top-1 accuracy and 1.78% in top-5 accuracy, with an approximately 1.25× higher compression rate and more than 6× computation reduction due to weight sparsity.
UR - http://www.scopus.com/inward/record.url?scp=85063567668&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063567668&partnerID=8YFLogxK
DO - 10.1109/WACV.2019.00102
M3 - Conference contribution
AN - SCOPUS:85063567668
T3 - Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
SP - 913
EP - 921
BT - Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019
Y2 - 7 January 2019 through 11 January 2019
ER -