TY - GEN
T1 - A fully onchip binarized convolutional neural network FPGA implementation with accurate inference
AU - Yang, Li
AU - He, Zhezhi
AU - Fan, Deliang
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/7/23
Y1 - 2018/7/23
AB - Deep convolutional neural networks have taken an important role in machine learning and are widely used in computer vision tasks. However, their enormous model size and massive computation cost have become the main obstacles to deploying such powerful algorithms on low-power, resource-limited embedded systems such as FPGAs. Recent works have shown that binarized neural networks (BNNs), which use binarized (i.e., +1 and -1) convolution kernels and binary activation functions, can significantly reduce model size and computation complexity, paving a new road for energy-efficient FPGA implementation. In this work, we first propose a new BNN algorithm, called Parallel-Convolution BNN (PC-BNN), which replaces the original binary convolution layer in a conventional BNN with two parallel binary convolution layers. PC-BNN achieves ∼86% accuracy on the CIFAR-10 dataset with only a 2.3 Mb parameter size. We then deploy the proposed PC-BNN on the Xilinx PYNQ Z1 FPGA board, which has only 4.9 Mb of on-chip RAM. Owing to the ultra-small parameter size, it is feasible to store all network parameters in on-chip RAM, which greatly reduces the energy and delay overhead of loading network parameters from off-chip memory. Meanwhile, a new data-streaming pipeline architecture is proposed for the PC-BNN FPGA implementation to further improve throughput. The experimental results show that our PC-BNN-based FPGA implementation achieves 930 frames per second, 387.5 FPS/Watt, and 396 × 10^-4 FPS/LUT, which are among the best throughput and energy-efficiency results compared with most recent works.
KW - Binarized convolutional neural network (BNN)
KW - Convolutional neural network (CNN)
KW - Field-programmable gate array (FPGA)
UR - http://www.scopus.com/inward/record.url?scp=85051514861&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051514861&partnerID=8YFLogxK
U2 - 10.1145/3218603.3218615
DO - 10.1145/3218603.3218615
M3 - Conference contribution
AN - SCOPUS:85051514861
SN - 9781450357043
T3 - Proceedings of the International Symposium on Low Power Electronics and Design
BT - ISLPED 2018 - Proceedings of the 2018 International Symposium on Low Power Electronics and Design
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd IEEE/ACM International Symposium on Low Power Electronics and Design, ISLPED 2018
Y2 - 23 July 2018 through 25 July 2018
ER -