TY - GEN
T1 - FixyFPGA
T2 - 31st International Conference on Field-Programmable Logic and Applications, FPL 2021
AU - Meng, Jian
AU - Venkataramanaiah, Shreyas Kolala
AU - Zhou, Chuteng
AU - Hansen, Patrick
AU - Whatmough, Paul
AU - Seo, Jae-Sun
N1 - Funding Information:
This work was in part supported by NSF grant 1652866 and by C-BRIC, one of six centers in JUMP, an SRC program sponsored by DARPA.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Convolutional neural networks (CNNs) have become very popular in real-time computer vision systems. CNNs involve a large amount of computation and storage and typically demand a highly efficient computing platform. Researchers have explored a diverse range of software and hardware optimizations to accelerate CNN inference in recent years. The high power consumption of GPUs and the lack of flexibility of ASICs have promoted interest in FPGAs as a promising platform to efficiently accelerate CNN inference tasks. Various FPGA-based CNN accelerators have been proposed that exploit low-precision weights and high sparsity in various forms. However, most of the previous work requires off-chip DDR memory to store the parameters and expensive DSP blocks to perform the computation. In this work, we propose FixyFPGA, a fully on-chip CNN inference accelerator that naturally supports highly sparse and low-precision computation. In our design, the weights of the trained CNN are hard-coded into hardware and used as fixed operands for the multiplications. Convolution is performed by streaming the input images to the compute engine in a fully parallel, fully pipelined manner. We analyzed the performance of the proposed scheme on both image classification and object detection tasks using low-precision, sparse, compact CNN models. Compared to prior work, our design achieves 2.34x higher GOPS on ImageNet classification and 3.82x higher frames per second on Pascal VOC object detection.
AB - Convolutional neural networks (CNNs) have become very popular in real-time computer vision systems. CNNs involve a large amount of computation and storage and typically demand a highly efficient computing platform. Researchers have explored a diverse range of software and hardware optimizations to accelerate CNN inference in recent years. The high power consumption of GPUs and the lack of flexibility of ASICs have promoted interest in FPGAs as a promising platform to efficiently accelerate CNN inference tasks. Various FPGA-based CNN accelerators have been proposed that exploit low-precision weights and high sparsity in various forms. However, most of the previous work requires off-chip DDR memory to store the parameters and expensive DSP blocks to perform the computation. In this work, we propose FixyFPGA, a fully on-chip CNN inference accelerator that naturally supports highly sparse and low-precision computation. In our design, the weights of the trained CNN are hard-coded into hardware and used as fixed operands for the multiplications. Convolution is performed by streaming the input images to the compute engine in a fully parallel, fully pipelined manner. We analyzed the performance of the proposed scheme on both image classification and object detection tasks using low-precision, sparse, compact CNN models. Compared to prior work, our design achieves 2.34x higher GOPS on ImageNet classification and 3.82x higher frames per second on Pascal VOC object detection.
KW - Convolutional neural networks
KW - FPGA
KW - Hardware accelerator
KW - Low-precision quantization
KW - Pruning
UR - http://www.scopus.com/inward/record.url?scp=85121955536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121955536&partnerID=8YFLogxK
U2 - 10.1109/FPL53798.2021.00010
DO - 10.1109/FPL53798.2021.00010
M3 - Conference contribution
AN - SCOPUS:85121955536
T3 - Proceedings - 2021 31st International Conference on Field-Programmable Logic and Applications, FPL 2021
SP - 9
EP - 16
BT - Proceedings - 2021 31st International Conference on Field-Programmable Logic and Applications, FPL 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 30 August 2021 through 3 September 2021
ER -