FixyFPGA: Efficient FPGA accelerator for deep neural networks with high element-wise sparsity and without external memory access

Jian Meng, Shreyas Kolala Venkataramanaiah, Chuteng Zhou, Patrick Hansen, Paul Whatmough, Jae Sun Seo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

Convolutional neural networks (CNNs) have become very popular in real-time computer vision systems. CNNs involve a large amount of computation and storage and typically demand a highly efficient computing platform. In recent years, researchers have explored a diverse range of software and hardware optimizations to accelerate CNN inference. The high power consumption of GPUs and the lack of flexibility of ASICs have promoted interest in FPGAs as a promising platform for efficiently accelerating these CNN inference tasks. Various FPGA-based CNN accelerators have been proposed to exploit low-precision weights and high sparsity in various forms. However, most previous work requires off-chip DDR memory to store the parameters and expensive DSP blocks to perform the computation. In this work, we propose FixyFPGA, a fully on-chip CNN inference accelerator that naturally supports high-sparsity and low-precision computation. In our design, the weights of the trained CNN are hard-coded into hardware and used as fixed operands for the multiplications. Convolution is performed by streaming the input images to the compute engine in a fully parallel, fully pipelined manner. We analyzed the performance of the proposed scheme on both image classification and object detection tasks using low-precision, sparse, compact CNN models. Compared to prior works, our design achieved 2.34x higher GOPS on ImageNet classification and 3.82x higher frames per second on Pascal VOC object detection.
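The idea of using trained weights as fixed multiplication operands can be illustrated with a small software analogy. The sketch below is not the authors' hardware design: it assumes integer (quantized) weights, and the helper names `shift_add_terms` and `make_fixed_dot` are hypothetical. It specializes a dot product to a fixed weight vector, drops zero weights entirely (as pruning would remove the corresponding logic), and expands each remaining constant multiplication into shifts and adds rather than a general multiplier.

```python
def shift_add_terms(weight):
    """Decompose an integer constant so that weight == sum(sign * 2**shift)."""
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    terms = []
    shift = 0
    while magnitude:
        if magnitude & 1:
            terms.append((sign, shift))
        magnitude >>= 1
        shift += 1
    return terms


def make_fixed_dot(weights):
    """Return a dot-product function specialized to a fixed integer weight vector."""
    # Zero weights produce no entry at all, mimicking pruned-away multipliers.
    plan = [(i, shift_add_terms(w)) for i, w in enumerate(weights) if w != 0]

    def dot(activations):
        acc = 0
        for i, terms in plan:
            for sign, shift in terms:
                # Each constant multiply becomes a handful of shifted adds.
                acc += sign * (activations[i] << shift)
        return acc

    # Total number of shift-add terms: a rough proxy for the logic required.
    dot.num_terms = sum(len(terms) for _, terms in plan)
    return dot


weights = [3, 0, -4, 0, 1]          # hypothetical quantized, sparse weights
activations = [2, 7, 1, 9, 5]       # hypothetical integer activations
dot = make_fixed_dot(weights)
result = dot(activations)
```

In this toy example the two zero weights cost nothing, and the three nonzero weights expand into four shift-add terms in total, which hints at why sparsity and low precision both reduce the cost of a fixed-weight design.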

Original language: English (US)
Title of host publication: Proceedings - 2021 31st International Conference on Field-Programmable Logic and Applications, FPL 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 9-16
Number of pages: 8
ISBN (Electronic): 9781665437592
State: Published - 2021
Event: 31st International Conference on Field-Programmable Logic and Applications, FPL 2021 - Virtual, Dresden, Germany
Duration: Aug 30 2021 to Sep 3 2021

Publication series

Name: Proceedings - 2021 31st International Conference on Field-Programmable Logic and Applications, FPL 2021

Conference

Conference: 31st International Conference on Field-Programmable Logic and Applications, FPL 2021
Country/Territory: Germany
City: Virtual, Dresden
Period: 8/30/21 to 9/3/21

Keywords

  • Convolution neural networks
  • FPGA
  • Hardware accelerator
  • Low-precision quantization
  • Pruning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Computer Science Applications
  • Software
