The recently reported successes of convolutional neural networks (CNNs) in many areas have generated wide interest in the development of field-programmable gate array (FPGA)-based accelerators. To achieve high performance and energy efficiency, an FPGA-based accelerator must fully utilize its limited computation resources and minimize data communication and memory access, both of which are affected and constrained by a variety of design parameters, e.g., the degree and dimension of parallelism, the size of on-chip buffers, and the bandwidth of the external memory. The large design space of the accelerator makes it impractical to search for the optimal design in the implementation phase. To address this problem, a performance model is described that estimates the performance and resource utilization of an FPGA implementation. With this model, performance bottlenecks and design bounds can be identified, and the optimal design option can be explored early in the design phase. The proposed performance model is validated on a variety of CNN algorithms by comparing its estimates with on-board test results on two different FPGAs.
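
As an illustration of the kind of early-stage estimate the abstract describes, the following is a minimal roofline-style sketch, not the model proposed in the paper. It assumes a single convolutional layer with stride 1 and no padding, one multiply-accumulate per processing element per cycle, and that every weight, input, and output word crosses the external memory interface exactly once. All names (`ConvLayer`, `Accelerator`, `pe_out`, `pe_in`, `estimate_latency`) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ConvLayer:
    out_channels: int
    in_channels: int
    out_h: int
    out_w: int
    kernel: int  # square kernel size, e.g. 3 for 3x3


@dataclass
class Accelerator:
    pe_out: int                   # assumed parallelism across output channels
    pe_in: int                    # assumed parallelism across input channels
    freq_hz: float                # clock frequency
    bandwidth_bytes_per_s: float  # external memory bandwidth
    bytes_per_word: int = 2       # assumed 16-bit data


def estimate_latency(layer: ConvLayer, acc: Accelerator):
    """Return (latency in seconds, limiting resource) for one layer."""
    # Compute time: ceil() models processing elements left idle when the
    # channel counts are not multiples of the parallelism factors.
    cycles = (
        ceil(layer.out_channels / acc.pe_out)
        * ceil(layer.in_channels / acc.pe_in)
        * layer.out_h * layer.out_w * layer.kernel ** 2
    )
    compute_s = cycles / acc.freq_hz

    # Memory time: optimistic lower bound, each word transferred once.
    weights = layer.out_channels * layer.in_channels * layer.kernel ** 2
    in_h = layer.out_h + layer.kernel - 1  # stride 1, no padding
    in_w = layer.out_w + layer.kernel - 1
    inputs = layer.in_channels * in_h * in_w
    outputs = layer.out_channels * layer.out_h * layer.out_w
    traffic_bytes = (weights + inputs + outputs) * acc.bytes_per_word
    memory_s = traffic_bytes / acc.bandwidth_bytes_per_s

    # The slower of the two bounds the layer latency.
    bound = "compute" if compute_s >= memory_s else "memory"
    return max(compute_s, memory_s), bound


if __name__ == "__main__":
    layer = ConvLayer(out_channels=64, in_channels=64, out_h=56, out_w=56, kernel=3)
    acc = Accelerator(pe_out=16, pe_in=16, freq_hz=200e6, bandwidth_bytes_per_s=10e9)
    latency, bound = estimate_latency(layer, acc)
    print(f"estimated latency: {latency * 1e3:.3f} ms ({bound}-bound)")
```

Sweeping `pe_out`, `pe_in`, and the bandwidth over candidate values shows where a design switches from compute-bound to memory-bound, which is the sort of bottleneck identification the abstract refers to. A realistic model would also account for on-chip buffer sizes, tiling, and data reuse, which this sketch leaves out.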

Original language: English (US)
Article number: 8634939
Pages (from-to): 843-856
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Issue number: 4
State: Published - Apr 2020


Keywords

  • Analytical modeling
  • convolutional neural networks (CNNs)
  • field-programmable gate array (FPGA)

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering


Title: Performance modeling for CNN inference accelerators on FPGA
