Automatic compiler based FPGA accelerator for CNN training

Shreyas Kolala Venkataramanaiah, Yufei Ma, Shihui Yin, Eriko Nurvitadhi, Aravind Dasu, Yu Cao, Jae Sun Seo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Scopus citations

Abstract

Training of convolutional neural networks (CNNs) on embedded platforms to support on-device learning has recently become increasingly important. Designing flexible training hardware is much more challenging than designing inference hardware, due to design complexity and large computation and memory requirements. In this work, we present an automatic compiler based FPGA accelerator with 16-bit fixed-point precision for complete CNN training, including Forward Pass (FP), Backward Pass (BP) and Weight Update (WU). We implemented an optimized RTL library to perform training-specific tasks and developed an RTL compiler to automatically generate FPGA-synthesizable RTL based on user-defined constraints. We present a new cyclic weight storage/access scheme for on-chip BRAM and off-chip DRAM to efficiently implement non-transpose and transpose operations during the FP and BP phases, respectively. Representative CNNs for the CIFAR-10 dataset are implemented and trained on an Intel Stratix 10 GX FPGA using the proposed hardware architecture, demonstrating up to 479 GOPS performance.

Original language: English (US)
Title of host publication: Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019
Editors: Ioannis Sourdis, Christos-Savvas Bouganis, Carlos Alvarez, Leonel Antonio Toledo Diaz, Pedro Valero, Xavier Martorell
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 166-172
Number of pages: 7
ISBN (Electronic): 9781728148847
DOIs
State: Published - Sep 2019
Event: 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 - Barcelona, Spain
Duration: Sep 9 2019 to Sep 13 2019

Publication series

Name: Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019

Conference

Conference: 29th International Conference on Field-Programmable Logic and Applications, FPL 2019
Country: Spain
City: Barcelona
Period: 9/9/19 to 9/13/19

Keywords

  • Back-propagation
  • Convolutional neural networks
  • FPGA
  • Hardware accelerator
  • Neural network training

ASJC Scopus subject areas

  • Instrumentation
  • Artificial Intelligence
  • Computer Science Applications
  • Hardware and Architecture
