Efficient and modularized training on FPGA for real-time applications

Shreyas Kolala Venkataramanaiah, Xiaocong Du, Zheng Li, Shihui Yin, Yu Cao, Jae Sun Seo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Training deep Convolutional Neural Networks (CNNs) requires a tremendous amount of computation and memory, so GPUs are widely used to meet the computational demands of these complex training tasks. However, lacking the flexibility to exploit architectural optimizations, GPUs have poor energy efficiency and are hard to deploy on energy-constrained platforms. FPGAs are highly suitable for training tasks such as real-time learning at the edge, as they provide higher energy efficiency and better flexibility to support algorithmic evolution. This paper first develops a training accelerator on FPGA, with 16-bit fixed-point computing and various training modules. Furthermore, leveraging model segmentation techniques from Progressive Segmented Training, the newly developed FPGA accelerator is applied to online learning, achieving much lower computation cost. We demonstrate the performance of representative CNNs trained for CIFAR-10 on an Intel Stratix-10 MX FPGA, evaluating both the conventional training procedure and the online learning algorithm. The demo is available at https://github.com/dxc33linger/PSTonFPGA demo.
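
As an illustration of the 16-bit fixed-point arithmetic mentioned in the abstract, the following is a minimal Python sketch of quantizing values to a signed 16-bit fixed-point format and multiplying them. The split into 8 fractional bits, the rounding mode, the saturation behavior, and the function names (`to_fixed16`, `from_fixed16`, `fixed_mul`) are all assumptions made for this example; the abstract does not specify the number format used by the accelerator beyond its 16-bit width.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(value):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def to_fixed16(x, frac_bits=8):
    """Quantize a float to signed 16-bit fixed point.

    frac_bits=8 is an assumed format, not taken from the paper.
    Python's round() rounds halves to the nearest even integer.
    """
    return saturate16(round(x * (1 << frac_bits)))


def from_fixed16(q, frac_bits=8):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits=8):
    """Multiply two fixed-point values.

    The raw product carries 2 * frac_bits fractional bits, so it is
    shifted right by frac_bits (flooring) and then saturated to 16 bits.
    """
    return saturate16((a * b) >> frac_bits)


if __name__ == "__main__":
    w = to_fixed16(1.5)
    x = to_fixed16(2.0)
    print(from_fixed16(fixed_mul(w, x)))
```

Values outside the representable range saturate rather than wrap around, which is one common design choice for keeping weight and gradient values bounded in hardware; whether the accelerator described here does the same is not stated in the abstract.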

Original language: English (US)
Title of host publication: Proceedings of the 29th International Joint Conference on Artificial Intelligence, IJCAI 2020
Editors: Christian Bessiere
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 5237-5239
Number of pages: 3
ISBN (Electronic): 9780999241165
State: Published - 2020
Event: 29th International Joint Conference on Artificial Intelligence, IJCAI 2020 - Yokohama, Japan
Duration: Jan 1 2021 → …

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2021-January
ISSN (Print): 1045-0823

Conference

Conference: 29th International Joint Conference on Artificial Intelligence, IJCAI 2020
Country/Territory: Japan
City: Yokohama
Period: 1/1/21 → …

ASJC Scopus subject areas

  • Artificial Intelligence
