Abstract

Rapid improvement in computational capability has made convolutional neural networks (CNNs) highly successful on image classification tasks in recent years, which has in turn spurred the development of object detection algorithms with significantly improved accuracy. However, during deployment, many applications demand low-latency processing of a single image under strict power consumption requirements. This reduces the efficiency of GPUs and other general-purpose platforms and creates opportunities for specialized acceleration hardware, e.g., FPGAs, whose digital circuits can be customized for the inference algorithm. Therefore, this work proposes to customize the detection algorithm, e.g., SSD, so that it benefits a hardware implementation with low data precision at the cost of marginal accuracy degradation. The proposed FPGA-based deep learning inference accelerator is demonstrated on two Intel FPGAs for the SSD algorithm, achieving up to 2.18 TOPS throughput and up to 3.3X better energy efficiency than a GPU.

Original language: English (US)
Title of host publication: 2018 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - Digest of Technical Papers
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781450359504
DOI: https://doi.org/10.1145/3240765.3240775
State: Published - Nov 5 2018
Event: 37th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - San Diego, United States
Duration: Nov 5 2018 to Nov 8 2018

Other

Other: 37th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018
Country: United States
City: San Diego
Period: 11/5/18 to 11/8/18

Fingerprint

Field programmable gate arrays (FPGA)
Detectors
Hardware
Image classification
Digital circuits
Energy efficiency
Electric power utilization
Throughput
Neural networks
Degradation
Object detection
Processing
Graphics processing unit

Keywords

  • FPGA
  • hardware accelerator
  • HW/SW co-design
  • neural network

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design

Cite this

Ma, Y., Zheng, T., Cao, Y., Vrudhula, S., & Seo, J. (2018). Algorithm-hardware co-design of single shot detector for fast object detection on FPGAs. In 2018 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - Digest of Technical Papers [a57]. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3240765.3240775

Algorithm-hardware co-design of single shot detector for fast object detection on FPGAs. / Ma, Yufei; Zheng, Tu; Cao, Yu; Vrudhula, Sarma; Seo, Jae-sun.

2018 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - Digest of Technical Papers. Institute of Electrical and Electronics Engineers Inc., 2018. a57.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{c70275d047ba48299dbfaa6373ad1f6d,
title = "Algorithm-hardware co-design of single shot detector for fast object detection on FPGAs",
keywords = "FPGA, hardware accelerator, HW/SW co-design, neural network",
author = "Yufei Ma and Tu Zheng and Yu Cao and Sarma Vrudhula and Jae-sun Seo",
year = "2018",
month = "11",
day = "5",
doi = "10.1145/3240765.3240775",
language = "English (US)",
booktitle = "2018 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - Digest of Technical Papers",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Algorithm-hardware co-design of single shot detector for fast object detection on FPGAs

AU - Ma, Yufei

AU - Zheng, Tu

AU - Cao, Yu

AU - Vrudhula, Sarma

AU - Seo, Jae-sun

PY - 2018/11/5

Y1 - 2018/11/5

KW - FPGA

KW - hardware accelerator

KW - HW/SW co-design

KW - neural network

UR - http://www.scopus.com/inward/record.url?scp=85058172945&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058172945&partnerID=8YFLogxK

U2 - 10.1145/3240765.3240775

DO - 10.1145/3240765.3240775

M3 - Conference contribution

AN - SCOPUS:85058172945

BT - 2018 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2018 - Digest of Technical Papers

PB - Institute of Electrical and Electronics Engineers Inc.

ER -