Deep Neural Network Training Accelerator Designs in ASIC and FPGA

Shreyas K. Venkataramanaiah, Shihui Yin, Yu Cao, Jae Sun Seo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this invited paper, we present deep neural network (DNN) training accelerator designs in both ASIC and FPGA. The accelerators implement a stochastic gradient descent based training algorithm in 16-bit fixed-point precision. A new cyclic weight storage and access scheme enables using the same off-the-shelf SRAMs for non-transpose and transpose operations during the feed-forward and feed-backward phases, respectively, of the DNN training process. Including the cyclic weight scheme, the overall DNN training processor is implemented in both 65nm CMOS ASIC and Intel Stratix-10 FPGA hardware. We collectively report the ASIC and FPGA training accelerator results.
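To illustrate the idea behind the cyclic weight storage and access scheme described above, the following minimal Python sketch stripes a square N×N weight tile across N SRAM banks, placing element W[i][j] in bank (i + j) mod N at address i. The bank count, addressing, and NumPy bank model are illustrative assumptions for this sketch, not the authors' actual hardware design.

```python
import numpy as np

N = 4  # number of SRAM banks == tile dimension (illustrative, not from the paper)

def store_cyclic(W):
    # Model N SRAM banks, each with N word addresses.
    # Cyclic placement: W[i][j] goes to bank (i + j) % N at address i.
    banks = np.zeros((N, N), dtype=W.dtype)
    for i in range(N):
        for j in range(N):
            banks[(i + j) % N, i] = W[i, j]
    return banks

def read_row(banks, i):
    # Feed-forward (non-transpose) access: row i reads one word from each bank.
    return np.array([banks[(i + j) % N, i] for j in range(N)])

def read_col(banks, j):
    # Feed-backward (transpose) access: column j also reads exactly one word
    # per bank, so the same SRAMs serve both training phases.
    return np.array([banks[(i + j) % N, i] for i in range(N)])

W = np.arange(N * N, dtype=np.int16).reshape(N, N)  # 16-bit fixed-point words
banks = store_cyclic(W)
assert np.array_equal(read_row(banks, 2), W[2, :])
assert np.array_equal(read_col(banks, 1), W[:, 1])
```

The key property is that both a row read (feed-forward) and a column read (feed-backward, i.e., transpose) touch each bank exactly once, so ordinary single-port SRAMs can supply either access pattern in parallel without bank conflicts or a duplicated transposed copy of the weights.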

Original language: English (US)
Title of host publication: Proceedings - International SoC Design Conference, ISOCC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 21-22
Number of pages: 2
ISBN (Electronic): 9781728183312
DOIs
State: Published - Oct 21 2020
Event: 17th International System-on-Chip Design Conference, ISOCC 2020 - Yeosu, Korea, Republic of
Duration: Oct 21 2020 → Oct 24 2020

Publication series

Name: Proceedings - International SoC Design Conference, ISOCC 2020

Conference

Conference: 17th International System-on-Chip Design Conference, ISOCC 2020
Country/Territory: Korea, Republic of
City: Yeosu
Period: 10/21/20 → 10/24/20

Keywords

  • convolutional neural networks
  • energy efficiency
  • hardware accelerator
  • on-device training

ASJC Scopus subject areas

  • Energy Engineering and Power Technology
  • Electrical and Electronic Engineering
  • Instrumentation
  • Artificial Intelligence
  • Hardware and Architecture
