TY - JOUR
T1 - Power, performance, and area benefit of monolithic 3D ICs for on-chip deep neural networks targeting speech recognition
AU - Chang, Kyungwook
AU - Kadetotad, Deepak
AU - Cao, Yu
AU - Seo, Jae-sun
AU - Lim, Sung Kyu
N1 - Funding Information:
This work was supported in part by National Science Foundation grants 1652866 and 1715443, and by C-BRIC, one of six centers in JUMP, an SRC program sponsored by DARPA.
Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/11
Y1 - 2018/11
AB - In recent years, deep learning has become widespread across real-world recognition tasks. Beyond recognition accuracy, energy efficiency and speed (i.e., performance) remain grand challenges for enabling local intelligence in edge devices. In this article, we investigate the adoption of monolithic three-dimensional (3D) IC (M3D) technology for deep learning hardware design, using speech recognition as a test vehicle. M3D has recently proven to be one of the leading contenders for addressing the power, performance, and area (PPA) scaling challenges of advanced technology nodes. Our study examines how key parameters of DNN hardware implementations affect performance and energy efficiency, including DNN architectural choices, underlying workloads, and tier partitioning choices in M3D designs. Our post-layout M3D designs, together with hardware-efficient sparse algorithms, deliver power savings and performance improvements beyond what conventional 2D ICs can achieve. Experimental results show that M3D offers a 22.3% iso-performance power saving and a 6.2% performance improvement, demonstrating its promise as a solution for DNN ASICs. We further present architectural and physical design guidelines for M3D DNNs to maximize these benefits.
KW - High performance design
KW - Low power design
KW - Monolithic 3D IC
KW - On-chip deep neural networks
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85057725846&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057725846&partnerID=8YFLogxK
DO - 10.1145/3273956
M3 - Article
AN - SCOPUS:85057725846
SN - 1550-4832
VL - 14
JO - ACM Journal on Emerging Technologies in Computing Systems
JF - ACM Journal on Emerging Technologies in Computing Systems
IS - 4
M1 - a42
ER -