TY - GEN
T1 - A Configurable BNN ASIC using a Network of Programmable Threshold Logic Standard Cells
AU - Wagle, Ankit
AU - Khatri, Sunil
AU - Vrudhula, Sarma
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10
Y1 - 2020/10
N2 - This paper presents Tulip, a new architecture for a binary neural network (BNN) that uses an optimal schedule for executing the operations of an arbitrary BNN. It was constructed with the goal of maximizing energy efficiency per classification. At the top level, Tulip consists of a collection of unique processing elements (TULIP-PEs) that are organized in a SIMD fashion. Each TULIP-PE consists of a small network of binary neurons, and a small amount of local memory per neuron. The unique aspect of the binary neuron is that it is implemented as a mixed-signal circuit that natively performs the inner-product and thresholding operation of an artificial binary neuron. Moreover, the binary neuron, which is implemented as a single CMOS standard cell, is reconfigurable, and with a change in a single parameter, can implement all standard operations involved in a BNN. We present novel algorithms for mapping arbitrary nodes of a BNN onto the TULIP-PEs. Tulip was implemented as an ASIC in TSMC 40nm-LP technology. To provide a fair comparison, a recently reported BNN that employs a conventional MAC-based arithmetic processor was also implemented in the same technology. The results show that Tulip is consistently 3X more energy-efficient than the conventional design, without any penalty in performance, area, or accuracy.
AB - This paper presents Tulip, a new architecture for a binary neural network (BNN) that uses an optimal schedule for executing the operations of an arbitrary BNN. It was constructed with the goal of maximizing energy efficiency per classification. At the top level, Tulip consists of a collection of unique processing elements (TULIP-PEs) that are organized in a SIMD fashion. Each TULIP-PE consists of a small network of binary neurons, and a small amount of local memory per neuron. The unique aspect of the binary neuron is that it is implemented as a mixed-signal circuit that natively performs the inner-product and thresholding operation of an artificial binary neuron. Moreover, the binary neuron, which is implemented as a single CMOS standard cell, is reconfigurable, and with a change in a single parameter, can implement all standard operations involved in a BNN. We present novel algorithms for mapping arbitrary nodes of a BNN onto the TULIP-PEs. Tulip was implemented as an ASIC in TSMC 40nm-LP technology. To provide a fair comparison, a recently reported BNN that employs a conventional MAC-based arithmetic processor was also implemented in the same technology. The results show that Tulip is consistently 3X more energy-efficient than the conventional design, without any penalty in performance, area, or accuracy.
KW - BNN
KW - Threshold logic
KW - area-efficient
KW - energy-efficient
KW - high-throughput
KW - high-performance
KW - reconfigurable
UR - http://www.scopus.com/inward/record.url?scp=85098883589&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098883589&partnerID=8YFLogxK
U2 - 10.1109/ICCD50377.2020.00079
DO - 10.1109/ICCD50377.2020.00079
M3 - Conference contribution
AN - SCOPUS:85098883589
T3 - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
SP - 433
EP - 440
BT - Proceedings - 2020 IEEE 38th International Conference on Computer Design, ICCD 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 38th IEEE International Conference on Computer Design, ICCD 2020
Y2 - 18 October 2020 through 21 October 2020
ER -