TY - GEN
T1 - Fully parallel RRAM synaptic array for implementing binary neural network with (+1, -1) weights and (+1, 0) neurons
AU - Sun, Xiaoyu
AU - Peng, Xiaochen
AU - Chen, Pai Yu
AU - Liu, Rui
AU - Seo, Jae-sun
AU - Yu, Shimeng
N1 - Funding Information:
This work is in part supported by NSF-CCF-1552687, NSF-CCF-1740225, and grants from Qualcomm.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/2/20
Y1 - 2018/2/20
N2 - Binary Neural Networks (BNNs) have been recently proposed to improve the area-/energy-efficiency of the machine/deep learning hardware accelerators, which opens an opportunity to use the technologically more mature binary RRAM devices to effectively implement the binary synaptic weights. In addition, the binary neuron activation enables using the sense amplifier instead of the analog-to-digital converter to allow bitwise communication between layers of the neural networks. However, the sense amplifier has intrinsic offset that affects the threshold of binary neuron, thus it may degrade the classification accuracy. In this work, we analyze a fully parallel RRAM synaptic array architecture that implements the fully connected layers in a convolutional neural network with (+1, -1) weights and (+1, 0) neurons. The simulation results with TSMC 65 nm PDK show that the offset of current mode sense amplifier introduces a slight accuracy loss from ∼98.5% to ∼97.6% for MNIST dataset. Nevertheless, the proposed fully parallel BNN architecture (P-BNN) can achieve 137.35 TOPS/W energy efficiency for the inference, improved by ∼20X compared to the sequential BNN architecture (S-BNN) with row-by-row read-out scheme. Moreover, the proposed P-BNN architecture can save the chip area by ∼16% as it eliminates the area overhead of MAC peripheral units in the S-BNN architecture.
AB - Binary Neural Networks (BNNs) have been recently proposed to improve the area-/energy-efficiency of the machine/deep learning hardware accelerators, which opens an opportunity to use the technologically more mature binary RRAM devices to effectively implement the binary synaptic weights. In addition, the binary neuron activation enables using the sense amplifier instead of the analog-to-digital converter to allow bitwise communication between layers of the neural networks. However, the sense amplifier has intrinsic offset that affects the threshold of binary neuron, thus it may degrade the classification accuracy. In this work, we analyze a fully parallel RRAM synaptic array architecture that implements the fully connected layers in a convolutional neural network with (+1, -1) weights and (+1, 0) neurons. The simulation results with TSMC 65 nm PDK show that the offset of current mode sense amplifier introduces a slight accuracy loss from ∼98.5% to ∼97.6% for MNIST dataset. Nevertheless, the proposed fully parallel BNN architecture (P-BNN) can achieve 137.35 TOPS/W energy efficiency for the inference, improved by ∼20X compared to the sequential BNN architecture (S-BNN) with row-by-row read-out scheme. Moreover, the proposed P-BNN architecture can save the chip area by ∼16% as it eliminates the area overhead of MAC peripheral units in the S-BNN architecture.
UR - http://www.scopus.com/inward/record.url?scp=85045323481&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045323481&partnerID=8YFLogxK
U2 - 10.1109/ASPDAC.2018.8297384
DO - 10.1109/ASPDAC.2018.8297384
M3 - Conference contribution
AN - SCOPUS:85045323481
T3 - Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
SP - 574
EP - 579
BT - ASP-DAC 2018 - 23rd Asia and South Pacific Design Automation Conference, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd Asia and South Pacific Design Automation Conference, ASP-DAC 2018
Y2 - 22 January 2018 through 25 January 2018
ER -