Abstract
Recent advances in deep learning have shown that Binary Neural Networks (BNNs) can achieve satisfactory accuracy on various image datasets with a significant reduction in computation and memory cost. With both weights and activations binarized to +1 or -1 in BNNs, the high-precision multiply-and-accumulate (MAC) operations can be replaced by XNOR and bit-counting operations. In this work, we present two computing-in-memory (CIM) architectures with parallelized weighted-sum operation for accelerating BNN inference: 1) parallel XNOR-SRAM, where a customized 8T-SRAM cell is used as a synapse; 2) parallel XNOR-RRAM, where a customized bit-cell consisting of 2T2R cells is used as a synapse. For the large-scale weight matrices in neural networks, array partitioning is necessary, and multi-level sense amplifiers (MLSAs) are employed as the intermediate interface for accumulating partial weighted sums. We explore various design options with different sub-array sizes and sensing bit-levels. Simulation results with a 65nm CMOS PDK and RRAM models show that the system with 128×128 sub-array size and 3-bit MLSA can achieve 87.46% accuracy for a VGG-inspired network on the CIFAR-10 dataset, less than 1% degradation from the ideal software accuracy. The estimated energy efficiency of XNOR-SRAM and XNOR-RRAM shows a ~30× improvement over the corresponding conventional SRAM and RRAM architectures with sequential row-by-row read-out.
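The XNOR/bit-count substitution and the MLSA-style accumulation of partitioned partial sums can be illustrated with a short numerical sketch. The snippet below is not from the paper; `xnor_popcount_dot`, `quantized_partial_sums`, and the uniform quantizer are illustrative assumptions standing in for the circuit-level behavior described in the abstract.

```python
# Minimal sketch (assumptions, not the paper's implementation) of:
#  (1) why a MAC over {-1,+1} values reduces to XNOR + bit-counting, and
#  (2) how partial sums from sub-arrays might be accumulated after coarse
#      quantization, loosely analogous to a multi-level sense amplifier (MLSA).

import numpy as np

def xnor_popcount_dot(x_bits, w_bits):
    """Dot product of two {-1,+1} vectors encoded as {0,1} bits.

    With the encoding 0 -> -1 and 1 -> +1, each product x_i * w_i is +1 exactly
    when the two bits agree (XNOR = 1) and -1 otherwise, so
        sum_i x_i * w_i = 2 * popcount(XNOR(x, w)) - N.
    """
    n = x_bits.size
    agree = ~(x_bits ^ w_bits) & 1          # 1 where the bits agree (XNOR)
    return 2 * int(agree.sum()) - n

def quantized_partial_sums(x_bits, w_bits, sub_array=128, levels=8):
    """Split a long weighted sum into sub-array partial sums, quantize each
    to a small number of levels (8 levels ~ 3-bit sensing), then accumulate.
    The uniform quantizer here is only a crude software stand-in for the MLSA."""
    total = 0.0
    for start in range(0, x_bits.size, sub_array):
        xs = x_bits[start:start + sub_array]
        ws = w_bits[start:start + sub_array]
        ps = xnor_popcount_dot(xs, ws)      # partial sum in [-len, +len]
        length = xs.size
        step = 2 * length / (levels - 1)
        total += round((ps + length) / step) * step - length
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, size=512)
    w = rng.integers(0, 2, size=512)
    exact = int(np.dot(2 * x - 1, 2 * w - 1))   # reference full-precision +/-1 MAC
    print(exact, xnor_popcount_dot(x, w), quantized_partial_sums(x, w))
```

Running the sketch shows the XNOR/popcount result matching the exact ±1 dot product, while the quantized partial-sum accumulation deviates only slightly, which is the kind of small accuracy degradation the abstract attributes to finite MLSA bit-levels.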
Original language | English (US) |
---|---|
Title of host publication | 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology, ICSICT 2018 - Proceedings |
Editors | Ting-Ao Tang, Fan Ye, Yu-Long Jiang |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781538644409 |
DOIs | https://doi.org/10.1109/ICSICT.2018.8565811 |
State | Published - Dec 5 2018 |
Event | 14th IEEE International Conference on Solid-State and Integrated Circuit Technology, ICSICT 2018 - Qingdao, China Duration: Oct 31 2018 → Nov 3 2018 |
Other
Other | 14th IEEE International Conference on Solid-State and Integrated Circuit Technology, ICSICT 2018 |
---|---|
Country | China |
City | Qingdao |
Period | 10/31/18 → 11/3/18 |
ASJC Scopus subject areas
- Electrical and Electronic Engineering
Cite this
Computing-in-Memory with SRAM and RRAM for Binary Neural Networks. / Sun, Xiaoyu; Liu, Rui; Peng, Xiaochen; Yu, Shimeng.
2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology, ICSICT 2018 - Proceedings. ed. / Ting-Ao Tang; Fan Ye; Yu-Long Jiang. Institute of Electrical and Electronics Engineers Inc., 2018. 8565811.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - Computing-in-Memory with SRAM and RRAM for Binary Neural Networks
AU - Sun, Xiaoyu
AU - Liu, Rui
AU - Peng, Xiaochen
AU - Yu, Shimeng
PY - 2018/12/5
Y1 - 2018/12/5
AB - Recent advances in deep learning have shown that Binary Neural Networks (BNNs) can achieve satisfactory accuracy on various image datasets with a significant reduction in computation and memory cost. With both weights and activations binarized to +1 or -1 in BNNs, the high-precision multiply-and-accumulate (MAC) operations can be replaced by XNOR and bit-counting operations. In this work, we present two computing-in-memory (CIM) architectures with parallelized weighted-sum operation for accelerating BNN inference: 1) parallel XNOR-SRAM, where a customized 8T-SRAM cell is used as a synapse; 2) parallel XNOR-RRAM, where a customized bit-cell consisting of 2T2R cells is used as a synapse. For the large-scale weight matrices in neural networks, array partitioning is necessary, and multi-level sense amplifiers (MLSAs) are employed as the intermediate interface for accumulating partial weighted sums. We explore various design options with different sub-array sizes and sensing bit-levels. Simulation results with a 65nm CMOS PDK and RRAM models show that the system with 128×128 sub-array size and 3-bit MLSA can achieve 87.46% accuracy for a VGG-inspired network on the CIFAR-10 dataset, less than 1% degradation from the ideal software accuracy. The estimated energy efficiency of XNOR-SRAM and XNOR-RRAM shows a ~30× improvement over the corresponding conventional SRAM and RRAM architectures with sequential row-by-row read-out.
UR - http://www.scopus.com/inward/record.url?scp=85060288289&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060288289&partnerID=8YFLogxK
U2 - 10.1109/ICSICT.2018.8565811
DO - 10.1109/ICSICT.2018.8565811
M3 - Conference contribution
AN - SCOPUS:85060288289
BT - 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology, ICSICT 2018 - Proceedings
A2 - Tang, Ting-Ao
A2 - Ye, Fan
A2 - Jiang, Yu-Long
PB - Institute of Electrical and Electronics Engineers Inc.
ER -