We present a convolutional neural network (CNN) learning processor that accelerates stochastic gradient descent (SGD) with momentum in 16-bit fixed-point precision. A new cyclic weight storage and access scheme allows the same off-the-shelf SRAMs to serve nontranspose reads during the feedforward (FF) phase and transpose reads during the feedbackward (FB) phase of CNN learning. The 65-nm learning processor achieves a peak energy efficiency of 2.6 TOPS/W for 16-bit fixed-point operations, consuming 10.45 mW at 0.55 V.
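The abstract specifies SGD with momentum carried out in 16-bit fixed-point arithmetic but does not give the number format. The sketch below illustrates the general idea using an assumed Q8.8 split (8 fraction bits); `FRAC_BITS`, `to_fixed`, and `sgd_momentum_step` are hypothetical names for illustration, not the processor's actual datapath.

```python
import numpy as np

FRAC_BITS = 8            # assumed Q8.8 split; the paper states only 16-bit fixed point
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Quantize a float array to signed 16-bit fixed point (round to nearest, saturate)."""
    return np.clip(np.round(x * SCALE), -32768, 32767).astype(np.int16)

def to_float(q):
    """Recover the real value represented by a fixed-point array."""
    return q.astype(np.float32) / SCALE

def sgd_momentum_step(w_q, v_q, grad_q, lr=0.01, mu=0.9):
    """One SGD-with-momentum update, requantized to 16 bits after each step:
    v <- mu*v - lr*grad ; w <- w + v."""
    v_q = to_fixed(mu * to_float(v_q) - lr * to_float(grad_q))
    w_q = to_fixed(to_float(w_q) + to_float(v_q))
    return w_q, v_q

# Tiny demo: two weights, one update step.
w = to_fixed(np.array([0.5, -0.25]))
v = to_fixed(np.zeros(2))
g = to_fixed(np.array([1.0, -1.0]))
w, v = sgd_momentum_step(w, v, g)
```

Each weight moves opposite to its gradient, and every intermediate value is rounded back to 16 bits, mimicking the precision loss a fixed-point learning datapath incurs.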
- Convolutional neural networks (CNNs)
- dual-read-mode weight storage
- on-chip learning
- stochastic gradient descent (SGD)
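The dual-read-mode weight storage listed above must deliver both a row of the weight tile (FF, nontranspose) and a column (FB, transpose) from the same SRAM banks. The abstract does not spell out the cyclic scheme, so the sketch below shows one common diagonal (cyclic-shift) placement with that property; `store_cyclic`, `read_row`, and `read_col` are illustrative names, not the paper's interface.

```python
import numpy as np

def store_cyclic(W):
    """Place an N x N weight tile into N banks with a cyclic (diagonal)
    shift: W[i][j] goes to bank (i + j) % N at address i."""
    N = W.shape[0]
    banks = np.zeros((N, N), dtype=W.dtype)  # banks[bank][address]
    for i in range(N):
        for j in range(N):
            banks[(i + j) % N][i] = W[i][j]
    return banks

def read_row(banks, i):
    """Nontranspose (FF) read: every bank accessed once, all at address i."""
    N = banks.shape[0]
    return np.array([banks[(i + j) % N][i] for j in range(N)])

def read_col(banks, j):
    """Transpose (FB) read: every bank accessed once, each at a distinct address."""
    N = banks.shape[0]
    return np.array([banks[(i + j) % N][i] for i in range(N)])

# Tiny demo on a 4 x 4 tile.
W = np.arange(16, dtype=np.int16).reshape(4, 4)
banks = store_cyclic(W)
```

Because both access patterns touch each bank exactly once per cycle, single-port off-the-shelf SRAMs suffice for both read modes, which is the point of using a cyclic placement rather than duplicating the weights.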