Keyword detection is typically used as a front-end to trigger automatic speech recognition and spoken dialog systems. The detection engine needs to be continuously listening, which has strong implications for power and memory consumption. In this paper, we devise a neural network architecture for keyword detection and present a set of techniques for reducing the memory requirements in order to make the architecture suitable for resource-constrained hardware. Specifically, a fixed-point implementation is considered; aggressively scaling down the precision of the weights lowers the memory footprint compared to a naive floating-point implementation. For further optimization, a node pruning technique is proposed to identify and remove the least active nodes in a neural network. Experiments are conducted over 10 keywords selected from the Resource Management (RM) database. The trade-off between detection performance and memory is assessed for different weight representations. We show that a neural network with as few as 5 bits per weight yields a marginal and acceptable loss in performance, while requiring only 200 kilobytes (KB) of on-board memory and incurring a latency of 150 ms. A hardware architecture using a single multiplier and consuming less than 10 mW is also presented.
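The two memory-reduction techniques summarized above (fixed-point weight quantization and activity-based node pruning) can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, the uniform symmetric quantization scheme, and the mean-absolute-activation pruning criterion are illustrative assumptions consistent with the abstract's description.

```python
import numpy as np

def quantize_weights(w, num_bits=5):
    """Uniformly quantize weights to a signed fixed-point grid with
    num_bits bits (one sign bit), then de-quantize for simulation.
    Illustrative sketch; the paper's exact scheme may differ."""
    levels = 2 ** (num_bits - 1) - 1            # e.g. 15 positive levels for 5 bits
    scale = np.max(np.abs(w)) / levels          # map max |weight| to the top code
    codes = np.round(w / scale)                 # integer codes in [-levels, levels]
    return codes * scale                        # reconstructed (quantized) weights

def prune_least_active(activations, weights, keep_ratio=0.5):
    """Remove the hidden nodes with the lowest mean absolute activation
    over a batch; returns the pruned weight matrix and kept indices.
    Assumed pruning criterion, hedged as one plausible reading of
    'least active nodes'."""
    mean_act = np.mean(np.abs(activations), axis=0)
    n_keep = max(1, int(keep_ratio * weights.shape[1]))
    keep = np.sort(np.argsort(mean_act)[::-1][:n_keep])
    return weights[:, keep], keep
```

With 5 bits per weight, the quantization error of each weight is bounded by half a quantization step, which is what makes the aggressive precision reduction tolerable in practice.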