TY - GEN
T1 - ASP vision
T2 - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
AU - Chen, Huaijin G.
AU - Jayasuriya, Suren
AU - Yang, Jiyue
AU - Stephen, Judy
AU - Sivaramakrishnan, Sriram
AU - Veeraraghavan, Ashok
AU - Molnar, Alyosha
PY - 2016/12/9
Y1 - 2016/12/9
AB - Deep learning using convolutional neural networks (CNNs) is quickly becoming the state-of-the-art for challenging computer vision applications. However, deep learning's power consumption and bandwidth requirements currently limit its application in embedded and mobile systems with tight energy budgets. In this paper, we explore the energy savings of optically computing the first layer of CNNs. To do so, we utilize bio-inspired Angle Sensitive Pixels (ASPs), custom CMOS diffractive image sensors which act similar to Gabor filter banks in the V1 layer of the human visual cortex. ASPs replace both image sensing and the first layer of a conventional CNN by directly performing optical edge filtering, saving sensing energy, data bandwidth, and CNN FLOPS to compute. Our experimental results (both on synthetic data and a hardware prototype) for a variety of vision tasks such as digit recognition, object recognition, and face identification demonstrate 97% reduction in image sensor power consumption and 90% reduction in data bandwidth from sensor to CPU, while achieving similar performance compared to traditional deep learning pipelines.
UR - http://www.scopus.com/inward/record.url?scp=84986317236&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84986317236&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2016.104
DO - 10.1109/CVPR.2016.104
M3 - Conference contribution
AN - SCOPUS:84986317236
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 903
EP - 912
BT - Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
PB - IEEE Computer Society
Y2 - 26 June 2016 through 1 July 2016
ER -