TY - GEN
T1 - Spatial-temporal Data Compression of Dynamic Vision Sensor Output with High Pixel-level Saliency using Low-precision Sparse Autoencoder
AU - Hasssan, Ahmed
AU - Meng, Jian
AU - Cao, Yu
AU - Seo, Jae Sun
N1 - Funding Information:
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001121C0134. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency (DARPA).
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Imaging innovations such as the dynamic vision sensor (DVS) can significantly reduce image data volume by tracking only changes in events. However, when the DVS camera itself moves (e.g., in self-driving cars), the DVS output stream is not sparse enough to achieve the desired hardware efficiency. In this work, we investigate designing a compact sparse autoencoder model to largely compress event-based DVS output. The proposed encoder-decoder-based autoencoder is a shallow convolutional neural network (CNN) architecture with two convolution and two inverse-convolution layers and only ~10k parameters. We apply quantization-aware training to our proposed model to achieve 2-bit and 4-bit precision. Moreover, we apply unstructured pruning to the encoder module to achieve >90% active-pixel compression at the latent space. The proposed autoencoder design has been validated against multiple benchmark DVS-based datasets, including DVS-MNIST, N-Cars, DVS-IBM Gesture, and the Prophesee Automotive Gen1 dataset. We achieve low accuracy drops of 2%, 3%, and 3.8% compared to the uncompressed baseline, with 7.08%, 1.36%, and 5.53% active pixels in the images from the decoder (compression ratios of 13.1×, 29.1×, and 18.1×) for the DVS-MNIST, N-Cars, and DVS-IBM Gesture datasets, respectively. For the Prophesee Automotive Gen1 dataset, we achieve a minimal mAP drop of 0.07 from the baseline with 9% active pixels in the images from the decoder (compression ratio of 11.9×).
AB - Imaging innovations such as the dynamic vision sensor (DVS) can significantly reduce image data volume by tracking only changes in events. However, when the DVS camera itself moves (e.g., in self-driving cars), the DVS output stream is not sparse enough to achieve the desired hardware efficiency. In this work, we investigate designing a compact sparse autoencoder model to largely compress event-based DVS output. The proposed encoder-decoder-based autoencoder is a shallow convolutional neural network (CNN) architecture with two convolution and two inverse-convolution layers and only ~10k parameters. We apply quantization-aware training to our proposed model to achieve 2-bit and 4-bit precision. Moreover, we apply unstructured pruning to the encoder module to achieve >90% active-pixel compression at the latent space. The proposed autoencoder design has been validated against multiple benchmark DVS-based datasets, including DVS-MNIST, N-Cars, DVS-IBM Gesture, and the Prophesee Automotive Gen1 dataset. We achieve low accuracy drops of 2%, 3%, and 3.8% compared to the uncompressed baseline, with 7.08%, 1.36%, and 5.53% active pixels in the images from the decoder (compression ratios of 13.1×, 29.1×, and 18.1×) for the DVS-MNIST, N-Cars, and DVS-IBM Gesture datasets, respectively. For the Prophesee Automotive Gen1 dataset, we achieve a minimal mAP drop of 0.07 from the baseline with 9% active pixels in the images from the decoder (compression ratio of 11.9×).
KW - Detection
KW - Dynamic vision sensor
KW - Neural network pruning
KW - Neural network training
KW - Object recognition
KW - Quantization
KW - Sparse autoencoder
UR - http://www.scopus.com/inward/record.url?scp=85150197099&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85150197099&partnerID=8YFLogxK
U2 - 10.1109/IEEECONF56349.2022.10051946
DO - 10.1109/IEEECONF56349.2022.10051946
M3 - Conference contribution
AN - SCOPUS:85150197099
T3 - Conference Record - Asilomar Conference on Signals, Systems and Computers
SP - 344
EP - 348
BT - 56th Asilomar Conference on Signals, Systems and Computers, ACSSC 2022
A2 - Matthews, Michael B.
PB - IEEE Computer Society
T2 - 56th Asilomar Conference on Signals, Systems and Computers, ACSSC 2022
Y2 - 31 October 2022 through 2 November 2022
ER -