Machine learning at facebook

Understanding inference at the edge

Carole-Jean Wu, David Brooks, Kevin Chen, Douglas Chen, Sy Choudhury, Marat Dukhan, Kim Hazelwood, Eldad Isaac, Yangqing Jia, Bill Jia, Tommer Leyvand, Hao Lu, Yang Lu, Lin Qiao, Brandon Reagen, Joe Spisak, Fei Sun, Andrew Tulloch, Peter Vajda, Xiaodong Wang, Yanghan Wang, Bram Wasti, Yiming Wu, Ran Xian, Sungjoo Yoo, Peizhao Zhang

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

2 Citations (Scopus)

Abstract

At Facebook, machine learning provides a wide range of capabilities that drive many aspects of user experience including ranking posts, content understanding, object detection and tracking for augmented and virtual reality, speech and text translations. While machine learning models are currently trained on customized datacenter infrastructure, Facebook is working to bring machine learning inference to the edge. By doing so, user experience is improved with reduced latency (inference time) and becomes less dependent on network connectivity. Furthermore, this also enables many more applications of deep learning with important features only made available at the edge. This paper takes a data-driven approach to present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.

Original language: English (US)
Title of host publication: Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 331-344
Number of pages: 14
ISBN (Electronic): 9781728114446
DOIs: https://doi.org/10.1109/HPCA.2019.00048
State: Published - Mar 26 2019
Externally published: Yes
Event: 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 - Washington, United States
Duration: Feb 16 2019 to Feb 20 2019

Publication series

Name: Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019

Conference

Conference: 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
Country: United States
City: Washington
Period: 2/16/19 to 2/20/19

Fingerprint

Learning systems
Augmented reality
Smartphones
Virtual reality

Keywords

  • Edge Inference
  • Machine learning

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Wu, C-J., Brooks, D., Chen, K., Chen, D., Choudhury, S., Dukhan, M., ... Zhang, P. (2019). Machine learning at facebook: Understanding inference at the edge. In Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 (pp. 331-344). [8675201] (Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/HPCA.2019.00048

@inproceedings{129caeecbab449ef854c1257973d9f75,
title = "Machine learning at facebook: Understanding inference at the edge",
abstract = "At Facebook, machine learning provides a wide range of capabilities that drive many aspects of user experience including ranking posts, content understanding, object detection and tracking for augmented and virtual reality, speech and text translations. While machine learning models are currently trained on customized datacenter infrastructure, Facebook is working to bring machine learning inference to the edge. By doing so, user experience is improved with reduced latency (inference time) and becomes less dependent on network connectivity. Furthermore, this also enables many more applications of deep learning with important features only made available at the edge. This paper takes a data-driven approach to present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.",
keywords = "Edge Inference, Machine learning",
author = "Carole-Jean Wu and David Brooks and Kevin Chen and Douglas Chen and Sy Choudhury and Marat Dukhan and Kim Hazelwood and Eldad Isaac and Yangqing Jia and Bill Jia and Tommer Leyvand and Hao Lu and Yang Lu and Lin Qiao and Brandon Reagen and Joe Spisak and Fei Sun and Andrew Tulloch and Peter Vajda and Xiaodong Wang and Yanghan Wang and Bram Wasti and Yiming Wu and Ran Xian and Sungjoo Yoo and Peizhao Zhang",
year = "2019",
month = "3",
day = "26",
doi = "10.1109/HPCA.2019.00048",
language = "English (US)",
series = "Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "331--344",
booktitle = "Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019",

}

TY - GEN

T1 - Machine learning at facebook

T2 - Understanding inference at the edge

AU - Wu, Carole-Jean

AU - Brooks, David

AU - Chen, Kevin

AU - Chen, Douglas

AU - Choudhury, Sy

AU - Dukhan, Marat

AU - Hazelwood, Kim

AU - Isaac, Eldad

AU - Jia, Yangqing

AU - Jia, Bill

AU - Leyvand, Tommer

AU - Lu, Hao

AU - Lu, Yang

AU - Qiao, Lin

AU - Reagen, Brandon

AU - Spisak, Joe

AU - Sun, Fei

AU - Tulloch, Andrew

AU - Vajda, Peter

AU - Wang, Xiaodong

AU - Wang, Yanghan

AU - Wasti, Bram

AU - Wu, Yiming

AU - Xian, Ran

AU - Yoo, Sungjoo

AU - Zhang, Peizhao

PY - 2019/3/26

Y1 - 2019/3/26

N2 - At Facebook, machine learning provides a wide range of capabilities that drive many aspects of user experience including ranking posts, content understanding, object detection and tracking for augmented and virtual reality, speech and text translations. While machine learning models are currently trained on customized datacenter infrastructure, Facebook is working to bring machine learning inference to the edge. By doing so, user experience is improved with reduced latency (inference time) and becomes less dependent on network connectivity. Furthermore, this also enables many more applications of deep learning with important features only made available at the edge. This paper takes a data-driven approach to present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.

AB - At Facebook, machine learning provides a wide range of capabilities that drive many aspects of user experience including ranking posts, content understanding, object detection and tracking for augmented and virtual reality, speech and text translations. While machine learning models are currently trained on customized datacenter infrastructure, Facebook is working to bring machine learning inference to the edge. By doing so, user experience is improved with reduced latency (inference time) and becomes less dependent on network connectivity. Furthermore, this also enables many more applications of deep learning with important features only made available at the edge. This paper takes a data-driven approach to present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.

KW - Edge Inference

KW - Machine learning

UR - http://www.scopus.com/inward/record.url?scp=85064189318&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064189318&partnerID=8YFLogxK

U2 - 10.1109/HPCA.2019.00048

DO - 10.1109/HPCA.2019.00048

M3 - Conference contribution

T3 - Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019

SP - 331

EP - 344

BT - Proceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019

PB - Institute of Electrical and Electronics Engineers Inc.

ER -