The architectural implications of facebook's DNN-based personalized recommendation

Udit Gupta, Carole Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, Kim Hazelwood, Mark Hempstead, Bill Jia, Hsien Hsin S. Lee, Andrey Malevich, Dheevatsa Mudigere, Mikhail Smelyanskiy, Liang Xiong, Xuan Zhang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

142 Scopus citations

Abstract

The widespread application of deep learning has changed the landscape of computation in data centers. In particular, personalized recommendation for content ranking is now largely accomplished using deep neural networks. However, despite their importance and the amount of compute cycles they consume, relatively little research attention has been devoted to recommendation systems. To facilitate research and advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct in-depth analysis that underpins future system design and optimization for at-scale recommendation: Inference latency varies by 60% across three Intel server generations, batching and co-location of inference jobs can drastically improve latency-bounded throughput, and diversity across recommendation models leads to different optimization strategies.

Original languageEnglish (US)
Title of host publicationProceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages488-501
Number of pages14
ISBN (Electronic)9781728161495
DOIs
StatePublished - Feb 2020
Externally publishedYes
Event26th IEEE International Symposium on High Performance Computer Architecture, HPCA 2020 - San Diego, United States
Duration: Feb 22 2020Feb 26 2020

Publication series

NameProceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020

Conference

Conference26th IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
Country/TerritoryUnited States
CitySan Diego
Period2/22/202/26/20

ASJC Scopus subject areas

  • Artificial Intelligence
  • Hardware and Architecture
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'The architectural implications of facebook's DNN-based personalized recommendation'. Together they form a unique fingerprint.

Cite this