RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing

Liu Ke, Udit Gupta, Benjamin Youngjae Cho, David Brooks, Vikas Chandra, Utku Diril, Amin Firoozshahian, Kim Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Meng Li, Bert Maher, Dheevatsa Mudigere, Maxim Naumov, Martin Schatz, Mikhail Smelyanskiy, Xiaodong Wang, Brandon Reagen, Carole-Jean Wu, Mark Hempstead, Xuan Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

103 Scopus citations

Abstract

Personalized recommendation systems leverage deep learning models and account for the majority of data center AI cycles. Their performance is dominated by memory-bound sparse embedding operations with unique irregular memory access patterns that pose a fundamental challenge to accelerate. This paper proposes a lightweight, commodity DRAM compliant, near-memory processing solution to accelerate personalized recommendation inference. The in-depth characterization of production-grade recommendation models shows that embedding operations with high model-, operator- and data-level parallelism lead to memory bandwidth saturation, limiting recommendation inference performance. We propose RecNMP which provides a scalable solution to improve system throughput, supporting a broad range of sparse embedding models. RecNMP is specifically tailored to production environments with heavy co-location of operators on a single server. Several hardware/software co-optimization techniques such as memory-side caching, table-aware packet scheduling, and hot entry profiling are studied, providing up to 9.8× memory latency speedup over a highly-optimized baseline. Overall, RecNMP offers 4.2× throughput improvement and 45.8% memory energy savings.

Original language: English (US)
Title of host publication: Proceedings - 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 790-803
Number of pages: 14
ISBN (Electronic): 9781728146614
State: Published - May 2020
Externally published: Yes
Event: 47th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2020 - Virtual, Online, Spain
Duration: May 30 2020 to Jun 3 2020

Publication series

Name: Proceedings - International Symposium on Computer Architecture
Volume: 2020-May
ISSN (Print): 1063-6897

Conference

Conference: 47th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2020
Country/Territory: Spain
City: Virtual, Online
Period: 5/30/20 to 6/3/20

ASJC Scopus subject areas

  • Hardware and Architecture
