MLPerf Inference Benchmark

Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, Jeffery Liao, Anton Lokhmotov, Francisco Massa, Peng Meng, Paulius Micikevicius, Colin Osborne, Gennady Pekhimenko, Arun Tejusve Raghunath Rajan, Dilip Sequeira, Ashish Sirasao, Fei Sun, Hanlin Tang, Michael Thomson, Frank Wei, Ephrem Wu, Lingjie Xu, Koichi Yamada, Bing Yu, George Yuan, Aaron Zhong, Peizhao Zhang, Yuchen Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

213 Scopus citations

Abstract

Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.
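The benchmarking method summarized above centers on driving a system under test (SUT) with standardized query traffic and measuring its responses. As a purely illustrative aid, the sketch below shows how such a run might be set up with the open-source MLPerf LoadGen Python bindings; the module name, function signatures, and the SingleStream scenario are assumptions based on the public loadgen repository (signatures have varied across versions), not details taken from this record.

    # Minimal sketch of a LoadGen-driven benchmark run (assumed API).
    import mlperf_loadgen as lg

    def issue_queries(query_samples):
        # A real SUT would run inference on each sample; here we report
        # placeholder completions back to LoadGen (performance mode only).
        responses = [lg.QuerySampleResponse(s.id, 0, 0) for s in query_samples]
        lg.QuerySamplesComplete(responses)

    def flush_queries():
        pass  # drain any batched or queued work

    def load_samples(sample_indices):
        pass  # stage samples into memory (query sample library contract)

    def unload_samples(sample_indices):
        pass  # release staged samples

    settings = lg.TestSettings()
    settings.scenario = lg.TestScenario.SingleStream  # one of the MLPerf scenarios
    settings.mode = lg.TestMode.PerformanceOnly

    # ConstructSUT has taken extra callbacks in some loadgen versions.
    sut = lg.ConstructSUT(issue_queries, flush_queries)
    qsl = lg.ConstructQSL(1024, 1024, load_samples, unload_samples)
    lg.StartTest(sut, qsl, settings)
    lg.DestroyQSL(qsl)
    lg.DestroySUT(sut)

In an actual submission, the SUT callback would dispatch queries to the deployed model and the chosen scenario (e.g., SingleStream, Server, Offline) would determine how LoadGen times and paces them.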

Original language: English (US)
Title of host publication: Proceedings - 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 446-459
Number of pages: 14
ISBN (Electronic): 9781728146614
DOIs
State: Published - May 2020
Externally published: Yes
Event: 47th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2020 - Virtual, Online, Spain
Duration: May 30, 2020 - Jun 3, 2020

Publication series

Name: Proceedings - International Symposium on Computer Architecture
Volume: 2020-May
ISSN (Print): 1063-6897

Conference

Conference: 47th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2020
Country/Territory: Spain
City: Virtual, Online
Period: 5/30/20 - 6/3/20

Keywords

  • Benchmarking
  • Inference
  • Machine Learning

ASJC Scopus subject areas

  • Hardware and Architecture
