MCM-GPU: Multi-chip-module GPUs for continued performance scalability

Akhil Arunkumar, Evgeny Bolotin, Benjamin Cho, Ugljesa Milic, Eiman Ebrahimi, Oreste Villa, Aamer Jaleel, Carole-Jean Wu, David Nellans

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

47 Scopus citations

Abstract

Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at historical rates, the performance curve of single monolithic GPUs will ultimately plateau. However, the need for higher performing GPUs continues to exist in many domains. To address this need, in this paper we demonstrate that package-level integration of multiple GPU modules to build larger logical GPUs can enable continuous performance scaling beyond Moore's law. Specifically, we propose partitioning GPUs into easily manufacturable basic GPU Modules (GPMs), and integrating them on package using high bandwidth and power efficient signaling technologies. We lay out the details and evaluate the feasibility of a basic Multi-Chip-Module GPU (MCM-GPU) design. We then propose three architectural optimizations that significantly improve GPM data locality and minimize the sensitivity to inter-GPM bandwidth. Our evaluation shows that the optimized MCM-GPU achieves 22.8% speedup and 5x inter-GPM bandwidth reduction when compared to the basic MCM-GPU architecture. Most importantly, the optimized MCM-GPU design is 45.5% faster than the largest implementable monolithic GPU, and performs within 10% of a hypothetical (and unbuildable) monolithic GPU. Lastly, we show that our optimized MCM-GPU is 26.8% faster than an equally equipped Multi-GPU system with the same total number of SMs and DRAM bandwidth.

Original language: English (US)
Title of host publication: ISCA 2017 - 44th Annual International Symposium on Computer Architecture - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 320-332
Number of pages: 13
Volume: Part F128643
ISBN (Electronic): 9781450348928
DOIs: https://doi.org/10.1145/3079856.3080231
State: Published - Jun 24 2017
Externally published: Yes
Event: 44th Annual International Symposium on Computer Architecture - ISCA 2017 - Toronto, Canada
Duration: Jun 24 2017 → Jun 28 2017

Other

Other: 44th Annual International Symposium on Computer Architecture - ISCA 2017
Country: Canada
City: Toronto
Period: 6/24/17 → 6/28/17

Keywords

  • Graphics Processing Units
  • Moore's Law
  • Multi-Chip-Modules
  • NUMA Systems

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Arunkumar, A., Bolotin, E., Cho, B., Milic, U., Ebrahimi, E., Villa, O., Jaleel, A., Wu, C-J., & Nellans, D. (2017). MCM-GPU: Multi-chip-module GPUs for continued performance scalability. In ISCA 2017 - 44th Annual International Symposium on Computer Architecture - Conference Proceedings (Vol. Part F128643, pp. 320-332). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3079856.3080231