Abstract
We present Versa, an energy-efficient 36-core systolic multiprocessor with dynamically reconfigurable interconnects and memory. Versa leverages reconfigurable functional units and systolic-enhanced ARM cores to adapt to different algorithm characteristics, providing optimized bandwidth, access latency, and data reuse. Hardware support for crucial thread-synchronization operations enables a tree-based algorithm with a 6.5× improvement in synchronization latency. Measured on a diverse set of compute kernels, Versa's design features culminate in median energy-efficiency improvements of 37.2× and 11.6× over mobile CPU and GPU baselines, respectively.
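
To illustrate why a tree-based synchronization scheme can shorten barrier latency on a 36-core machine, the sketch below counts combining rounds under a simple, assumed cost model. It is not Versa's hardware mechanism or the paper's algorithm: the function names, the `fan_in` parameter, and the one-step-per-round model are hypothetical, and the sketch makes no attempt to reproduce the reported 6.5× figure.

```python
import math


def tree_barrier_rounds(num_cores: int, fan_in: int = 2) -> int:
    """Count rounds until arrival signals from all cores reach the tree root.

    Assumed model: in each round, groups of up to `fan_in` arrival signals
    are combined into a single signal at the next tree level.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    rounds = 0
    remaining = num_cores
    while remaining > 1:
        # Each group of up to fan_in signals collapses into one parent signal.
        remaining = math.ceil(remaining / fan_in)
        rounds += 1
    return rounds


def linear_barrier_steps(num_cores: int) -> int:
    """Count steps when one core collects every other arrival serially.

    Assumed model: the collecting core handles one arrival per step.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return num_cores - 1


if __name__ == "__main__":
    cores = 36  # core count taken from the abstract
    print("linear steps:", linear_barrier_steps(cores))
    print("binary tree rounds:", tree_barrier_rounds(cores, fan_in=2))
    print("6-ary tree rounds:", tree_barrier_rounds(cores, fan_in=6))
```

Under this model, the arrival phase of a 36-core barrier takes 35 serial steps with a single collector, 6 rounds with a binary tree, and 2 rounds with a fan-in of 6. Real latency also depends on the release phase and on the cost of each combining step, which the model ignores.
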
Original language | English (US) |
---|---|
Pages (from-to) | 986-998 |
Number of pages | 13 |
Journal | IEEE Journal of Solid-State Circuits |
Volume | 57 |
Issue number | 4 |
State | Published - Apr 1 2022 |
Externally published | Yes |
Keywords
- Accelerators
- data movement
- data reuse
- energy efficiency
- interconnect
- multicore architecture
- on-chip memory
- programmability
- reconfiguration
- systolic arrays
ASJC Scopus subject areas
- Electrical and Electronic Engineering