TY - GEN
T1 - Ctrl-C
T2 - 34th IEEE International Conference on Computer Design, ICCD 2016
AU - Lee, Shin Ying
AU - Wu, Carole-Jean
N1 - Funding Information:
The authors would like to thank Dr. Amrit Panda and the anonymous reviewers for their insightful feedback. This work is supported in part by the National Science Foundation (Grant #CCF-1618039) and by Science Foundation Arizona under the Bisgrove Early Career Scholarship. The opinions, findings and conclusions or recommendations expressed in this manuscript are those of the authors and do not necessarily reflect the views of the Science Foundation Arizona.
Publisher Copyright:
© 2016 IEEE.
PY - 2016/11/22
Y1 - 2016/11/22
N2 - The performance of general-purpose graphics processing units (GPGPUs) is often limited by the efficiency of the memory subsystem, particularly the L1 data caches. Because of the massively multithreaded computation paradigm, significant memory resource contention and cache thrashing are often observed in GPGPU workloads. This leads to high cache miss rates and substantial pipeline stall time. To improve the efficiency of GPU caches, we propose an instruction-aware, control-loop-based adaptive cache bypassing design (Ctrl-C). Ctrl-C applies an instruction-aware algorithm to dynamically identify the cache reuse behavior of each memory instruction. Ctrl-C then adopts feedback control loops to bypass memory requests probabilistically in order to protect cache lines with short reuse distances from early eviction. Simulation-based evaluation with GPGPU-Sim shows that Ctrl-C improves the performance of cache-sensitive GPGPU workloads by 41.5%, leading to higher cache and interconnect bandwidth utilization with only an insignificant 3.5% area overhead.
AB - The performance of general-purpose graphics processing units (GPGPUs) is often limited by the efficiency of the memory subsystem, particularly the L1 data caches. Because of the massively multithreaded computation paradigm, significant memory resource contention and cache thrashing are often observed in GPGPU workloads. This leads to high cache miss rates and substantial pipeline stall time. To improve the efficiency of GPU caches, we propose an instruction-aware, control-loop-based adaptive cache bypassing design (Ctrl-C). Ctrl-C applies an instruction-aware algorithm to dynamically identify the cache reuse behavior of each memory instruction. Ctrl-C then adopts feedback control loops to bypass memory requests probabilistically in order to protect cache lines with short reuse distances from early eviction. Simulation-based evaluation with GPGPU-Sim shows that Ctrl-C improves the performance of cache-sensitive GPGPU workloads by 41.5%, leading to higher cache and interconnect bandwidth utilization with only an insignificant 3.5% area overhead.
UR - http://www.scopus.com/inward/record.url?scp=85006826095&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85006826095&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2016.7753271
DO - 10.1109/ICCD.2016.7753271
M3 - Conference contribution
AN - SCOPUS:85006826095
T3 - Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016
SP - 133
EP - 140
BT - Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 October 2016 through 5 October 2016
ER -