Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs

Shin Ying Lee; Carole-Jean Wu

doi:10.1109/ICCD.2016.7753271

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Scopus citations

Abstract

The performance of general-purpose graphics processing units (GPGPUs) is often limited by the efficiency of the memory subsystem, particularly the L1 data caches. Because of the massively multithreaded computation paradigm, GPGPU workloads often exhibit significant memory resource contention and cache thrashing, which lead to high cache miss rates and substantial pipeline stall time. To improve the efficiency of GPU caches, we propose an instruction-aware, control-loop-based adaptive cache bypassing design (Ctrl-C). Ctrl-C applies an instruction-aware algorithm to dynamically identify the cache reuse behavior of each memory instruction. It then uses feedback control loops to bypass memory requests probabilistically, protecting cache lines with short reuse distances from early eviction. Evaluation with the GPGPU-Sim simulator shows that Ctrl-C improves the performance of cache-sensitive GPGPU workloads by 41.5%, leading to higher cache and interconnect bandwidth utilization with only an insignificant 3.5% area overhead.
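The abstract names two mechanisms, per-instruction reuse tracking and a feedback loop that sets a bypass probability, without giving their details. The Python sketch below is only an illustration of that general idea, not the authors' design: the class `BypassController`, the proportional update rule, the `target_hit_rate` setpoint, the `gain` value, and the epoch-based update are all assumptions introduced here.

```python
import random
from dataclasses import dataclass


@dataclass
class BypassController:
    """Hypothetical feedback controller for one memory instruction (one PC).

    Assumption: a simple proportional controller; the paper's actual
    control law and parameters are not stated in the abstract.
    """
    target_hit_rate: float = 0.5  # assumed setpoint
    gain: float = 0.1             # assumed proportional gain
    bypass_prob: float = 0.0      # probability of bypassing the L1 cache
    hits: int = 0
    accesses: int = 0

    def record(self, hit: bool) -> None:
        """Log the outcome of one cached access issued by this instruction."""
        self.accesses += 1
        self.hits += int(hit)

    def update(self) -> None:
        """End-of-epoch step: bypass more when observed reuse is below target."""
        if self.accesses == 0:
            return
        error = self.target_hit_rate - self.hits / self.accesses
        # Clamp so the result stays a valid probability.
        self.bypass_prob = min(1.0, max(0.0, self.bypass_prob + self.gain * error))
        self.hits = 0
        self.accesses = 0

    def should_bypass(self, rng: random.Random) -> bool:
        """Probabilistic decision for the next request from this instruction."""
        return rng.random() < self.bypass_prob


# One controller per memory-instruction PC (hypothetical lookup table).
controllers = {}


def controller_for(pc: int) -> BypassController:
    return controllers.setdefault(pc, BypassController())
```

In this sketch, an instruction whose requests rarely hit in the cache accumulates a higher bypass probability over successive epochs, so fewer of its lines are inserted and lines from instructions with better reuse are less likely to be evicted early.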

Original language	English (US)
Title of host publication	Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	133-140
Number of pages	8
ISBN (Electronic)	9781509051427
DOIs	https://doi.org/10.1109/ICCD.2016.7753271
State	Published - Nov 22 2016
Event	34th IEEE International Conference on Computer Design, ICCD 2016 - Scottsdale, United States; Duration: Oct 2 2016 → Oct 5 2016

Publication series

Name	Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016

Other

Other	34th IEEE International Conference on Computer Design, ICCD 2016
Country/Territory	United States
City	Scottsdale
Period	10/2/16 → 10/5/16

ASJC Scopus subject areas

Hardware and Architecture

Access to Document

10.1109/ICCD.2016.7753271

Cite this

Lee, S. Y., & Wu, C.-J. (2016). Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs. In Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016 (pp. 133-140). Article 7753271 (Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCD.2016.7753271

Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs. / Lee, Shin Ying; Wu, Carole-Jean.
Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016. Institute of Electrical and Electronics Engineers Inc., 2016. p. 133-140 7753271 (Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016).

Lee, SY & Wu, C-J 2016, Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs. in Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016, 7753271, Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016, Institute of Electrical and Electronics Engineers Inc., pp. 133-140, 34th IEEE International Conference on Computer Design, ICCD 2016, Scottsdale, United States, 10/2/16. https://doi.org/10.1109/ICCD.2016.7753271

Lee SY, Wu CJ. Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs. In Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 133-140. 7753271. (Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016). doi: 10.1109/ICCD.2016.7753271

@inproceedings{3b2929e0c4be46c2b3bd5761ff9b3990,

title = "Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs",

abstract = "The performance of general-purpose graphics processing units (GPGPUs) is often limited by the efficiency of the memory subsystems, particularly the L1 data caches. Because of the massive multithreading computation paradigm, significant memory resource contention and cache thrashing are often observed in GPGPU workloads. This leads to high cache miss rates and substantial pipeline stall time. In order to improve the efficiency of GPU caches, we propose an instruction-aware control loop based adaptive cache bypassing design (Ctrl-C). Ctrl-C applies an instruction-aware algorithm to dynamically identify per-memory instruction cache reuse behavior. Ctrl-C then adopts feedback control loops to bypass memory requests probabilistically in order to protect cache lines with short reuse distances from early eviction. GPGPU-sim simulation based evaluation shows that Ctrl-C improves the performance of cache sensitive GPGPU workloads by 41.5%, leading to higher cache and interconnect bandwidth utilization with only an insignificant 3.5% area overhead.",

author = "Lee, {Shin Ying} and Wu, Carole-Jean",

note = "Funding Information: The authors would like to thank Dr. Amrit Panda and the anonymous reviewers for their insightful feedback. This work is supported in part by the National Science Foundation (Grant #CCF-1618039) and by Science Foundation Arizona under the Bisgrove Early Career Scholarship. The opinions, findings and conclusions or recommendations expressed in this manuscript are those of the authors and do not necessarily reflect the views of the Science Foundation Arizona. Publisher Copyright: {\textcopyright} 2016 IEEE.; 34th IEEE International Conference on Computer Design, ICCD 2016 ; Conference date: 02-10-2016 Through 05-10-2016",

year = "2016",

month = nov,

day = "22",

doi = "10.1109/ICCD.2016.7753271",

language = "English (US)",

series = "Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "133--140",

booktitle = "Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016",

}

TY - GEN

T1 - Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs

T2 - 34th IEEE International Conference on Computer Design, ICCD 2016

AU - Lee, Shin Ying

AU - Wu, Carole-Jean

N1 - Funding Information: The authors would like to thank Dr. Amrit Panda and the anonymous reviewers for their insightful feedback. This work is supported in part by the National Science Foundation (Grant #CCF-1618039) and by Science Foundation Arizona under the Bisgrove Early Career Scholarship. The opinions, findings and conclusions or recommendations expressed in this manuscript are those of the authors and do not necessarily reflect the views of the Science Foundation Arizona. Publisher Copyright: © 2016 IEEE.

PY - 2016/11/22

Y1 - 2016/11/22

AB - The performance of general-purpose graphics processing units (GPGPUs) is often limited by the efficiency of the memory subsystems, particularly the L1 data caches. Because of the massive multithreading computation paradigm, significant memory resource contention and cache thrashing are often observed in GPGPU workloads. This leads to high cache miss rates and substantial pipeline stall time. In order to improve the efficiency of GPU caches, we propose an instruction-aware control loop based adaptive cache bypassing design (Ctrl-C). Ctrl-C applies an instruction-aware algorithm to dynamically identify per-memory instruction cache reuse behavior. Ctrl-C then adopts feedback control loops to bypass memory requests probabilistically in order to protect cache lines with short reuse distances from early eviction. GPGPU-sim simulation based evaluation shows that Ctrl-C improves the performance of cache sensitive GPGPU workloads by 41.5%, leading to higher cache and interconnect bandwidth utilization with only an insignificant 3.5% area overhead.

UR - http://www.scopus.com/inward/record.url?scp=85006826095&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006826095&partnerID=8YFLogxK

U2 - 10.1109/ICCD.2016.7753271

DO - 10.1109/ICCD.2016.7753271

M3 - Conference contribution

AN - SCOPUS:85006826095

T3 - Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016

SP - 133

EP - 140

BT - Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 2 October 2016 through 5 October 2016

ER -
