Adaptive timekeeping replacement

Fine-grained capacity management for shared CMP caches

Carole-Jean Wu, Margaret Martonosi

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

In chip multiprocessors (CMPs), several high-performance cores typically compete for capacity in a shared last-level cache. This causes degraded and unpredictable memory performance for multiprogrammed and parallel workloads. In response, recent schemes apportion cache bandwidth and capacity in ways that offer better aggregate performance for the workloads. These schemes, however, focus primarily on relatively coarse-grained capacity management without concern for operating system process priority levels. In this work, we explore capacity management approaches that are both temporally and spatially more fine-grained than prior work. We also consider operating system priority levels as part of capacity management. We propose a capacity management mechanism based on timekeeping techniques that track the time interval since the last access to cached data. This Adaptive Timekeeping Replacement (ATR) scheme maintains aggregate cache occupancies that reflect the priority and footprint of each application. The key novelties of our work are (1) ATR offers a complete cache capacity management framework taking into account application priorities and memory characteristics, and (2) ATR's fine-grained cache capacity control is demonstrated to be effective and important in improving the performance of parallel workloads in addition to sequential ones. We evaluate our ideas using a full-system simulator and multiprogrammed workloads of both sequential and parallel applications. This is the first detailed study of shared cache capacity management considering thread behaviors in parallel applications. ATR outperforms an unmanaged system by as much as 1.63X and by an average of 1.19X. ATR's fine-grained temporal control is particularly important for parallel applications, which are expected to be increasingly prevalent in years to come.

Original language: English (US)
Article number: 3
Journal: Transactions on Architecture and Code Optimization
Volume: 8
Issue number: 1
DOI: 10.1145/1952998.1953001
State: Published - Apr 2011
Externally published: Yes

Fingerprint

Data storage equipment
Simulators
Bandwidth

Keywords

  • Cache decay
  • Capacity management
  • Shared resource management

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Information Systems

Cite this

Adaptive timekeeping replacement: Fine-grained capacity management for shared CMP caches. / Wu, Carole-Jean; Martonosi, Margaret.

In: Transactions on Architecture and Code Optimization, Vol. 8, No. 1, 3, 04.2011.


@article{1878df5cc43d411ebd15244e0dd5c5b7,
title = "Adaptive timekeeping replacement: Fine-grained capacity management for shared {CMP} caches",
abstract = "In chip multiprocessors (CMPs), several high-performance cores typically compete for capacity in a shared last-level cache. This causes degraded and unpredictable memory performance for multiprogrammed and parallel workloads. In response, recent schemes apportion cache bandwidth and capacity in ways that offer better aggregate performance for the workloads. These schemes, however, focus primarily on relatively coarse-grained capacity management without concern for operating system process priority levels. In this work, we explore capacity management approaches that are both temporally and spatially more fine-grained than prior work. We also consider operating system priority levels as part of capacity management. We propose a capacity management mechanism based on timekeeping techniques that track the time interval since the last access to cached data. This Adaptive Timekeeping Replacement (ATR) scheme maintains aggregate cache occupancies that reflect the priority and footprint of each application. The key novelties of our work are (1) ATR offers a complete cache capacity management framework taking into account application priorities and memory characteristics, and (2) ATR's fine-grained cache capacity control is demonstrated to be effective and important in improving the performance of parallel workloads in addition to sequential ones. We evaluate our ideas using a full-system simulator and multiprogrammed workloads of both sequential and parallel applications. This is the first detailed study of shared cache capacity management considering thread behaviors in parallel applications. ATR outperforms an unmanaged system by as much as 1.63X and by an average of 1.19X. ATR's fine-grained temporal control is particularly important for parallel applications, which are expected to be increasingly prevalent in years to come.",
keywords = "Cache decay, Capacity management, Shared resource management",
author = "Carole-Jean Wu and Margaret Martonosi",
year = "2011",
month = "4",
doi = "10.1145/1952998.1953001",
language = "English (US)",
volume = "8",
journal = "Transactions on Architecture and Code Optimization",
issn = "1544-3566",
publisher = "Association for Computing Machinery (ACM)",
number = "1",
}

TY - JOUR

T1 - Adaptive timekeeping replacement

T2 - Fine-grained capacity management for shared CMP caches

AU - Wu, Carole-Jean

AU - Martonosi, Margaret

PY - 2011/4

Y1 - 2011/4

N2 - In chip multiprocessors (CMPs), several high-performance cores typically compete for capacity in a shared last-level cache. This causes degraded and unpredictable memory performance for multiprogrammed and parallel workloads. In response, recent schemes apportion cache bandwidth and capacity in ways that offer better aggregate performance for the workloads. These schemes, however, focus primarily on relatively coarse-grained capacity management without concern for operating system process priority levels. In this work, we explore capacity management approaches that are both temporally and spatially more fine-grained than prior work. We also consider operating system priority levels as part of capacity management. We propose a capacity management mechanism based on timekeeping techniques that track the time interval since the last access to cached data. This Adaptive Timekeeping Replacement (ATR) scheme maintains aggregate cache occupancies that reflect the priority and footprint of each application. The key novelties of our work are (1) ATR offers a complete cache capacity management framework taking into account application priorities and memory characteristics, and (2) ATR's fine-grained cache capacity control is demonstrated to be effective and important in improving the performance of parallel workloads in addition to sequential ones. We evaluate our ideas using a full-system simulator and multiprogrammed workloads of both sequential and parallel applications. This is the first detailed study of shared cache capacity management considering thread behaviors in parallel applications. ATR outperforms an unmanaged system by as much as 1.63X and by an average of 1.19X. ATR's fine-grained temporal control is particularly important for parallel applications, which are expected to be increasingly prevalent in years to come.

AB - In chip multiprocessors (CMPs), several high-performance cores typically compete for capacity in a shared last-level cache. This causes degraded and unpredictable memory performance for multiprogrammed and parallel workloads. In response, recent schemes apportion cache bandwidth and capacity in ways that offer better aggregate performance for the workloads. These schemes, however, focus primarily on relatively coarse-grained capacity management without concern for operating system process priority levels. In this work, we explore capacity management approaches that are both temporally and spatially more fine-grained than prior work. We also consider operating system priority levels as part of capacity management. We propose a capacity management mechanism based on timekeeping techniques that track the time interval since the last access to cached data. This Adaptive Timekeeping Replacement (ATR) scheme maintains aggregate cache occupancies that reflect the priority and footprint of each application. The key novelties of our work are (1) ATR offers a complete cache capacity management framework taking into account application priorities and memory characteristics, and (2) ATR's fine-grained cache capacity control is demonstrated to be effective and important in improving the performance of parallel workloads in addition to sequential ones. We evaluate our ideas using a full-system simulator and multiprogrammed workloads of both sequential and parallel applications. This is the first detailed study of shared cache capacity management considering thread behaviors in parallel applications. ATR outperforms an unmanaged system by as much as 1.63X and by an average of 1.19X. ATR's fine-grained temporal control is particularly important for parallel applications, which are expected to be increasingly prevalent in years to come.

KW - Cache decay

KW - Capacity management

KW - Shared resource management

UR - http://www.scopus.com/inward/record.url?scp=79955678335&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79955678335&partnerID=8YFLogxK

U2 - 10.1145/1952998.1953001

DO - 10.1145/1952998.1953001

M3 - Article

VL - 8

JO - Transactions on Architecture and Code Optimization

JF - Transactions on Architecture and Code Optimization

SN - 1544-3566

IS - 1

M1 - 3

ER -