SHiP: Signature-based hit predictor for high performance caching

Carole-Jean Wu, Aamer Jaleel, Will Hasenplaugh, Margaret Martonosi, Simon C. Steely, Joel Emer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

119 Citations (Scopus)

Abstract

The shared last-level caches in CMPs play an important role in improving application performance and reducing off-chip memory bandwidth requirements. In order to use LLCs more efficiently, recent research has shown that changing the re-reference prediction on cache insertions and cache hits can significantly improve cache performance. A fundamental challenge, however, is how to best predict the re-reference pattern of an incoming cache line. This paper shows that cache performance can be improved by correlating the re-reference behavior of a cache line with a unique signature. We investigate the use of memory region, program counter, and instruction sequence history based signatures. We also propose a novel Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature. Overall, we find that SHiP offers substantial improvements over the baseline LRU replacement and state-of-the-art replacement policy proposals. On average, SHiP improves sequential and multiprogrammed application performance by roughly 10% and 12% over LRU replacement, respectively. Compared to recent replacement policy proposals such as Seg-LRU and SDBP, SHiP nearly doubles the performance gains while requiring less hardware overhead.
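
The mechanism the abstract describes, learning per-signature re-reference behavior and using it to choose the insertion prediction, can be illustrated with a minimal sketch. This is not the authors' implementation. The table size, counter width, the PC-modulo signature hash, the training rule, and the RRIP-style insertion values are all illustrative assumptions that this page does not specify.

```python
from dataclasses import dataclass

MAX_RRPV = 3        # assumed 2-bit re-reference prediction value
COUNTER_MAX = 7     # assumed 3-bit saturating counter
TABLE_SIZE = 1024   # assumed number of predictor entries


@dataclass
class Line:
    tag: int
    signature: int
    rrpv: int
    reused: bool = False


class SignaturePredictor:
    """Saturating counters indexed by signature (hypothetical layout)."""

    def __init__(self):
        self.counters = [1] * TABLE_SIZE

    def train(self, signature, was_reused):
        c = self.counters[signature]
        if was_reused:
            self.counters[signature] = min(c + 1, COUNTER_MAX)
        else:
            self.counters[signature] = max(c - 1, 0)

    def predicts_reuse(self, signature):
        return self.counters[signature] > 0


class CacheSet:
    """One cache set whose insertion position depends on the predictor."""

    def __init__(self, ways, predictor):
        self.ways = ways
        self.lines = []
        self.predictor = predictor

    def access(self, tag, pc):
        """Return True on a hit, False on a miss (the line is then inserted)."""
        signature = pc % TABLE_SIZE  # hypothetical program-counter signature
        for line in self.lines:
            if line.tag == tag:
                line.rrpv = 0
                line.reused = True
                self.predictor.train(line.signature, True)
                return True
        if len(self.lines) == self.ways:
            self._evict()
        # Signatures with no observed reuse are inserted as "distant".
        if self.predictor.predicts_reuse(signature):
            rrpv = MAX_RRPV - 1
        else:
            rrpv = MAX_RRPV
        self.lines.append(Line(tag, signature, rrpv))
        return False

    def _evict(self):
        while True:
            for i, line in enumerate(self.lines):
                if line.rrpv == MAX_RRPV:
                    if not line.reused:
                        # Evicted without a hit: weaken this signature.
                        self.predictor.train(line.signature, False)
                    del self.lines[i]
                    return
            # No line is marked distant yet, so age every line and retry.
            for line in self.lines:
                line.rrpv += 1
```

In this sketch, a program counter that only streams through never-reused lines drives its counter to zero, so its later insertions become immediate eviction candidates, while lines from signatures that do see hits are retained longer.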

Original language: English (US)
Title of host publication: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Pages: 430-441
Number of pages: 12
DOIs: https://doi.org/10.1145/2155620.2155671
State: Published - 2011
Externally published: Yes
Event: 44th Annual IEEE/ACM Symposium on Microarchitecture, MICRO 44 - Porto Alegre, RS, Brazil
Duration: Dec 4 2011 to Dec 7 2011

Other

Other: 44th Annual IEEE/ACM Symposium on Microarchitecture, MICRO 44
Country: Brazil
City: Porto Alegre, RS
Period: 12/4/11 to 12/7/11

Fingerprint

Data storage equipment
Hardware
Bandwidth

Keywords

  • replacement
  • reuse distance prediction
  • shared cache

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Wu, C-J., Jaleel, A., Hasenplaugh, W., Martonosi, M., Steely, S. C., & Emer, J. (2011). SHiP: Signature-based hit predictor for high performance caching. In Proceedings of the Annual International Symposium on Microarchitecture, MICRO (pp. 430-441) https://doi.org/10.1145/2155620.2155671

SHiP: Signature-based hit predictor for high performance caching. / Wu, Carole-Jean; Jaleel, Aamer; Hasenplaugh, Will; Martonosi, Margaret; Steely, Simon C.; Emer, Joel.

Proceedings of the Annual International Symposium on Microarchitecture, MICRO. 2011. p. 430-441.

Wu, C-J, Jaleel, A, Hasenplaugh, W, Martonosi, M, Steely, SC & Emer, J 2011, SHiP: Signature-based hit predictor for high performance caching. in Proceedings of the Annual International Symposium on Microarchitecture, MICRO. pp. 430-441, 44th Annual IEEE/ACM Symposium on Microarchitecture, MICRO 44, Porto Alegre, RS, Brazil, 12/4/11. https://doi.org/10.1145/2155620.2155671
Wu C-J, Jaleel A, Hasenplaugh W, Martonosi M, Steely SC, Emer J. SHiP: Signature-based hit predictor for high performance caching. In Proceedings of the Annual International Symposium on Microarchitecture, MICRO. 2011. p. 430-441 https://doi.org/10.1145/2155620.2155671
Wu, Carole-Jean; Jaleel, Aamer; Hasenplaugh, Will; Martonosi, Margaret; Steely, Simon C.; Emer, Joel. / SHiP: Signature-based hit predictor for high performance caching. Proceedings of the Annual International Symposium on Microarchitecture, MICRO. 2011. pp. 430-441
@inproceedings{790d210691364a95a1a536204105077b,
title = "{SHiP}: Signature-based hit predictor for high performance caching",
abstract = "The shared last-level caches in CMPs play an important role in improving application performance and reducing off-chip memory bandwidth requirements. In order to use LLCs more efficiently, recent research has shown that changing the re-reference prediction on cache insertions and cache hits can significantly improve cache performance. A fundamental challenge, however, is how to best predict the re-reference pattern of an incoming cache line. This paper shows that cache performance can be improved by correlating the re-reference behavior of a cache line with a unique signature. We investigate the use of memory region, program counter, and instruction sequence history based signatures. We also propose a novel Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature. Overall, we find that SHiP offers substantial improvements over the baseline LRU replacement and state-of-the-art replacement policy proposals. On average, SHiP improves sequential and multiprogrammed application performance by roughly 10{\%} and 12{\%} over LRU replacement, respectively. Compared to recent replacement policy proposals such as Seg-LRU and SDBP, SHiP nearly doubles the performance gains while requiring less hardware overhead.",
keywords = "replacement, reuse distance prediction, shared cache",
author = "Carole-Jean Wu and Aamer Jaleel and Will Hasenplaugh and Margaret Martonosi and Steely, {Simon C.} and Joel Emer",
year = "2011",
doi = "10.1145/2155620.2155671",
language = "English (US)",
isbn = "9781450310536",
pages = "430--441",
booktitle = "Proceedings of the Annual International Symposium on Microarchitecture, MICRO",
}

TY - CONF

T1 - SHiP: Signature-based hit predictor for high performance caching

AU - Wu, Carole-Jean

AU - Jaleel, Aamer

AU - Hasenplaugh, Will

AU - Martonosi, Margaret

AU - Steely, Simon C.

AU - Emer, Joel

PY - 2011

Y1 - 2011

AB - The shared last-level caches in CMPs play an important role in improving application performance and reducing off-chip memory bandwidth requirements. In order to use LLCs more efficiently, recent research has shown that changing the re-reference prediction on cache insertions and cache hits can significantly improve cache performance. A fundamental challenge, however, is how to best predict the re-reference pattern of an incoming cache line. This paper shows that cache performance can be improved by correlating the re-reference behavior of a cache line with a unique signature. We investigate the use of memory region, program counter, and instruction sequence history based signatures. We also propose a novel Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature. Overall, we find that SHiP offers substantial improvements over the baseline LRU replacement and state-of-the-art replacement policy proposals. On average, SHiP improves sequential and multiprogrammed application performance by roughly 10% and 12% over LRU replacement, respectively. Compared to recent replacement policy proposals such as Seg-LRU and SDBP, SHiP nearly doubles the performance gains while requiring less hardware overhead.

KW - replacement

KW - reuse distance prediction

KW - shared cache

UR - http://www.scopus.com/inward/record.url?scp=84863389330&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863389330&partnerID=8YFLogxK

U2 - 10.1145/2155620.2155671

DO - 10.1145/2155620.2155671

M3 - Conference contribution

AN - SCOPUS:84863389330

SN - 9781450310536

SP - 430

EP - 441

BT - Proceedings of the Annual International Symposium on Microarchitecture, MICRO

ER -