Reducing energy and increasing performance with traffic optimization in many-core systems

George B.P. Bezerra; Stephanie Forrest; Payman Zarkesh-Ha

doi:10.1109/SLIP.2011.6135429

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

As the number of cores on a die continues to increase, it is necessary to optimize the traffic patterns of applications in order to minimize power consumption and maximize performance. We present a new approach for traffic optimization in many-core systems, which targets communication locality and load-balancing. Our approach works by mapping memory blocks to physical locations on the chip that are close to cores that access them, and by enforcing load balance by limiting the number of blocks mapped to each location. Communication locality reduces the average distance traveled by packets, which minimizes power and increases performance. Load-balancing avoids hotspots and improves cache utilization. Rather than treating every application in the same way, our method uses available information to produce mappings that are specially tuned for individual applications. Simulations performed on a 64-core system show a reduction in dynamic energy consumption of up to 81.6% and of 45.5% on average, and gains in performance of up to 13.2% on scientific benchmarks.
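
The mapping idea in the abstract (place each memory block near the cores that access it, and cap the number of blocks per location) can be sketched roughly as below. This is a minimal illustration and not the authors' algorithm. It assumes a square 2D mesh with one location per core, Manhattan hop distance as the cost, and a single greedy pass. The names `map_blocks`, `accesses`, `mesh_side`, and `capacity` are hypothetical.

```python
from collections import defaultdict


def manhattan(a, b):
    """Hop distance between two (x, y) tiles on a 2D mesh (assumed topology)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def map_blocks(accesses, mesh_side, capacity):
    """Greedy, capacity-constrained placement of memory blocks.

    accesses:  dict mapping block id -> {core index: access count}
    mesh_side: side length of the square mesh (8 gives 64 locations)
    capacity:  maximum number of blocks allowed per location
    Returns a dict mapping block id -> (x, y) location.
    """
    # Core index i is assumed to sit at tile (i % mesh_side, i // mesh_side).
    tiles = [(x, y) for y in range(mesh_side) for x in range(mesh_side)]

    def cost(block, tile):
        # Access-weighted hop count if `block` were stored at `tile`.
        return sum(n * manhattan(tiles[core], tile)
                   for core, n in accesses[block].items())

    load = defaultdict(int)
    mapping = {}
    # Place the most heavily accessed blocks first so they get the best spots.
    for block in sorted(accesses, key=lambda b: -sum(accesses[b].values())):
        candidates = [t for t in tiles if load[t] < capacity]
        if not candidates:
            raise ValueError("not enough capacity for all blocks")
        best = min(candidates, key=lambda t: cost(block, t))
        mapping[block] = best
        load[best] += 1  # the capacity limit is what enforces load balance
    return mapping


if __name__ == "__main__":
    demo = {"a": {0: 10}, "b": {0: 5}, "c": {3: 1}}
    print(map_blocks(demo, mesh_side=2, capacity=1))
```

In the demo, blocks `a` and `b` are both accessed only by core 0, but with a capacity of one block per location only the hotter block `a` is placed at core 0's tile and `b` goes to a neighboring tile. This is the locality versus load-balance trade-off the abstract describes.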

Original language	English (US)
Title of host publication	2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011
DOIs	https://doi.org/10.1109/SLIP.2011.6135429
State	Published - 2011
Externally published	Yes
Event	2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011 - San Diego, CA, United States
Duration	Jun 5 2011 → Jun 5 2011

Publication series

Name	International Workshop on System Level Interconnect Prediction, SLIP

Other

Other	2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011
Country/Territory	United States
City	San Diego, CA
Period	6/5/11 → 6/5/11

Keywords

Traffic optimization
communication graph
communication locality
load-balancing
many-core
memory-block mapping
non-uniform cache access

ASJC Scopus subject areas

Hardware and Architecture
Electrical and Electronic Engineering
Computer Science Applications
Applied Mathematics

Access to Document

10.1109/SLIP.2011.6135429

Cite this

Reducing energy and increasing performance with traffic optimization in many-core systems. / Bezerra, George B.P.; Forrest, Stephanie; Zarkesh-Ha, Payman.
2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011. 2011. 6135429 (International Workshop on System Level Interconnect Prediction, SLIP).

Bezerra, GBP, Forrest, S & Zarkesh-Ha, P 2011, Reducing energy and increasing performance with traffic optimization in many-core systems. in 2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011., 6135429, International Workshop on System Level Interconnect Prediction, SLIP, 2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011, San Diego, CA, United States, 6/5/11. https://doi.org/10.1109/SLIP.2011.6135429

@inproceedings{b7a3b78567b4449790aafd89916a5d68,
title = "Reducing energy and increasing performance with traffic optimization in many-core systems",
abstract = "As the number of cores on a die continues to increase, it is necessary to optimize the traffic patterns of applications in order to minimize power consumption and maximize performance. We present a new approach for traffic optimization in many-core systems, which targets communication locality and load-balancing. Our approach works by mapping memory blocks to physical locations on the chip that are close to cores that access them, and by enforcing load balance by limiting the number of blocks mapped to each location. Communication locality reduces the average distance traveled by packets, which minimizes power and increases performance. Load-balancing avoids hotspots and improves cache utilization. Rather than treating every application in the same way, our method uses available information to produce mappings that are specially tuned for individual applications. Simulations performed on a 64-core system show a reduction in dynamic energy consumption of up to 81.6% and of 45.5% on average, and gains in performance of up to 13.2% on scientific benchmarks.",
keywords = "Traffic optimization, communication graph, communication locality, load-balancing, many-core, memory-block mapping, non-uniform cache access",
author = "Bezerra, {George B.P.} and Stephanie Forrest and Payman Zarkesh-Ha",
year = "2011",
doi = "10.1109/SLIP.2011.6135429",
language = "English (US)",
isbn = "9781457712401",
series = "International Workshop on System Level Interconnect Prediction, SLIP",
booktitle = "2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011",
note = "2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011 ; Conference date: 05-06-2011 Through 05-06-2011",
}

TY - GEN
T1 - Reducing energy and increasing performance with traffic optimization in many-core systems
AU - Bezerra, George B.P.
AU - Forrest, Stephanie
AU - Zarkesh-Ha, Payman
PY - 2011
Y1 - 2011
N2 - As the number of cores on a die continues to increase, it is necessary to optimize the traffic patterns of applications in order to minimize power consumption and maximize performance. We present a new approach for traffic optimization in many-core systems, which targets communication locality and load-balancing. Our approach works by mapping memory blocks to physical locations on the chip that are close to cores that access them, and by enforcing load balance by limiting the number of blocks mapped to each location. Communication locality reduces the average distance traveled by packets, which minimizes power and increases performance. Load-balancing avoids hotspots and improves cache utilization. Rather than treating every application in the same way, our method uses available information to produce mappings that are specially tuned for individual applications. Simulations performed on a 64-core system show a reduction in dynamic energy consumption of up to 81.6% and of 45.5% on average, and gains in performance of up to 13.2% on scientific benchmarks.
AB - As the number of cores on a die continues to increase, it is necessary to optimize the traffic patterns of applications in order to minimize power consumption and maximize performance. We present a new approach for traffic optimization in many-core systems, which targets communication locality and load-balancing. Our approach works by mapping memory blocks to physical locations on the chip that are close to cores that access them, and by enforcing load balance by limiting the number of blocks mapped to each location. Communication locality reduces the average distance traveled by packets, which minimizes power and increases performance. Load-balancing avoids hotspots and improves cache utilization. Rather than treating every application in the same way, our method uses available information to produce mappings that are specially tuned for individual applications. Simulations performed on a 64-core system show a reduction in dynamic energy consumption of up to 81.6% and of 45.5% on average, and gains in performance of up to 13.2% on scientific benchmarks.
KW - Traffic optimization
KW - communication graph
KW - communication locality
KW - load-balancing
KW - many-core
KW - memory-block mapping
KW - non-uniform cache access
UR - http://www.scopus.com/inward/record.url?scp=84857205032&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857205032&partnerID=8YFLogxK
U2 - 10.1109/SLIP.2011.6135429
DO - 10.1109/SLIP.2011.6135429
M3 - Conference contribution
AN - SCOPUS:84857205032
SN - 9781457712401
T3 - International Workshop on System Level Interconnect Prediction, SLIP
BT - 2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011
T2 - 2011 13th International Workshop on System Level Interconnect Prediction, SLIP 2011
Y2 - 5 June 2011 through 5 June 2011
ER -
