Return data interleaving for multi-channel embedded CMPs systems

Fei Hong; Aviral Shrivastava; Jongeun Lee

doi:10.1109/TVLSI.2011.2157368

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Multi-channel memory subsystems are an efficient way to satisfy the high volume of memory requests from chip multi-core processors (CMPs). At the same time, the imbalance between memory bandwidth and bus performance opens up new opportunities to optimize how return data is handled before it is sent to the bus. This paper presents a new memory controller design for embedded CMP systems that targets the point at which return data is sent from the return buffer back to the bus. Our scheduling policy, called return data interleaving (RDI), interleaves the return data of each request in a round-robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMP using the M5 simulator for CMP simulation and DRAMsim for memory subsystem simulation, and examine the performance of MiBench and SPEC2000 benchmarks. Simulation results show that for memory-bound benchmarks running on CMP systems with 6 to 16 cores, RDI can improve execution time by 11% on average and by up to 16.9%.
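The scheduling idea in the abstract can be illustrated with a small sketch. This is not the authors' controller design: the abstract does not give the interleaving granularity or the order of the non-critical words, so the code below assumes one word per request per turn and a wrap-around order after the critical word. The names `critical_word_first` and `rdi_schedule` are hypothetical.

```python
from collections import deque
from typing import List, Tuple


def critical_word_first(num_words: int, critical: int) -> List[int]:
    """Word order for one request: the critical word first, then the
    remaining words in wrap-around order (an assumed ordering)."""
    return [(critical + i) % num_words for i in range(num_words)]


def rdi_schedule(requests: List[Tuple[str, int, int]]) -> List[Tuple[str, int]]:
    """Interleave return data of pending requests in round-robin order.

    Each request is (request id, words per line, critical word index).
    Returns the bus transfer order as (request id, word index) pairs.
    Assumption: one word is sent per request per round-robin turn.
    """
    pending = deque(
        (rid, deque(critical_word_first(n, crit))) for rid, n, crit in requests
    )
    order: List[Tuple[str, int]] = []
    while pending:
        rid, words = pending.popleft()
        order.append((rid, words.popleft()))
        if words:  # request still has data in the return buffer
            pending.append((rid, words))
    return order


if __name__ == "__main__":
    # Two requests of four words each; A needs word 2 first, B needs word 0.
    for rid, word in rdi_schedule([("A", 4, 2), ("B", 4, 0)]):
        print(rid, word)
```

In this sketch every pending request receives its critical word within the first round of transfers, instead of waiting for all earlier requests to drain completely.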

Original language	English (US)
Article number	5936661
Pages (from-to)	1351-1354
Number of pages	4
Journal	IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume	20
Issue number	7
DOIs	https://doi.org/10.1109/TVLSI.2011.2157368
State	Published - 2012

Keywords

Chip multi-core processor
multi-channel memory
return data interleaving (RDI)

ASJC Scopus subject areas

Software
Hardware and Architecture
Electrical and Electronic Engineering

Cite this

@article{85e32b79c50943df8b835122fedaeab5,

title = "Return data interleaving for multi-channel embedded CMPs systems",

abstract = "Using multi-channel memory subsystems is an efficient way of satisfying high volume memory requests from CMPs. At the same time, the imbalance between memory bandwidth and bus performance opens up new possibility of optimization before they are sent to bus. This paper presents a new memory controller design for embedded CMPs systems when the return data from the return buffer is sent back to bus. Our scheduling policy, called return data interleaving (RDI) interleaves the return data of each request in a round robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMPs using M5 simulator for CMPs simulation and DRAMsim for memory subsystem simulation and examine the performance of MiBench and SPEC2000 benchmarks. Simulation results show that for memory-bound benchmarks running on the CMPs systems with the number of cores from 6 to 16, RDI can improve the execution time by average 11% and up to 16.9%.",

keywords = "Chip multi-core processor, multi-channel memory, return data interleaving (RDI)",

author = "Fei Hong and Aviral Shrivastava and Jongeun Lee",

note = "Funding Information: Manuscript received July 26, 2010; revised February 04, 2011; accepted March 12, 2011. Date of publication June 30, 2011; date of current version June 01, 2012. This work was supported in part by the National Science Foundation (NSF) under Grants CCF-0916652, CCF-1055094 (CAREER) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology, under Grant 2010-0011534.",

year = "2012",

doi = "10.1109/TVLSI.2011.2157368",

language = "English (US)",

volume = "20",

pages = "1351--1354",

journal = "IEEE Transactions on Very Large Scale Integration (VLSI) Systems",

issn = "1063-8210",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "7",

}

TY - JOUR

T1 - Return data interleaving for multi-channel embedded CMPs systems

AU - Hong, Fei

AU - Shrivastava, Aviral

AU - Lee, Jongeun

N1 - Funding Information: Manuscript received July 26, 2010; revised February 04, 2011; accepted March 12, 2011. Date of publication June 30, 2011; date of current version June 01, 2012. This work was supported in part by the National Science Foundation (NSF) under Grants CCF-0916652, CCF-1055094 (CAREER) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology, under Grant 2010-0011534.

PY - 2012

Y1 - 2012

N2 - Using multi-channel memory subsystems is an efficient way of satisfying high volume memory requests from CMPs. At the same time, the imbalance between memory bandwidth and bus performance opens up new possibility of optimization before they are sent to bus. This paper presents a new memory controller design for embedded CMPs systems when the return data from the return buffer is sent back to bus. Our scheduling policy, called return data interleaving (RDI) interleaves the return data of each request in a round robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMPs using M5 simulator for CMPs simulation and DRAMsim for memory subsystem simulation and examine the performance of MiBench and SPEC2000 benchmarks. Simulation results show that for memory-bound benchmarks running on the CMPs systems with the number of cores from 6 to 16, RDI can improve the execution time by average 11% and up to 16.9%.

AB - Using multi-channel memory subsystems is an efficient way of satisfying high volume memory requests from CMPs. At the same time, the imbalance between memory bandwidth and bus performance opens up new possibility of optimization before they are sent to bus. This paper presents a new memory controller design for embedded CMPs systems when the return data from the return buffer is sent back to bus. Our scheduling policy, called return data interleaving (RDI) interleaves the return data of each request in a round robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMPs using M5 simulator for CMPs simulation and DRAMsim for memory subsystem simulation and examine the performance of MiBench and SPEC2000 benchmarks. Simulation results show that for memory-bound benchmarks running on the CMPs systems with the number of cores from 6 to 16, RDI can improve the execution time by average 11% and up to 16.9%.

KW - Chip multi-core processor

KW - multi-channel memory

KW - return data interleaving (RDI)

UR - http://www.scopus.com/inward/record.url?scp=84862233003&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84862233003&partnerID=8YFLogxK

U2 - 10.1109/TVLSI.2011.2157368

DO - 10.1109/TVLSI.2011.2157368

M3 - Article

AN - SCOPUS:84862233003

SN - 1063-8210

VL - 20

SP - 1351

EP - 1354

JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems

JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems

IS - 7

M1 - 5936661

ER -
