TY - JOUR
T1 - Return data interleaving for multi-channel embedded CMPs systems
AU - Hong, Fei
AU - Shrivastava, Aviral
AU - Lee, Jongeun
N1 - Funding Information:
Manuscript received July 26, 2010; revised February 04, 2011; accepted March 12, 2011. Date of publication June 30, 2011; date of current version June 01, 2012. This work was supported in part by the National Science Foundation (NSF) under Grants CCF-0916652 and CCF-1055094 (CAREER), and by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology, under Grant 2010-0011534.
PY - 2012
Y1 - 2012
N2 - Multi-channel memory subsystems are an efficient way to satisfy the high volume of memory requests from chip multi-core processors (CMPs). At the same time, the imbalance between memory bandwidth and bus performance opens up new possibilities for optimizing how return data is handled before it is sent to the bus. This paper presents a new memory controller design for embedded CMP systems that schedules how return data in the return buffer is sent back to the bus. Our scheduling policy, called return data interleaving (RDI), interleaves the return data of each request in a round-robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMP using the M5 simulator for CMP simulation and DRAMsim for memory subsystem simulation, and examine the performance of MiBench and SPEC2000 benchmarks. Simulation results show that, for memory-bound benchmarks running on CMP systems with 6 to 16 cores, RDI can improve execution time by 11% on average and by up to 16.9%.
AB - Multi-channel memory subsystems are an efficient way to satisfy the high volume of memory requests from chip multi-core processors (CMPs). At the same time, the imbalance between memory bandwidth and bus performance opens up new possibilities for optimizing how return data is handled before it is sent to the bus. This paper presents a new memory controller design for embedded CMP systems that schedules how return data in the return buffer is sent back to the bus. Our scheduling policy, called return data interleaving (RDI), interleaves the return data of each request in a round-robin manner. Further, for each request, it sends the critical word first. To evaluate our technique, we model an Intel XScale-based CMP using the M5 simulator for CMP simulation and DRAMsim for memory subsystem simulation, and examine the performance of MiBench and SPEC2000 benchmarks. Simulation results show that, for memory-bound benchmarks running on CMP systems with 6 to 16 cores, RDI can improve execution time by 11% on average and by up to 16.9%.
KW - Chip multi-core processor
KW - multi-channel memory
KW - return data interleaving (RDI)
UR - http://www.scopus.com/inward/record.url?scp=84862233003&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862233003&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2011.2157368
DO - 10.1109/TVLSI.2011.2157368
M3 - Article
AN - SCOPUS:84862233003
SN - 1063-8210
VL - 20
SP - 1351
EP - 1354
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 7
M1 - 5936661
ER -