High throughput data mapping for coarse-grained reconfigurable architectures

Yongjoo Kim, Jongeun Lee, Aviral Shrivastava, Jonghee W. Yoon, Doosan Cho, Yunheung Paek

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

Coarse-grained reconfigurable arrays (CGRAs) are a very promising platform, providing both power efficiency of up to 10-100 MOps/mW and software programmability. However, this promise critically hinges on the effectiveness of application mapping onto CGRA platforms. While previous solutions have greatly improved computation speed, they have largely ignored the impact of the local memory architecture on the achievable power and performance. This paper motivates the need for memory-aware application mapping for CGRAs and proposes an effective mapping solution that considers the effects of various memory architecture parameters, including the number of banks, the local memory size, and the communication bandwidth between the local memory and the external main memory. Further, we propose efficient methods to handle dependent data on a double-buffering local memory, which is necessary for recurrent loops. Our proposed solution achieves a 59% reduction in the energy-delay product, which factors into about 47% and 22% reductions in energy consumption and runtime, respectively, compared to memory-unaware mapping for realistic local memory architectures. We also show that our scheme scales across a range of applications and memory parameters, and that the runtime overhead of handling recurrent loops with our proposed methods can be less than 1%.
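
As a quick sanity check on how the quoted figures relate, the sketch below assumes the usual definition of the energy-delay product (energy multiplied by runtime) and takes the abstract's rounded reductions of 47% and 22% as inputs. The function name `edp_reduction` is hypothetical and is not taken from the paper.

```python
def edp_reduction(energy_reduction: float, runtime_reduction: float) -> float:
    """Fractional reduction in energy-delay product (EDP = energy * runtime).

    Assumption: the two reductions combine multiplicatively, which follows
    from defining EDP as the product of energy and runtime.
    """
    remaining_energy = 1.0 - energy_reduction    # fraction of energy still used
    remaining_runtime = 1.0 - runtime_reduction  # fraction of runtime still needed
    return 1.0 - remaining_energy * remaining_runtime


# Rounded figures quoted in the abstract: 47% energy, 22% runtime.
reduction = edp_reduction(0.47, 0.22)
print(f"EDP reduction: {reduction:.1%}")
```

With these rounded inputs the remaining product is 0.53 × 0.78 = 0.4134, so the reduction comes out near 58.7%, in line with the reported 59% given that the abstract states the component figures as approximate.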

Original language: English (US)
Article number: 6046176
Pages (from-to): 1599-1609
Number of pages: 11
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 30
Issue number: 11
DOIs: 10.1109/TCAD.2011.2161217
State: Published - Nov 2011

Fingerprint

Reconfigurable architectures
Throughput
Data storage equipment
Memory architecture
Hinges
Energy utilization
Bandwidth
Communication

Keywords

  • Array mapping
  • bank conflict
  • coarse-grained reconfigurable architecture
  • compilation
  • multi-bank memory

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

High throughput data mapping for coarse-grained reconfigurable architectures. / Kim, Yongjoo; Lee, Jongeun; Shrivastava, Aviral; Yoon, Jonghee W.; Cho, Doosan; Paek, Yunheung.

In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 30, No. 11, 6046176, 11.2011, p. 1599-1609.

@article{bc179d602db24b99bcc70e08f6ada6ce,
title = "High throughput data mapping for coarse-grained reconfigurable architectures",
abstract = "Coarse-grained reconfigurable arrays (CGRAs) are a very promising platform, providing both up to 10-100 MOps/mW of power efficiency and software programmability. However, this promise of CGRAs critically hinges on the effectiveness of application mapping onto CGRA platforms. While previous solutions have greatly improved the computation speed, they have largely ignored the impact of the local memory architecture on the achievable power and performance. This paper motivates the need for memory-aware application mapping for CGRAs, and proposes an effective solution for application mapping that considers the effects of various memory architecture parameters including the number of banks, local memory size, and the communication bandwidth between the local memory and the external main memory. Further we propose efficient methods to handle dependent data on a double-buffering local memory, which is necessary for recurrent loops. Our proposed solution achieves 59{\%} reduction in the energy-delay product, which factors into about 47{\%} and 22{\%} reduction in the energy consumption and runtime, respectively, as compared to memory-unaware mapping for realistic local memory architectures. We also show that our scheme scales across a range of applications and memory parameters, and the runtime overhead of handling recurrent loops by our proposed methods can be less than 1{\%}.",
keywords = "Array mapping, bank conflict, coarse-grained reconfigurable architecture, compilation, multi-bank memory",
author = "Yongjoo Kim and Jongeun Lee and Aviral Shrivastava and Yoon, {Jonghee W.} and Doosan Cho and Yunheung Paek",
year = "2011",
month = "11",
doi = "10.1109/TCAD.2011.2161217",
language = "English (US)",
volume = "30",
pages = "1599--1609",
journal = "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems",
issn = "0278-0070",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "11",

}

TY - JOUR

T1 - High throughput data mapping for coarse-grained reconfigurable architectures

AU - Kim, Yongjoo

AU - Lee, Jongeun

AU - Shrivastava, Aviral

AU - Yoon, Jonghee W.

AU - Cho, Doosan

AU - Paek, Yunheung

PY - 2011/11

Y1 - 2011/11

AB - Coarse-grained reconfigurable arrays (CGRAs) are a very promising platform, providing both up to 10-100 MOps/mW of power efficiency and software programmability. However, this promise of CGRAs critically hinges on the effectiveness of application mapping onto CGRA platforms. While previous solutions have greatly improved the computation speed, they have largely ignored the impact of the local memory architecture on the achievable power and performance. This paper motivates the need for memory-aware application mapping for CGRAs, and proposes an effective solution for application mapping that considers the effects of various memory architecture parameters including the number of banks, local memory size, and the communication bandwidth between the local memory and the external main memory. Further we propose efficient methods to handle dependent data on a double-buffering local memory, which is necessary for recurrent loops. Our proposed solution achieves 59% reduction in the energy-delay product, which factors into about 47% and 22% reduction in the energy consumption and runtime, respectively, as compared to memory-unaware mapping for realistic local memory architectures. We also show that our scheme scales across a range of applications and memory parameters, and the runtime overhead of handling recurrent loops by our proposed methods can be less than 1%.

KW - Array mapping

KW - bank conflict

KW - coarse-grained reconfigurable architecture

KW - compilation

KW - multi-bank memory

UR - http://www.scopus.com/inward/record.url?scp=80054828144&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80054828144&partnerID=8YFLogxK

U2 - 10.1109/TCAD.2011.2161217

DO - 10.1109/TCAD.2011.2161217

M3 - Article

VL - 30

SP - 1599

EP - 1609

JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

SN - 0278-0070

IS - 11

M1 - 6046176

ER -