Operation and data mapping for CGRAs with multi-bank memory

Yongjoo Kim; Jongeun Lee; Aviral Shrivastava; Yunheung Paek

doi:10.1145/1755888.1755892

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Coarse Grain Reconfigurable Architectures (CGRAs) promise high performance at high power efficiency. They fulfil this promise by keeping the hardware extremely simple and moving the complexity to application mapping. One major challenge comes in the form of data mapping. For reasons of power efficiency and complexity, CGRAs use multi-bank local memory, and each row of processing elements (PEs) shares memory access. So that each row of PEs can access any memory bank, a hardware arbiter sits between the memory requests generated by the PEs and the banks of the local memory. However, a fundamental restriction remains: a bank cannot be accessed by two different PEs at the same time. We propose to meet this challenge by mapping application operations onto PEs and data into memory banks in a way that avoids such conflicts. Our experimental results on kernels from multimedia benchmarks demonstrate that our local memory-aware compilation approach can generate mappings that perform up to 40% better (17.3% on average) than those of a memory-unaware scheduler.
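
The bank-conflict restriction described in the abstract can be illustrated with a minimal sketch. This is not the authors' mapping algorithm; it only shows the constraint itself and a simple greedy bank assignment that respects it. The schedule format, the PE and array names, and the functions `find_bank_conflicts` and `assign_banks_greedy` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def find_bank_conflicts(schedule, bank_of):
    """Return (cycle, bank, pes) for each cycle where 2+ distinct PEs hit one bank.

    schedule: dict mapping cycle -> list of (pe, array) memory accesses (hypothetical format).
    bank_of:  dict mapping array name -> bank index.
    """
    conflicts = []
    for cycle, accesses in sorted(schedule.items()):
        pes_per_bank = defaultdict(set)
        for pe, array in accesses:
            pes_per_bank[bank_of[array]].add(pe)
        for bank, pes in sorted(pes_per_bank.items()):
            if len(pes) > 1:
                conflicts.append((cycle, bank, sorted(pes)))
    return conflicts


def assign_banks_greedy(schedule, num_banks):
    """Greedy colouring: arrays touched in the same cycle by different PEs
    are placed in different banks. Returns None if num_banks is too small
    for this greedy order (an illustrative heuristic, not an optimal solver).
    """
    clashes = defaultdict(set)
    arrays = set()
    for accesses in schedule.values():
        for _, array in accesses:
            arrays.add(array)
        for (pe_a, arr_a), (pe_b, arr_b) in combinations(accesses, 2):
            if pe_a != pe_b and arr_a != arr_b:
                clashes[arr_a].add(arr_b)
                clashes[arr_b].add(arr_a)

    bank_of = {}
    # Place the most constrained arrays first; ties are broken by name.
    for array in sorted(arrays, key=lambda a: (-len(clashes[a]), a)):
        used = {bank_of[n] for n in clashes[array] if n in bank_of}
        free = [b for b in range(num_banks) if b not in used]
        if not free:
            return None
        bank_of[array] = free[0]
    return bank_of


# Hypothetical two-cycle schedule: two PEs issue one access each per cycle.
schedule = {
    0: [("PE0", "A"), ("PE4", "B")],
    1: [("PE0", "C"), ("PE4", "A")],
}
naive = {"A": 0, "B": 0, "C": 1}  # memory-unaware placement
print(find_bank_conflicts(schedule, naive))
aware = assign_banks_greedy(schedule, num_banks=2)
print(aware, find_bank_conflicts(schedule, aware))
```

In this sketch only the data placement changes. If two PEs access the same array in the same cycle, no bank assignment can remove the conflict, and the operations themselves would have to be rescheduled, which is why the abstract speaks of mapping operations and data together.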

Original language	English (US)
Title of host publication	LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems
Pages	17-25
Number of pages	9
DOIs	https://doi.org/10.1145/1755888.1755892
State	Published - 2010
Event	ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010 - Stockholm, Sweden
Duration	Apr 13 2010 → Apr 15 2010

Publication series

Name	Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)

Conference

Conference	ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010
Country/Territory	Sweden
City	Stockholm
Period	4/13/10 → 4/15/10

Keywords

arbiter
bank conflict
coarse-grained reconfigurable architecture
compilation
multi-bank memory

ASJC Scopus subject areas

Software

Cite this

Kim, Y., Lee, J., Shrivastava, A., & Paek, Y. (2010). Operation and data mapping for CGRAs with multi-bank memory. In LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems (pp. 17-25). (Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)). https://doi.org/10.1145/1755888.1755892

Operation and data mapping for CGRAs with multi-bank memory. / Kim, Yongjoo; Lee, Jongeun; Shrivastava, Aviral et al.
LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems. 2010. p. 17-25 (Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)).

Kim, Y, Lee, J, Shrivastava, A & Paek, Y 2010, Operation and data mapping for CGRAs with multi-bank memory. in LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems. Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), pp. 17-25, ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010, Stockholm, Sweden, 4/13/10. https://doi.org/10.1145/1755888.1755892

Kim Y, Lee J, Shrivastava A, Paek Y. Operation and data mapping for CGRAs with multi-bank memory. In LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems. 2010. p. 17-25. (Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)). doi: 10.1145/1755888.1755892

@inproceedings{4e2a19e164534f0c846c020c94bf3dcb,
title = "Operation and data mapping for CGRAs with multi-bank memory",
abstract = "Coarse Grain Reconfigurable Architectures (CGRAs) promise high performance at high power efficiency. They fulfil this promise by keeping the hardware extremely simple, and moving the complexity to application mapping. One major challenge comes in the form of data mapping. For reasons of power-efficiency and complexity, CGRAs use multi-bank local memory, and a row of PEs share memory access. In order for each row of the PEs to access any memory bank, there is a hardware arbiter between the memory requests generated by the PEs and the banks of the local memory. However, a fundamental restriction remains that a bank cannot be accessed by two different PEs at the same time. We propose to meet this challenge by mapping application operations onto PEs and data into memory banks in a way that avoids such conflicts. Our experimental results on kernels from multimedia benchmarks demonstrate that our local memory-aware compilation approach can generate mappings that are up to 40% better in performance (17.3% on average) compared to a memory-unaware scheduler.",
keywords = "arbiter, bank conflict, coarse-grained reconfigurable architecture, compilation, multi-bank memory",
author = "Yongjoo Kim and Jongeun Lee and Aviral Shrivastava and Yunheung Paek",
year = "2010",
doi = "10.1145/1755888.1755892",
language = "English (US)",
isbn = "9781605589534",
series = "Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)",
pages = "17--25",
booktitle = "LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems",
note = "ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010 ; Conference date: 13-04-2010 Through 15-04-2010",
}

TY - GEN
T1 - Operation and data mapping for CGRAs with multi-bank memory
AU - Kim, Yongjoo
AU - Lee, Jongeun
AU - Shrivastava, Aviral
AU - Paek, Yunheung
PY - 2010
Y1 - 2010
N2 - Coarse Grain Reconfigurable Architectures (CGRAs) promise high performance at high power efficiency. They fulfil this promise by keeping the hardware extremely simple, and moving the complexity to application mapping. One major challenge comes in the form of data mapping. For reasons of power-efficiency and complexity, CGRAs use multi-bank local memory, and a row of PEs share memory access. In order for each row of the PEs to access any memory bank, there is a hardware arbiter between the memory requests generated by the PEs and the banks of the local memory. However, a fundamental restriction remains that a bank cannot be accessed by two different PEs at the same time. We propose to meet this challenge by mapping application operations onto PEs and data into memory banks in a way that avoids such conflicts. Our experimental results on kernels from multimedia benchmarks demonstrate that our local memory-aware compilation approach can generate mappings that are up to 40% better in performance (17.3% on average) compared to a memory-unaware scheduler.
KW - arbiter
KW - bank conflict
KW - coarse-grained reconfigurable architecture
KW - compilation
KW - multi-bank memory
UR - http://www.scopus.com/inward/record.url?scp=77954467387&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954467387&partnerID=8YFLogxK
U2 - 10.1145/1755888.1755892
DO - 10.1145/1755888.1755892
M3 - Conference contribution
AN - SCOPUS:77954467387
SN - 9781605589534
T3 - Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)
SP - 17
EP - 25
BT - LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems
T2 - ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010
Y2 - 13 April 2010 through 15 April 2010
ER -
