REGIMap

Register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

41 Citations (Scopus)

Abstract

Coarse-Grained Reconfigurable Architectures (CGRAs) are an extremely attractive platform when both performance and power efficiency are paramount. Although the power efficiency of CGRAs can be very high, their performance hinges critically on the capabilities of the compiler, because a CGRA compiler has to perform explicit pipelining, scheduling, placement, and routing of operations. Existing CGRA compilers struggle with two main problems: 1) effectively utilizing the local register files in the processing elements (PEs), and 2) high compilation times. This paper significantly improves the state of the art in CGRA compilers by first creating a precise and general formulation of the problem of loop mapping on CGRAs that considers the local registers, and then distilling, from the insights gained from that formulation, an efficient and constructive heuristic solution. We show that the mapping problem, once characterized, can be reduced to the problem of finding a maximal weighted clique in the product graph of the time-extended CGRA and the data dependence graph of the kernel. The heuristic we have developed achieves, on average, 1.89× better performance than state-of-the-art methods when applied to several kernels from multimedia and SPEC2006 benchmarks. A unique feature of our heuristic is that it learns from failed attempts and constructively changes the schedule to achieve better mappings at lower compilation times.
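
The clique view of mapping described in the abstract can be illustrated with a toy sketch. This is not the REGIMap heuristic: it is a brute-force search under simplifying assumptions (no register weights, no modulo scheduling, a consumer must run exactly one step after its producer). The kernel, the two-PE array, and the names `OPS`, `DEPS`, `PES`, `LINKS`, `STEPS`, `compatible`, and `find_mapping` are all hypothetical. Each vertex is an (operation, PE, time step) triple, an edge joins two triples that can coexist, and a valid mapping is a clique containing one vertex per operation.

```python
from itertools import combinations, product

# Hypothetical kernel: operations a and b both feed operation c.
OPS = ["a", "b", "c"]
DEPS = {("a", "c"), ("b", "c")}

# Hypothetical CGRA: two PEs, each able to read its own and its neighbor's output.
PES = [0, 1]
LINKS = {(0, 0), (0, 1), (1, 0), (1, 1)}  # (producer PE, consumer PE)
STEPS = 2  # number of time steps in the time-extended CGRA (assumed)


def compatible(u, v):
    """Return True if two (op, pe, step) vertices can be in the same mapping."""
    (op_u, pe_u, t_u), (op_v, pe_v, t_v) = u, v
    if op_u == op_v:
        return False  # an operation is placed exactly once
    if (pe_u, t_u) == (pe_v, t_v):
        return False  # a PE executes one operation per step
    for (src, s_pe, s_t), (dst, d_pe, d_t) in ((u, v), (v, u)):
        if (src, dst) in DEPS:
            # Simplification: the consumer runs one step after the producer,
            # on a PE that can read the producer's output.
            if d_t != s_t + 1 or (s_pe, d_pe) not in LINKS:
                return False
    return True


def find_mapping():
    """Brute-force search for a clique with one vertex per operation."""
    slots = [(pe, t) for pe in PES for t in range(STEPS)]
    for choice in product(slots, repeat=len(OPS)):
        nodes = [(op, pe, t) for op, (pe, t) in zip(OPS, choice)]
        if all(compatible(u, v) for u, v in combinations(nodes, 2)):
            return {op: (pe, t) for op, pe, t in nodes}
    return None


mapping = find_mapping()
print(mapping)
```

Exhaustive search is only workable at this toy size; the abstract's point is that a constructive heuristic is needed for realistic kernels and arrays.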

Original language: English (US)
Title of host publication: Proceedings - Design Automation Conference
DOI: https://doi.org/10.1145/2463209.2488756
State: Published - 2013
Event: 50th Annual Design Automation Conference, DAC 2013 - Austin, TX, United States
Duration: May 29 2013 to Jun 7 2013

Fingerprint

Reconfigurable architectures
Compiler
Heuristics
Compilation
Kernel
Product Graph
Data Dependence
Pipelining
Formulation
Hinges
Clique
Placement
Multimedia
Schedule
Routing
High Performance
Scheduling
Benchmark
Graph in graph theory

ASJC Scopus subject areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modeling and Simulation

Cite this

REGIMap: Register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs). / Hamzeh, Mahdi; Shrivastava, Aviral; Vrudhula, Sarma.

Proceedings - Design Automation Conference. 2013. 18.

Hamzeh, M, Shrivastava, A & Vrudhula, S 2013, REGIMap: Register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs). in Proceedings - Design Automation Conference, 18, 50th Annual Design Automation Conference, DAC 2013, Austin, TX, United States, 5/29/13. https://doi.org/10.1145/2463209.2488756

@inproceedings{e94762d773684e67ba343f3c8c78aea9,
title = "REGIMap: Register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs)",
author = "Mahdi Hamzeh and Aviral Shrivastava and Sarma Vrudhula",
year = "2013",
doi = "10.1145/2463209.2488756",
language = "English (US)",
isbn = "9781450320719",
booktitle = "Proceedings - Design Automation Conference",
}

TY - GEN

T1 - REGIMap

T2 - Register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs)

AU - Hamzeh, Mahdi

AU - Shrivastava, Aviral

AU - Vrudhula, Sarma

PY - 2013

Y1 - 2013

UR - http://www.scopus.com/inward/record.url?scp=84879852172&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84879852172&partnerID=8YFLogxK

U2 - 10.1145/2463209.2488756

DO - 10.1145/2463209.2488756

M3 - Conference contribution

SN - 9781450320719

BT - Proceedings - Design Automation Conference

ER -