Unrolling and retiming of stream applications onto embedded multicore processors

Weijia Che, Karam S. Chatha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream applications distinguish themselves from traditional sequential programs through well-defined independent actors, explicit data communication, and stable code/data access patterns. In order to achieve high performance and low power, scratch pad memory (SPM) has been introduced in today's embedded multicore processors. Programming on SPM-based architectures is both challenging and time-consuming. In this paper, we address the problem of automatic compilation of stream applications onto SPM-based embedded multicore processors through unrolling and retiming. In our technique, code overlay and data overlay are implemented to overcome the limited SPM capacity. Smart double buffering and code prefetching are introduced to amortize memory access delays. We evaluated the efficiency of our technique by compiling several stream applications onto the IBM Cell processor and comparing their performance with that of existing approaches.

Original language: English (US)
Title of host publication: Proceedings - Design Automation Conference
Pages: 1272-1277
Number of pages: 6
DOIs: https://doi.org/10.1145/2228360.2228598
State: Published - 2012
Event: 49th Annual Design Automation Conference, DAC '12 - San Francisco, CA, United States
Duration: Jun 3 2012 → Jun 7 2012

Other

Other: 49th Annual Design Automation Conference, DAC '12
Country: United States
City: San Francisco, CA
Period: 6/3/12 → 6/7/12

Fingerprint

Multi-core Processor
Data storage equipment
Overlay
Cell Processor
Prefetching
Data Communication
Compilation
Computer programming languages
Programming Languages
Well-defined
High Performance
Communication

Keywords

  • multicore
  • overlay
  • retiming
  • SPM
  • stream
  • unrolling

ASJC Scopus subject areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modeling and Simulation

Cite this

Che, W., & Chatha, K. S. (2012). Unrolling and retiming of stream applications onto embedded multicore processors. In Proceedings - Design Automation Conference (pp. 1272-1277). https://doi.org/10.1145/2228360.2228598

Unrolling and retiming of stream applications onto embedded multicore processors. / Che, Weijia; Chatha, Karam S.

Proceedings - Design Automation Conference. 2012. p. 1272-1277.

Che, W & Chatha, KS 2012, Unrolling and retiming of stream applications onto embedded multicore processors. in Proceedings - Design Automation Conference. pp. 1272-1277, 49th Annual Design Automation Conference, DAC '12, San Francisco, CA, United States, 6/3/12. https://doi.org/10.1145/2228360.2228598
Che W, Chatha KS. Unrolling and retiming of stream applications onto embedded multicore processors. In Proceedings - Design Automation Conference. 2012. p. 1272-1277. https://doi.org/10.1145/2228360.2228598
Che, Weijia ; Chatha, Karam S. / Unrolling and retiming of stream applications onto embedded multicore processors. Proceedings - Design Automation Conference. 2012. pp. 1272-1277
@inproceedings{1fd64aaa53694306a931983f62c3c193,
title = "Unrolling and retiming of stream applications onto embedded multicore processors",
abstract = "In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream applications distinguish themselves from traditional sequential programs through well-defined independent actors, explicit data communication, and stable code/data access patterns. In order to achieve high performance and low power, scratch pad memory (SPM) has been introduced in today's embedded multicore processors. Programming on SPM-based architectures is both challenging and time-consuming. In this paper, we address the problem of automatic compilation of stream applications onto SPM-based embedded multicore processors through unrolling and retiming. In our technique, code overlay and data overlay are implemented to overcome the limited SPM capacity. Smart double buffering and code prefetching are introduced to amortize memory access delays. We evaluated the efficiency of our technique by compiling several stream applications onto the IBM Cell processor and comparing their performance with that of existing approaches.",
keywords = "multicore, overlay, retiming, SPM, stream, unrolling",
author = "Weijia Che and Chatha, {Karam S.}",
year = "2012",
doi = "10.1145/2228360.2228598",
language = "English (US)",
isbn = "9781450311991",
pages = "1272--1277",
booktitle = "Proceedings - Design Automation Conference",

}

TY - GEN
T1 - Unrolling and retiming of stream applications onto embedded multicore processors
AU - Che, Weijia
AU - Chatha, Karam S.
PY - 2012
Y1 - 2012
N2 - In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream applications distinguish themselves from traditional sequential programs through well-defined independent actors, explicit data communication, and stable code/data access patterns. In order to achieve high performance and low power, scratch pad memory (SPM) has been introduced in today's embedded multicore processors. Programming on SPM-based architectures is both challenging and time-consuming. In this paper, we address the problem of automatic compilation of stream applications onto SPM-based embedded multicore processors through unrolling and retiming. In our technique, code overlay and data overlay are implemented to overcome the limited SPM capacity. Smart double buffering and code prefetching are introduced to amortize memory access delays. We evaluated the efficiency of our technique by compiling several stream applications onto the IBM Cell processor and comparing their performance with that of existing approaches.
KW - multicore
KW - overlay
KW - retiming
KW - SPM
KW - stream
KW - unrolling
UR - http://www.scopus.com/inward/record.url?scp=84863544272&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863544272&partnerID=8YFLogxK
U2 - 10.1145/2228360.2228598
DO - 10.1145/2228360.2228598
M3 - Conference contribution
SN - 9781450311991
SP - 1272
EP - 1277
BT - Proceedings - Design Automation Conference
ER -