A performance model and code overlay generator for scratchpad enhanced embedded processors

Michael A. Baker, Amrit Panda, Nikhil Ghadge, Aniruddha Kadne, Karam S. Chatha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Software managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we address instruction mapping on SPMs and present a performance model and algorithm, "Code Overlay Generator" (COG), for producing high performance dynamic SPM code mappings. Our heuristic does not require profiling information, and is suitable for generating mapping solutions for large programs which are otherwise infeasible using previously proposed Integer Linear Programming (ILP) techniques. We compare our algorithm with a published heuristic and the code overlay mapping algorithm provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. We find an average performance advantage of 34% compared to the previous algorithm, and 87% with respect to spu-gcc. We additionally show that our performance model enables improved tools for offline evaluation of code overlay performance and mapping selection.
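
To make the mapping problem concrete, the following Python sketch illustrates what a code overlay mapping is and how candidate mappings can be compared offline. It is not the COG algorithm or the paper's performance model, neither of which is specified in this abstract. The function names, the region numbering, the sample sizes, the execution sequence, and the linear transfer cost (`setup_cycles`, `cycles_per_byte`) are all assumptions chosen for illustration.

```python
from typing import Dict, List


def spm_footprint(mapping: Dict[str, int], sizes: Dict[str, int]) -> int:
    """SPM bytes needed: each region must hold its largest function."""
    region_size: Dict[int, int] = {}
    for fn, region in mapping.items():
        region_size[region] = max(region_size.get(region, 0), sizes[fn])
    return sum(region_size.values())


def count_overlay_loads(mapping: Dict[str, int], trace: List[str]) -> int:
    """Count how often a function must be copied into its region.

    `trace` is the (hypothetical) sequence of functions that execute,
    including returns to the caller.
    """
    resident: Dict[int, str] = {}  # region -> function currently loaded
    loads = 0
    for fn in trace:
        region = mapping[fn]
        if resident.get(region) != fn:
            resident[region] = fn  # evicts whatever shared this region
            loads += 1
    return loads


def transfer_cost(
    mapping: Dict[str, int],
    sizes: Dict[str, int],
    trace: List[str],
    setup_cycles: float = 100.0,   # assumed fixed cost per transfer
    cycles_per_byte: float = 1.0,  # assumed cost per byte moved
) -> float:
    """Total cycles spent loading overlays under an assumed linear cost."""
    resident: Dict[int, str] = {}
    cost = 0.0
    for fn in trace:
        region = mapping[fn]
        if resident.get(region) != fn:
            resident[region] = fn
            cost += setup_cycles + cycles_per_byte * sizes[fn]
    return cost


# Illustrative data only: three functions and one execution sequence.
sizes = {"main": 400, "f": 300, "g": 200}
trace = ["main", "f", "main", "g", "main", "f"]

separate = {"main": 0, "f": 1, "g": 2}  # one region per function
shared = {"main": 0, "f": 1, "g": 1}    # f and g overlay each other
single = {"main": 0, "f": 0, "g": 0}    # everything in one region

for name, mapping in [("separate", separate), ("shared", shared), ("single", single)]:
    print(
        name,
        spm_footprint(mapping, sizes),
        count_overlay_loads(mapping, trace),
        transfer_cost(mapping, sizes, trace),
    )
```

Placing functions in the same region shrinks the SPM footprint but adds a transfer every time those functions alternate, which is the trade-off a mapping algorithm has to balance.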

Original language: English (US)
Title of host publication: 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2010
Pages: 287-296
Number of pages: 10
State: Published - 2010
Event: 8th IEEE/ACM International Conference on Hardware/Software-Co-Design and System Synthesis, CODES+ISSS 2010 - Scottsdale, AZ, United States
Duration: Oct 24 2010 to Oct 29 2010

Fingerprint

Data storage equipment
Linear programming
Engines
Hardware
Processing

Keywords

  • Cell Broadband Engine
  • Code Mapping
  • Code Overlay
  • Compiler
  • Embedded Systems
  • Scratchpad Memory

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Electrical and Electronic Engineering

Cite this

Baker, M. A., Panda, A., Ghadge, N., Kadne, A., & Chatha, K. S. (2010). A performance model and code overlay generator for scratchpad enhanced embedded processors. In 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2010 (pp. 287-296). [5751513]

@inproceedings{bcf9d4310eea4adea61b42d580e6da03,
title = "A performance model and code overlay generator for scratchpad enhanced embedded processors",
abstract = "Software managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we address instruction mapping on SPMs and present a performance model and algorithm, {"}Code Overlay Generator{"} (COG), for producing high performance dynamic SPM code mappings. Our heuristic does not require profiling information, and is suitable for generating mapping solutions for large programs which are otherwise infeasible using previously proposed Integer Linear Programming (ILP) techniques. We compare our algorithm with a published heuristic and the code overlay mapping algorithm provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. We find an average performance advantage of 34{\%} compared to the previous algorithm, and 87{\%} with respect to spu-gcc. We additionally show that our performance model enables improved tools for offline evaluation of code overlay performance and mapping selection.",
keywords = "Cell Broadband Engine, Code Mapping, Code Overlay, Compiler, Embedded Systems, Scratchpad Memory",
author = "Baker, {Michael A.} and Amrit Panda and Nikhil Ghadge and Aniruddha Kadne and Chatha, {Karam S.}",
year = "2010",
language = "English (US)",
isbn = "9781605589053",
pages = "287--296",
booktitle = "2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2010",

}

TY - GEN

T1 - A performance model and code overlay generator for scratchpad enhanced embedded processors

AU - Baker, Michael A.

AU - Panda, Amrit

AU - Ghadge, Nikhil

AU - Kadne, Aniruddha

AU - Chatha, Karam S.

PY - 2010

Y1 - 2010

N2 - Software managed scratchpad memories (SPMs) provide improved performance and power in embedded processors by reducing required hardware resources. Performance depends strongly on the scheme used to map code and data onto the SPM, but generating optimal mappings can be extremely difficult. Here we address instruction mapping on SPMs and present a performance model and algorithm, "Code Overlay Generator" (COG), for producing high performance dynamic SPM code mappings. Our heuristic does not require profiling information, and is suitable for generating mapping solutions for large programs which are otherwise infeasible using previously proposed Integer Linear Programming (ILP) techniques. We compare our algorithm with a published heuristic and the code overlay mapping algorithm provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. We find an average performance advantage of 34% compared to the previous algorithm, and 87% with respect to spu-gcc. We additionally show that our performance model enables improved tools for offline evaluation of code overlay performance and mapping selection.

KW - Cell Broadband Engine

KW - Code Mapping

KW - Code Overlay

KW - Compiler

KW - Embedded Systems

KW - Scratchpad Memory

UR - http://www.scopus.com/inward/record.url?scp=79956052887&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79956052887&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781605589053

SP - 287

EP - 296

BT - 2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2010

ER -