Approximation algorithm for data mapping on block multi-threaded network processor architectures

Chris Ostler, Karam S. Chatha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Network processor architectures incorporate block multi-threading to alleviate the performance degradation due to memory access latencies. Application design on such architectures requires determining the number of threads and mapping data items to the various memory elements such that the overall throughput is maximized. The paper presents a quasi-polynomial time approximation algorithm for the multi-threading aware data mapping problem, which can be shown to be NP-complete. The algorithm generates solutions with throughput no less than 1/(2(1 + ε)) of optimal and data memory requirements no more than (1 + ε) times the memory constraints. Experimental results obtained by mapping applications on the Intel IXP 2400 network processor demonstrate that the algorithm is able to generate solutions whose throughput is within 80% of the optimal when ε = 0.5.
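
As a rough illustration of what the stated guarantee means numerically, the sketch below evaluates the two bounds for a given ε. It assumes the throughput bound in the abstract reads as 1/(2(1 + ε)); the function name `guarantee_bounds` is hypothetical and is not taken from the paper.

```python
from fractions import Fraction


def guarantee_bounds(eps):
    """Return (throughput_fraction, memory_factor) for a given epsilon.

    Assumption: the abstract's throughput bound is 1 / (2 * (1 + eps)),
    and the memory bound is (1 + eps) times the memory constraint.
    """
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    throughput_fraction = 1 / (2 * (1 + eps))  # worst-case share of optimal throughput
    memory_factor = 1 + eps                    # allowed overshoot of the memory constraint
    return throughput_fraction, memory_factor


# The abstract's experiments use eps = 0.5.
tp, mem = guarantee_bounds(Fraction(1, 2))
print(f"throughput >= {tp} of optimal, memory <= {mem} x constraint")
```

Under that reading, ε = 0.5 gives a worst-case throughput guarantee of one third of optimal with memory use up to 1.5 times the constraint. This is a worst-case bound only; the experimental figure quoted in the abstract is an empirical measurement and is separate from it.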

Original language: English (US)
Title of host publication: Proceedings - Design Automation Conference
Pages: 801-804
Number of pages: 4
DOIs: https://doi.org/10.1109/DAC.2007.375274
State: Published - 2007
Event: 2007 44th ACM/IEEE Design Automation Conference, DAC'07 - San Diego, CA, United States
Duration: Jun 4 2007 to Jun 8 2007

Other

Other: 2007 44th ACM/IEEE Design Automation Conference, DAC'07
Country: United States
City: San Diego, CA
Period: 6/4/07 to 6/8/07

Fingerprint

Approximation algorithms
Data storage equipment
Throughput
Polynomials
Degradation

Keywords

  • Block multi-threading
  • Network processing

ASJC Scopus subject areas

  • Hardware and Architecture
  • Control and Systems Engineering

Cite this

Ostler, C., & Chatha, K. S. (2007). Approximation algorithm for data mapping on block multi-threaded network processor architectures. In Proceedings - Design Automation Conference (pp. 801-804). [4261293] https://doi.org/10.1109/DAC.2007.375274

@inproceedings{1ac51aad9f6b41cda801e1c9e220e3d0,
title = "Approximation algorithm for data mapping on block multi-threaded network processor architectures",
abstract = "Network processor architectures incorporate block multi-threading to alleviate the performance degradation due to memory access latencies. Application design on such architectures requires determining the number of threads and mapping data items to the various memory elements such that the overall throughput is maximized. The paper presents a quasi-polynomial time approximation algorithm for the multi-threading aware data mapping problem, which can be shown to be NP-complete. The algorithm generates solutions with throughput no less than 1/(2(1 + ε)) of optimal and data memory requirements no more than (1 + ε) times the memory constraints. Experimental results obtained by mapping applications on the Intel IXP 2400 network processor demonstrate that the algorithm is able to generate solutions whose throughput is within 80{\%} of the optimal when ε = 0.5.",
keywords = "Block multi-threading, Network processing",
author = "Ostler, Chris and Chatha, {Karam S.}",
year = "2007",
doi = "10.1109/DAC.2007.375274",
language = "English (US)",
isbn = "1595936270",
pages = "801--804",
booktitle = "Proceedings - Design Automation Conference",

}

TY - GEN

T1 - Approximation algorithm for data mapping on block multi-threaded network processor architectures

AU - Ostler, Chris

AU - Chatha, Karam S.

PY - 2007

Y1 - 2007

N2 - Network processor architectures incorporate block multi-threading to alleviate the performance degradation due to memory access latencies. Application design on such architectures requires determining the number of threads and mapping data items to the various memory elements such that the overall throughput is maximized. The paper presents a quasi-polynomial time approximation algorithm for the multi-threading aware data mapping problem, which can be shown to be NP-complete. The algorithm generates solutions with throughput no less than 1/(2(1 + ε)) of optimal and data memory requirements no more than (1 + ε) times the memory constraints. Experimental results obtained by mapping applications on the Intel IXP 2400 network processor demonstrate that the algorithm is able to generate solutions whose throughput is within 80% of the optimal when ε = 0.5.

KW - Block multi-threading

KW - Network processing

UR - http://www.scopus.com/inward/record.url?scp=34547370476&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34547370476&partnerID=8YFLogxK

U2 - 10.1109/DAC.2007.375274

DO - 10.1109/DAC.2007.375274

M3 - Conference contribution

AN - SCOPUS:34547370476

SN - 1595936270

SN - 9781595936271

SP - 801

EP - 804

BT - Proceedings - Design Automation Conference

ER -