On compiling array expressions for efficient execution on distributed-memory machines

Sandeep Gupta, S. D. Kaushik, S. Mufti, S. Sharma, C. H. Huang, P. Sadayappan

Research output: Contribution to journal > Conference article

29 Citations (Scopus)

Abstract

Efficient generation of communication sets and local index sets is important for evaluation of array expressions in scientific languages such as Fortran-90 and High Performance Fortran implemented on distributed-memory machines. We show that for arrays affinely aligned with templates that are distributed on multiple processors with a block-cyclic distribution, the local memory access sequence and communication sets can be efficiently enumerated using closed forms. First, closed-form solutions are presented for arrays that are aligned with identity templates that are distributed using block or cyclic distributions. These closed forms are then used with a virtual processor approach to give an efficient solution for arrays with block-cyclic distributions. These results are extended to arrays affinely aligned to arbitrary templates that have regular distributions. We present performance results on an iPSC/860 processor that demonstrate the low runtime overhead of this scheme.
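
The abstract does not reproduce the paper's closed forms, so the following is only a minimal Python sketch of the general idea, under stated assumptions: zero-based indices, an identity alignment, and a regular section `l:u:s` with positive stride. All function names (`owner_and_local`, `block_section`, `cyclic_section`, `block_cyclic_section`) are hypothetical and chosen for illustration. The block-cyclic case uses one possible reading of the virtual processor idea, in which a cyclic(b) distribution over p processors is treated as a cyclic distribution over b*p virtual processors; it is not claimed to be the authors' exact formulation.

```python
from math import gcd

def owner_and_local(i, b, p):
    """Map global index i to (processor, local index) for cyclic(b) over p processors."""
    return (i // b) % p, (i // (b * p)) * b + i % b

def block_section(l, u, s, q, b):
    """Block distribution (block size b): elements of l:u:s owned by processor q,
    as a (first, last, stride) triple, or None if q owns none of them."""
    lo, hi = q * b, min(u, (q + 1) * b - 1)
    # First section element at or after lo (ceiling division on the offset).
    first = l if l >= lo else l + -(-(lo - l) // s) * s
    if first > hi:
        return None
    last = first + (hi - first) // s * s
    return first, last, s

def cyclic_section(l, u, s, q, p):
    """Cyclic distribution over p processors: solve l + k*s = q (mod p) for k."""
    g = gcd(s, p)
    if (q - l) % g:
        return None                      # congruence has no solution
    m = p // g
    # Smallest k >= 0 satisfying the congruence (modular inverse, Python 3.8+).
    k0 = 0 if m == 1 else ((q - l) // g) * pow(s // g, -1, m) % m
    first = l + k0 * s
    if first > u:
        return None
    step = s * m                         # lcm(s, p)
    last = first + (u - first) // step * step
    return first, last, step

def block_cyclic_section(l, u, s, q, b, p):
    """cyclic(b) over p processors, viewed as cyclic over b*p virtual processors;
    physical processor q hosts virtual processors q*b .. q*b + b - 1."""
    pieces = []
    for v in range(q * b, (q + 1) * b):
        triple = cyclic_section(l, u, s, v, b * p)
        if triple is not None:
            pieces.append(triple)
    return pieces
```

Each returned triple describes a regular subsection in global indices; `owner_and_local` converts any of its elements to a local address. The point of the sketch is that a processor can enumerate its share of a section from a handful of arithmetic operations per virtual processor rather than by testing every section element for ownership.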

Original language: English (US)
Article number: 4134228
Pages (from-to): 301-305
Number of pages: 5
Journal: Proceedings of the International Conference on Parallel Processing
Volume: 2
DOI: 10.1109/ICPP.1993.171
State: Published - Jan 1 1993
Externally published: Yes
Event: 1993 International Conference on Parallel Processing, ICPP 1993 - Syracuse, United States
Duration: Aug 16 1993 to Aug 20 1993

Fingerprint

Distributed Memory
Data storage equipment
Communication
Template
Closed-form
Efficient Solution
Closed-form Solution
High Performance
Evaluation
Arbitrary
Demonstrate

ASJC Scopus subject areas

  • Software
  • Mathematics(all)
  • Hardware and Architecture

Cite this

On compiling array expressions for efficient execution on distributed-memory machines. / Gupta, Sandeep; Kaushik, S. D.; Mufti, S.; Sharma, S.; Huang, C. H.; Sadayappan, P.

In: Proceedings of the International Conference on Parallel Processing, Vol. 2, 4134228, 01.01.1993, p. 301-305.

Gupta, Sandeep ; Kaushik, S. D. ; Mufti, S. ; Sharma, S. ; Huang, C. H. ; Sadayappan, P. / On compiling array expressions for efficient execution on distributed-memory machines. In: Proceedings of the International Conference on Parallel Processing. 1993 ; Vol. 2. pp. 301-305.
@article{d58b25dbad3e48f1a488cec5adb7490b,
title = "On compiling array expressions for efficient execution on distributed-memory machines",
abstract = "Efficient generation of communication sets and local index sets is important for evaluation of array expressions in scientific languages such as Fortran-90 and High Performance Fortran implemented on distributed-memory machines. We show that for arrays affinely aligned with templates that are distributed on multiple processors with a block-cyclic distribution, the local memory access sequence and communication sets can be efficiently enumerated using closed forms. First, closed-form solutions are presented for arrays that are aligned with identity templates that are distributed using block or cyclic distributions. These closed forms are then used with a virtual processor approach to give an efficient solution for arrays with block-cyclic distributions. These results are extended to arrays affinely aligned to arbitrary templates that have regular distributions. We present performance results on an iPSC/860 processor that demonstrate the low runtime overhead of this scheme.",
author = "Sandeep Gupta and Kaushik, {S. D.} and S. Mufti and S. Sharma and Huang, {C. H.} and P. Sadayappan",
year = "1993",
month = "1",
day = "1",
doi = "10.1109/ICPP.1993.171",
language = "English (US)",
volume = "2",
pages = "301--305",
journal = "Proceedings of the International Conference on Parallel Processing",
issn = "0190-3918",

}

TY - JOUR

T1 - On compiling array expressions for efficient execution on distributed-memory machines

AU - Gupta, Sandeep

AU - Kaushik, S. D.

AU - Mufti, S.

AU - Sharma, S.

AU - Huang, C. H.

AU - Sadayappan, P.

PY - 1993/1/1

Y1 - 1993/1/1

N2 - Efficient generation of communication sets and local index sets is important for evaluation of array expressions in scientific languages such as Fortran-90 and High Performance Fortran implemented on distributed-memory machines. We show that for arrays affinely aligned with templates that are distributed on multiple processors with a block-cyclic distribution, the local memory access sequence and communication sets can be efficiently enumerated using closed forms. First, closed-form solutions are presented for arrays that are aligned with identity templates that are distributed using block or cyclic distributions. These closed forms are then used with a virtual processor approach to give an efficient solution for arrays with block-cyclic distributions. These results are extended to arrays affinely aligned to arbitrary templates that have regular distributions. We present performance results on an iPSC/860 processor that demonstrate the low runtime overhead of this scheme.

UR - http://www.scopus.com/inward/record.url?scp=84956696333&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84956696333&partnerID=8YFLogxK

U2 - 10.1109/ICPP.1993.171

DO - 10.1109/ICPP.1993.171

M3 - Conference article

AN - SCOPUS:84956696333

VL - 2

SP - 301

EP - 305

JO - Proceedings of the International Conference on Parallel Processing

JF - Proceedings of the International Conference on Parallel Processing

SN - 0190-3918

M1 - 4134228

ER -