Project Summary

Even higher performance computing is needed to solve some of the most important problems that mankind faces today. To achieve higher performance, we must increase the power efficiency of computation. In particular, today we can compute 10^16 operations per second in about 20 MW; the goal of exascale computing is to compute 10^18 operations per second while consuming only 40 MW. That is a move from about 5 x 10^8 to 2.5 x 10^10 operations per second per watt, so the goal is to improve the power efficiency of computation by 50X.

In computing systems, the memory hierarchy accounts for a very significant share of total power consumption. The goal of this proposal is to reduce the power consumption of the memory hierarchy through the use of novel scratch pad memory (SPM) based architectures. SPM-based multicore architectures can be extremely power-efficient (almost an order of magnitude more than a completely cache-coherent architecture for hundreds of cores), but they are difficult to program, because data management must be done in software. Leaving this to programmers is not a solution, as it increases their burden and reduces their productivity. This project proposes to perform the code and data management of a task in a source-to-source compiler.

The objective of this proposal is to develop compiler tools and techniques that manage all code and data of a task on the SPM of a core, so that any task can be executed efficiently on a core with an SPM. As part of this project, we will develop techniques to manage i) code, ii) stack data, iii) heap data, and iv) global data of a task on a fixed-size SPM on the core.

A major promise of managing data in software is the immense opportunity to analyze and optimize that management. Therefore, after developing the basic set of techniques, we will develop techniques to optimize data and code management through: i) changing the management scheme, ii) changing the management granularity, iii) using more efficient data structures for management, iv) analysis to identify when data management is unnecessary, so that it can be skipped, v) performing minimal work in each management operation, and vi) efficient ways to transfer data between the SPMs and the global memory. These compiler techniques will be developed in LLVM and demonstrated on the Intel straw man architecture simulator.
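To make the idea of software data management concrete, the following is a minimal sketch, not the project's actual technique. It assumes a direct-mapped, block-granular manager over a simulated global memory, with `memcpy` standing in for a DMA engine. All names and sizes (`spm_access`, `dma_copy`, `BLOCK_WORDS`, `SPM_BLOCKS`, and so on) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes below are hypothetical, for illustration only. */
#define GLOBAL_WORDS 1024  /* simulated global (off-chip) memory, in words */
#define BLOCK_WORDS  8     /* management granularity, in words */
#define SPM_BLOCKS   4     /* the SPM holds 4 blocks = 32 words */

int32_t global_mem[GLOBAL_WORDS];
int32_t spm[SPM_BLOCKS][BLOCK_WORDS];

/* Bookkeeping the software manager keeps for each SPM slot. */
struct slot_state {
    int valid;       /* slot holds a block */
    int dirty;       /* block was modified since it was fetched */
    uint32_t block;  /* index of the global block held in the slot */
};
struct slot_state slot_info[SPM_BLOCKS];

unsigned transfers;  /* number of simulated block transfers */

/* Stand-in for a DMA engine: copy one block between the two memories. */
void dma_copy(int32_t *dst, const int32_t *src)
{
    memcpy(dst, src, BLOCK_WORDS * sizeof(int32_t));
    transfers++;
}

/* Make the block containing addr resident and return the word's SPM location. */
int32_t *spm_access(uint32_t addr, int is_write)
{
    assert(addr < GLOBAL_WORDS);
    uint32_t block = addr / BLOCK_WORDS;
    uint32_t slot = block % SPM_BLOCKS;  /* direct-mapped placement */

    if (!slot_info[slot].valid || slot_info[slot].block != block) {
        /* Miss: write the victim back only if it is dirty, then fetch. */
        if (slot_info[slot].valid && slot_info[slot].dirty)
            dma_copy(&global_mem[slot_info[slot].block * BLOCK_WORDS], spm[slot]);
        dma_copy(spm[slot], &global_mem[block * BLOCK_WORDS]);
        slot_info[slot].valid = 1;
        slot_info[slot].dirty = 0;
        slot_info[slot].block = block;
    }
    if (is_write)
        slot_info[slot].dirty = 1;
    return &spm[slot][addr % BLOCK_WORDS];
}

int32_t spm_load(uint32_t addr) { return *spm_access(addr, 0); }
void spm_store(uint32_t addr, int32_t value) { *spm_access(addr, 1) = value; }

/* Write every dirty block back to global memory, e.g. at task end. */
void spm_flush(void)
{
    for (uint32_t s = 0; s < SPM_BLOCKS; s++) {
        if (slot_info[s].valid && slot_info[s].dirty) {
            dma_copy(&global_mem[slot_info[s].block * BLOCK_WORDS], spm[s]);
            slot_info[s].dirty = 0;
        }
    }
}
```

In this sketch, a compiler would rewrite each access to managed data into a call such as `spm_load(addr)` or `spm_store(addr, v)`. The optimizations listed above map onto it loosely: `BLOCK_WORDS` is the management granularity, the dirty bit avoids unnecessary write-backs, and an analysis proving that a block is already resident could let the compiler skip the residency check entirely.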
Effective start/end date: 9/1/13 → 2/28/15
- DOE: Office of Science (OS): $50,314.00