TY - GEN
T1 - Static analysis of processor stall cycle aggregation
AU - Lee, Jongeun
AU - Shrivastava, Aviral
PY - 2008/12/1
Y1 - 2008/12/1
N2 - Processor Idle Cycle Aggregation (PICA) is a promising approach for low-power execution of processors, in which small memory stalls are aggregated to create a large one, and the processor is switched to low-power mode during it. We extend the previously proposed approach in two dimensions: i) we develop static analysis for the PICA technique and present optimum parameters for five common types of loops based on steady-state analysis; ii) we show that software-only control is unable to guarantee correctness in a varying runtime environment, potentially causing deadlocks. We enhance the robustness of PICA with a minimal hardware extension, ensuring correct execution for any loops and parameters, which greatly facilitates exploration-based parameter optimization. The combined use of our static analysis and exploration-based fine-tuning makes the PICA technique applicable to any memory-bound loop, with energy reduction. We validate our analytical models against simulation-based optimization and also show, through our experiments on embedded application benchmarks, that our technique can be applied to a wide range of loops with an average 20% energy reduction compared to executions without PICA.
AB - Processor Idle Cycle Aggregation (PICA) is a promising approach for low-power execution of processors, in which small memory stalls are aggregated to create a large one, and the processor is switched to low-power mode during it. We extend the previously proposed approach in two dimensions: i) we develop static analysis for the PICA technique and present optimum parameters for five common types of loops based on steady-state analysis; ii) we show that software-only control is unable to guarantee correctness in a varying runtime environment, potentially causing deadlocks. We enhance the robustness of PICA with a minimal hardware extension, ensuring correct execution for any loops and parameters, which greatly facilitates exploration-based parameter optimization. The combined use of our static analysis and exploration-based fine-tuning makes the PICA technique applicable to any memory-bound loop, with energy reduction. We validate our analytical models against simulation-based optimization and also show, through our experiments on embedded application benchmarks, that our technique can be applied to a wide range of loops with an average 20% energy reduction compared to executions without PICA.
KW - Code transformation
KW - Embedded systems
KW - Low power
KW - Memory bound loops
KW - Processor free time
KW - Stall cycle aggregation
UR - http://www.scopus.com/inward/record.url?scp=78650760534&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650760534&partnerID=8YFLogxK
U2 - 10.1145/1450135.1450143
DO - 10.1145/1450135.1450143
M3 - Conference contribution
AN - SCOPUS:78650760534
SN - 9781605584706
T3 - Embedded Systems Week 2008 - Proceedings of the 6th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2008
SP - 25
EP - 30
BT - Embedded Systems Week 2008 - Proceedings of the 6th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2008
T2 - Embedded Systems Week 2008 - 6th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2008
Y2 - 19 October 2008 through 24 October 2008
ER -