TY - GEN
T1 - Exploiting residue number system for power-efficient digital signal processing in embedded processors
AU - Chokshi, Rooju
AU - Berezowski, Krzysztof S.
AU - Shrivastava, Aviral
AU - Piestrak, Stanislaw J.
PY - 2009
Y1 - 2009
AB - The 2's complement number system imposes a fundamental limitation on the power and performance of arithmetic circuits because it requires carry propagation across the entire datapath. The Residue Number System (RNS) avoids this limitation by decomposing a number into parts and performing arithmetic operations on them in parallel, significantly reducing the breadth of carry propagation. Consequently, RNS arithmetic has been proposed as a way to improve the power efficiency of arithmetic hardware. However, the limited expressiveness of RNS in terms of arithmetic operations, together with the overheads of interacting with 2's complement arithmetic, makes it challenging to design a programmable processor that takes advantage of these benefits. In this paper we meet this challenge through multi-tier synergistic co-design of the architecture, micro-architecture, hardware components, and compilation techniques. Our experiments demonstrate not only a simultaneous improvement of up to 30% in performance and a 57% reduction in functional unit power consumption, but also that most of these benefits can be exploited with automatically generated code.
KW - Compiler
KW - Performance
KW - Power
KW - Processor
KW - Residue Number System
UR - http://www.scopus.com/inward/record.url?scp=72049113126&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72049113126&partnerID=8YFLogxK
U2 - 10.1145/1629395.1629401
DO - 10.1145/1629395.1629401
M3 - Conference contribution
AN - SCOPUS:72049113126
SN - 9781605586267
T3 - Embedded Systems Week 2009 - 2009 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES'09
SP - 19
EP - 27
BT - Embedded Systems Week 2009 - 2009 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES'09
T2 - Embedded Systems Week 2009, ESWEEK 2009 - 2009 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES'09
Y2 - 11 October 2009 through 16 October 2009
ER -