TY - GEN
T1 - Exploring DRAM organizations for energy-efficient and resilient exascale memories
AU - Giridhar, Bharan
AU - Cieslak, Michael
AU - Duggal, Deepankar
AU - Dreslinski, Ronald
AU - Chen, Hsing Min
AU - Patti, Robert
AU - Hold, Betina
AU - Chakrabarti, Chaitali
AU - Mudge, Trevor
AU - Blaauw, David
PY - 2013
Y1 - 2013
N2 - The power target for exascale supercomputing is 20MW, with about 30% budgeted for the memory subsystem. Commodity DRAMs will not satisfy this requirement. Additionally, the large number of memory chips (>10M) required will result in crippling failure rates. Although specialized DRAM memories have been reorganized to reduce power through 3D-stacking or row buffer resizing, their implications on fault tolerance have not been considered. We show that addressing reliability and energy is a co-optimization problem involving tradeoffs between error correction cost, access energy and refresh power: reducing the physical page size to decrease access energy increases the energy/area overhead of error resilience. Additionally, power can be reduced by optimizing bitline lengths. The proposed 3D-stacked memory uses a page size of 4kb and consumes 5.1pJ/bit based on simulations with NEK5000 benchmarks. Scaling to 100PB, the memory consumes 4.7MW at 100PB/s which, while well within the total power budget (20MW), is also error-resilient.
AB - The power target for exascale supercomputing is 20MW, with about 30% budgeted for the memory subsystem. Commodity DRAMs will not satisfy this requirement. Additionally, the large number of memory chips (>10M) required will result in crippling failure rates. Although specialized DRAM memories have been reorganized to reduce power through 3D-stacking or row buffer resizing, their implications on fault tolerance have not been considered. We show that addressing reliability and energy is a co-optimization problem involving tradeoffs between error correction cost, access energy and refresh power: reducing the physical page size to decrease access energy increases the energy/area overhead of error resilience. Additionally, power can be reduced by optimizing bitline lengths. The proposed 3D-stacked memory uses a page size of 4kb and consumes 5.1pJ/bit based on simulations with NEK5000 benchmarks. Scaling to 100PB, the memory consumes 4.7MW at 100PB/s which, while well within the total power budget (20MW), is also error-resilient.
UR - http://www.scopus.com/inward/record.url?scp=84899667235&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84899667235&partnerID=8YFLogxK
U2 - 10.1145/2503210.2503215
DO - 10.1145/2503210.2503215
M3 - Conference contribution
AN - SCOPUS:84899667235
SN - 9781450323789
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2013
PB - IEEE Computer Society
T2 - 2013 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013
Y2 - 17 November 2013 through 22 November 2013
ER -