TY - GEN
T1 - Customizing wide-SIMD architectures for H.264
AU - Seo, S.
AU - Woh, M.
AU - Mahlke, S.
AU - Mudge, T.
AU - Vijay, S.
AU - Chakrabarti, Chaitali
PY - 2009
Y1 - 2009
N2 - In recent years, the mobile phone industry has become one of the most dynamic technology sectors, a trend accelerated by the increasing demand for multimedia services on cellular networks. This paper presents a low-power SIMD architecture tailored for efficient implementation of H.264 encoder/decoder kernel algorithms. Several customized features have been added to improve processing performance and lower power consumption: support for different SIMD widths to increase SIMD utilization efficiency, a diagonal memory organization to support both column and row access, temporary buffer and bypass support to reduce register file power consumption, fused operation support to increase processing performance, and a fast programmable crossbar to support complex data permutation patterns. The proposed architecture increases the throughput of H.264 encoder/decoder kernel algorithms by a factor of 2.13 while achieving a 29% energy-delay improvement on average compared to our previous SIMD architecture, SODA.
UR - http://www.scopus.com/inward/record.url?scp=72049116292&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72049116292&partnerID=8YFLogxK
U2 - 10.1109/ICSAMOS.2009.5289229
DO - 10.1109/ICSAMOS.2009.5289229
M3 - Conference contribution
AN - SCOPUS:72049116292
SN - 9781424445011
T3 - Proceedings - 2009 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2009
SP - 172
EP - 179
BT - Proceedings - 2009 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2009
T2 - 2009 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2009
Y2 - 20 July 2009 through 23 July 2009
ER -