Parallelization techniques for implementing trellis algorithms on graphics processors

Q. Zheng, Y. Chen, R. Dreslinski, Chaitali Chakrabarti, A. Anastasopoulos, S. Mahlke, T. Mudge

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In this paper, we study different schemes to parallelize trellis algorithms for efficient implementation on a GPU. We consider parallelization schemes at the packet level, subblock level and trellis level to increase the number of threads in a GPU implementation. At the trellis level, we consider state-level, forward-backward traversal and branch-metric parallelism. To evaluate the performance of the different schemes, an LTE uplink Turbo decoder is implemented on an NVIDIA GTX470 GPU. Tradeoffs between throughput, latency and bit error rate are presented. Our most balanced configuration processes multiple subblocks of a packet simultaneously, in conjunction with recovery schemes and trellis-level parallelism; it achieves a throughput of 19.65 Mbps with a latency of 0.56 ms at a bit error rate of 10^-5 for 1.3 dB channel SNR. We also show how different combinations of parallelization schemes can be used to satisfy systems with widely varying requirements of throughput, latency and bit error rate.
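
As a rough illustration of subblock-level and state-level parallelism, the sketch below splits the forward (alpha) recursion of a max-log trellis algorithm into independent subblocks. It is not the authors' implementation. The abstract does not say which recovery schemes are used, so a warm-up (training) window before each subblock is assumed here as one common choice. The helper names, the shift-register trellis in `forward_step`, and the random branch metrics are hypothetical; only the 8-state count matches the LTE turbo constituent code.

```python
import random

NUM_STATES = 8  # the LTE turbo constituent code has 8 trellis states
NEG_INF = float("-inf")


def subblock_ranges(num_stages, num_subblocks, warmup):
    """Return (warmup_start, start, end) per subblock; end is exclusive."""
    size = -(-num_stages // num_subblocks)  # ceiling division
    ranges = []
    for start in range(0, num_stages, size):
        end = min(start + size, num_stages)
        ranges.append((max(0, start - warmup), start, end))
    return ranges


def forward_step(alpha, gamma_k):
    """One max-log forward step on a hypothetical shift-register trellis.

    Each destination state t has two predecessors. On a GPU, each
    iteration of this loop would map to one thread (state-level
    parallelism).
    """
    new_alpha = []
    for t in range(NUM_STATES):
        b = t & 1                            # input bit that leads into t
        s0 = t >> 1                          # first predecessor state
        s1 = (t >> 1) | (NUM_STATES >> 1)    # second predecessor state
        new_alpha.append(max(alpha[s0] + gamma_k[s0][b],
                             alpha[s1] + gamma_k[s1][b]))
    return new_alpha


def forward_metrics(gammas, start, end, alpha):
    """Run the forward recursion over stages [start, end) and keep each alpha."""
    out = []
    for k in range(start, end):
        alpha = forward_step(alpha, gammas[k])
        out.append(alpha)
    return out


def subblock_forward(gammas, num_subblocks, warmup):
    """Forward recursion split into subblocks with a warm-up window (assumed)."""
    known = [0.0] + [NEG_INF] * (NUM_STATES - 1)  # trellis starts in state 0
    uniform = [0.0] * NUM_STATES                  # unknown state: all equal
    result = []
    # Each subblock depends only on its own window, so on a GPU these
    # iterations could run concurrently (subblock-level parallelism).
    for w, start, end in subblock_ranges(len(gammas), num_subblocks, warmup):
        alpha = known if w == 0 else uniform
        for k in range(w, start):                 # recovery: results discarded
            alpha = forward_step(alpha, gammas[k])
        result.extend(forward_metrics(gammas, start, end, alpha))
    return result


def random_gammas(num_stages, seed=0):
    """Hypothetical branch metrics, indexed as gammas[stage][state][bit]."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in (0, 1)]
             for _ in range(NUM_STATES)]
            for _ in range(num_stages)]
```

In this sketch, a longer warm-up window costs extra computation per subblock but brings the state metrics closer to the sequential result, which loosely mirrors the throughput versus bit error rate tradeoff the abstract describes. When the warm-up window reaches back to stage 0, the subblock result matches the sequential recursion exactly.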

Original language: English (US)
Title of host publication: 2013 IEEE International Symposium on Circuits and Systems, ISCAS 2013
Pages: 1220-1223
Number of pages: 4
State: Published - 2013
Event: 2013 IEEE International Symposium on Circuits and Systems, ISCAS 2013 - Beijing, China
Duration: May 19 2013 to May 23 2013

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
ISSN (Print): 0271-4310

Other

Other: 2013 IEEE International Symposium on Circuits and Systems, ISCAS 2013
Country/Territory: China
City: Beijing
Period: 5/19/13 to 5/23/13

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
