TY - JOUR
T1 - From next-generation resequencing reads to a high-quality variant data set
AU - Pfeifer, Susanne
N1 - Publisher Copyright: © 2017 Macmillan Publishers Limited, part of Springer Nature.
PY - 2017/2/1
Y1 - 2017/2/1
N2 - Sequencing has revolutionized biology by permitting the analysis of genomic variation at an unprecedented resolution. High-throughput sequencing is fast and inexpensive, making it accessible for a wide range of research topics. However, the produced data contain subtle but complex types of errors, biases and uncertainties that impose several statistical and computational challenges to the reliable detection of variants. To tap the full potential of high-throughput sequencing, a thorough understanding of the data produced as well as the available methodologies is required. Here, I review several commonly used methods for generating and processing next-generation resequencing data, discuss the influence of errors and biases together with their resulting implications for downstream analyses and provide general guidelines and recommendations for producing high-quality single-nucleotide polymorphism data sets from raw reads by highlighting several sophisticated reference-based methods representing the current state of the art.
UR - http://www.scopus.com/inward/record.url?scp=84991585337&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991585337&partnerID=8YFLogxK
U2 - 10.1038/hdy.2016.102
DO - 10.1038/hdy.2016.102
M3 - Review article
C2 - 27759079
AN - SCOPUS:84991585337
SN - 0018-067X
VL - 118
SP - 111
EP - 124
JO - Heredity
JF - Heredity
IS - 2
ER -