NGSPE: A pipeline for end-to-end analysis of DNA sequencing data and comparison between different platforms

Ke Huang, Venkata Yellapantula, Leslie Baier, Valentin Dinu

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

We present NGSPE, a pipeline for variation discovery and genotyping of paired-end Illumina next-generation sequencing (NGS) data (http://ngspeanalysis.sourceforge.net/). The pipeline describes a set of sequential analytical steps (short-read alignment, genotype calling, and functional annotation of variants) that can be conducted using open-source software tools, and it also provides scripts to install the dependent software and resources and to run the pipeline on users' own data. A sample summary report is also provided; it includes the concordance rate between data generated by this pipeline and different resources, as well as a comparison between replicate samples from two commercial platforms, Illumina and Complete Genomics. Furthermore, some of the mutations identified by the pipeline were verified using Sanger sequencing.
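The concordance rate mentioned in the abstract can be illustrated with a minimal sketch. This is not NGSPE's own code: the function name `concordance_rate`, the dictionary layout keyed by (chromosome, position), and the toy genotypes are all assumptions made for illustration. The sketch counts, among sites called in both data sets, the fraction whose genotypes agree regardless of allele order.

```python
def concordance_rate(calls_a, calls_b):
    """Return the fraction of shared sites with identical genotypes.

    Each argument maps a (chromosome, position) tuple to a tuple of
    alleles. Allele order is ignored, so ("A", "G") equals ("G", "A").
    Returns None when the two call sets share no sites.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    matches = sum(
        1 for site in shared
        if sorted(calls_a[site]) == sorted(calls_b[site])
    )
    return matches / len(shared)


# Hypothetical toy calls standing in for two sequencing platforms.
platform_a = {
    ("chr1", 1000): ("A", "G"),
    ("chr1", 2000): ("C", "C"),
    ("chr2", 500): ("T", "T"),
    ("chr2", 900): ("G", "A"),  # called only on platform A
}
platform_b = {
    ("chr1", 1000): ("G", "A"),  # same genotype, different allele order
    ("chr1", 2000): ("C", "T"),  # discordant call
    ("chr2", 500): ("T", "T"),
    ("chr3", 42): ("A", "A"),    # called only on platform B
}

rate = concordance_rate(platform_a, platform_b)
print(f"{rate:.3f}")  # 2 of 3 shared sites agree
```

Sites called on only one platform are excluded from the denominator here; a real comparison might instead report them separately as platform-specific calls.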

Original language: English (US)
Pages (from-to): 1171-1176
Number of pages: 6
Journal: Computers in Biology and Medicine
Volume: 43
Issue number: 9
DOI: 10.1016/j.compbiomed.2013.05.025
State: Published - Sep 1 2013

Fingerprint

DNA Sequence Analysis
DNA
Software
Pipelines
Genomics
Genotype
Mutation

Keywords

  • Alignment
  • Annotation
  • Data analysis
  • DNA
  • Genotype calling
  • Next generation sequencing

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

NGSPE: A pipeline for end-to-end analysis of DNA sequencing data and comparison between different platforms. / Huang, Ke; Yellapantula, Venkata; Baier, Leslie; Dinu, Valentin.

In: Computers in Biology and Medicine, Vol. 43, No. 9, 01.09.2013, p. 1171-1176.

@article{4d063875bdc04c1d958472bbea9a1433,
title = "NGSPE: A pipeline for end-to-end analysis of DNA sequencing data and comparison between different platforms",
abstract = "We present NGSPE, a pipeline for variation discovery and genotyping of pair-ended Illumina next generation sequencing (NGS) data (http://ngspeanalysis.sourceforge.net/). This pipeline not only describes a set of sequential analytical steps, such as short reads alignment, genotype calling and functional variation annotation that can be conducted using open-source software tools, but also provides users a set of scripts to install the dependent software and resources and implement the pipeline on their data. A sample summary report including the concordance rate between data generated by this pipeline and different resources as well as the comparison between replication samples of two commercial platforms from Illumina and Complete Genomics is also provided. Furthermore, some of the mutations identified by the pipeline were verified using Sanger sequencing.",
keywords = "Alignment, Annotation, Data analysis, DNA, Genotype calling, Next generation sequencing",
author = "Ke Huang and Venkata Yellapantula and Leslie Baier and Valentin Dinu",
year = "2013",
month = "9",
day = "1",
doi = "10.1016/j.compbiomed.2013.05.025",
language = "English (US)",
volume = "43",
pages = "1171--1176",
journal = "Computers in Biology and Medicine",
issn = "0010-4825",
publisher = "Elsevier Limited",
number = "9",
}

TY - JOUR

T1 - NGSPE: A pipeline for end-to-end analysis of DNA sequencing data and comparison between different platforms

AU - Huang, Ke

AU - Yellapantula, Venkata

AU - Baier, Leslie

AU - Dinu, Valentin

PY - 2013/9/1

Y1 - 2013/9/1

N2 - We present NGSPE, a pipeline for variation discovery and genotyping of pair-ended Illumina next generation sequencing (NGS) data (http://ngspeanalysis.sourceforge.net/). This pipeline not only describes a set of sequential analytical steps, such as short reads alignment, genotype calling and functional variation annotation that can be conducted using open-source software tools, but also provides users a set of scripts to install the dependent software and resources and implement the pipeline on their data. A sample summary report including the concordance rate between data generated by this pipeline and different resources as well as the comparison between replication samples of two commercial platforms from Illumina and Complete Genomics is also provided. Furthermore, some of the mutations identified by the pipeline were verified using Sanger sequencing.

AB - We present NGSPE, a pipeline for variation discovery and genotyping of pair-ended Illumina next generation sequencing (NGS) data (http://ngspeanalysis.sourceforge.net/). This pipeline not only describes a set of sequential analytical steps, such as short reads alignment, genotype calling and functional variation annotation that can be conducted using open-source software tools, but also provides users a set of scripts to install the dependent software and resources and implement the pipeline on their data. A sample summary report including the concordance rate between data generated by this pipeline and different resources as well as the comparison between replication samples of two commercial platforms from Illumina and Complete Genomics is also provided. Furthermore, some of the mutations identified by the pipeline were verified using Sanger sequencing.

KW - Alignment

KW - Annotation

KW - Data analysis

KW - DNA

KW - Genotype calling

KW - Next generation sequencing

UR - http://www.scopus.com/inward/record.url?scp=84880022230&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880022230&partnerID=8YFLogxK

U2 - 10.1016/j.compbiomed.2013.05.025

DO - 10.1016/j.compbiomed.2013.05.025

M3 - Article

C2 - 23930810

AN - SCOPUS:84880022230

VL - 43

SP - 1171

EP - 1176

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

SN - 0010-4825

IS - 9

ER -