Reliable identification of large numbers of candidate SNPs from public EST data

Kenneth H. Buetow; Michael N. Edmonson; Anna B. Cassidy

doi:10.1038/6851

Research output: Contribution to journal › Article › peer-review

224 Scopus citations

Abstract

High-resolution genetic analysis of the human genome promises to provide insight into common disease susceptibility. To perform such analysis will require a collection of high-throughput, high-density analysis reagents. We have developed a polymorphism detection system that uses public-domain sequence data. This detection system is called the single nucleotide polymorphism pipeline (SNPpipeline). The analytic core of the SNPpipeline is composed of three components: PHRED, PHRAP and DEMIGLACE. PHRED and PHRAP are components of a sequence analysis suite developed to perform the semi-automated analysis required for large-scale genomes (provided courtesy of P. Green). Using these informatics tools, which examine redundant raw expressed sequence tag (EST) data, we have identified more than 3,000 candidate single-nucleotide polymorphisms (SNPs). Empiric validation studies of a set of 192 candidates indicate that 82% identify variation in a sample of ten Centre d'Etudes Polymorphism Humain (CEPH) individuals. Our results suggest that existing sequence resources may serve as a valuable source for identifying genetic variation.

Original language	English (US)
Pages (from-to)	323-325
Number of pages	3
Journal	Nature Genetics
Volume	21
Issue number	3
DOIs	https://doi.org/10.1038/6851
State	Published - Mar 1999
Externally published	Yes

ASJC Scopus subject areas

Genetics

Cite this

@article{88fb8f704be04967a23f7b661302d041,

title = "Reliable identification of large numbers of candidate SNPs from public EST data",

abstract = "High-resolution genetic analysis of the human genome promises to provide insight into common disease susceptibility. To perform such analysis will require a collection of high-throughput, high-density analysis reagents. We have developed a polymorphism detection system that uses public-domain sequence data. This detection system is called the single nucleotide polymorphism pipeline (SNPpipeline). The analytic core of the SNPpipeline is composed of three components: PHRED, PHRAP and DEMIGLACE. PHRED and PHRAP are components of a sequence analysis suite developed to perform the semi-automated analysis required for large-scale genomes (provided courtesy of P. Green). Using these informatics tools, which examine redundant raw expressed sequence tag (EST) data, we have identified more than 3,000 candidate single-nucleotide polymorphisms (SNPs). Empiric validation studies of a set of 192 candidates indicate that 82% identify variation in a sample of ten Centre d'Etudes Polymorphism Humain (CEPH) individuals. Our results suggest that existing sequence resources may serve as a valuable source for identifying genetic variation.",

author = "Buetow, {Kenneth H.} and Edmonson, {Michael N.} and Cassidy, {Anna B.}",

year = "1999",

month = mar,

doi = "10.1038/6851",

language = "English (US)",

volume = "21",

pages = "323--325",

journal = "Nature Genetics",

issn = "1061-4036",

publisher = "Nature Publishing Group",

number = "3",

}

TY - JOUR

T1 - Reliable identification of large numbers of candidate SNPs from public EST data

AU - Buetow, Kenneth H.

AU - Edmonson, Michael N.

AU - Cassidy, Anna B.

PY - 1999/3

Y1 - 1999/3

N2 - High-resolution genetic analysis of the human genome promises to provide insight into common disease susceptibility. To perform such analysis will require a collection of high-throughput, high-density analysis reagents. We have developed a polymorphism detection system that uses public-domain sequence data. This detection system is called the single nucleotide polymorphism pipeline (SNPpipeline). The analytic core of the SNPpipeline is composed of three components: PHRED, PHRAP and DEMIGLACE. PHRED and PHRAP are components of a sequence analysis suite developed to perform the semi-automated analysis required for large-scale genomes (provided courtesy of P. Green). Using these informatics tools, which examine redundant raw expressed sequence tag (EST) data, we have identified more than 3,000 candidate single-nucleotide polymorphisms (SNPs). Empiric validation studies of a set of 192 candidates indicate that 82% identify variation in a sample of ten Centre d'Etudes Polymorphism Humain (CEPH) individuals. Our results suggest that existing sequence resources may serve as a valuable source for identifying genetic variation.

AB - High-resolution genetic analysis of the human genome promises to provide insight into common disease susceptibility. To perform such analysis will require a collection of high-throughput, high-density analysis reagents. We have developed a polymorphism detection system that uses public-domain sequence data. This detection system is called the single nucleotide polymorphism pipeline (SNPpipeline). The analytic core of the SNPpipeline is composed of three components: PHRED, PHRAP and DEMIGLACE. PHRED and PHRAP are components of a sequence analysis suite developed to perform the semi-automated analysis required for large-scale genomes (provided courtesy of P. Green). Using these informatics tools, which examine redundant raw expressed sequence tag (EST) data, we have identified more than 3,000 candidate single-nucleotide polymorphisms (SNPs). Empiric validation studies of a set of 192 candidates indicate that 82% identify variation in a sample of ten Centre d'Etudes Polymorphism Humain (CEPH) individuals. Our results suggest that existing sequence resources may serve as a valuable source for identifying genetic variation.

UR - http://www.scopus.com/inward/record.url?scp=0033018816&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033018816&partnerID=8YFLogxK

U2 - 10.1038/6851

DO - 10.1038/6851

M3 - Article

C2 - 10080189

AN - SCOPUS:0033018816

SN - 1061-4036

VL - 21

SP - 323

EP - 325

JO - Nature Genetics

JF - Nature Genetics

IS - 3

ER -