Detecting false expression signals in high-density oligonucleotide arrays by an in silico approach

Jinghui Zhang, Richard P. Finney, Robert J. Clifford, Leslie K. Derr, Kenneth Buetow

Research output: Contribution to journal › Article

47 Citations (Scopus)

Abstract

High-density oligonucleotide arrays have become a popular assay for concurrent measurement of mRNA expression at the genome scale. Much effort has been devoted to the development of statistical analysis tools aimed at reducing experimental noise and normalizing experimental variation in gene expression analysis. However, these investigations do not detect or catalog systematic problems associated with specific oligonucleotide probes. Here, we present an investigation of problematic probes that yield consistent but inaccurate signals across multiple experiments. By evaluating data integrity among gene, probe sequence, and genomic structure we identified a total of 20,696 (10.5%) nonspecific probes that could cross-hybridize to multiple genes and a total of 18,363 (9.3%) probes that miss the target transcript sequences on the Affymetrix GeneChip U95A/Av2 array. The numbers of nonspecific and mistargeted probes on the U133A array are 29,405 (12.1%) and 19,717 (8.0%), respectively. The poor performance of the mistargeted probes was confirmed in two GeneChip experiments, in which these probes showed a 20-30% decrease in detecting present signals compared with normal probes. Comparison of qualitative expression signals obtained from SAGE and EST data with those from GeneChip arrays showed that the consistency of the two platforms is 30% lower in problematic probes than in normal probes. A Web application was developed to apply our results for improving the accuracy of expression analysis.
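The two probe categories named in the abstract can be illustrated with a toy classifier. This is only a sketch under stated assumptions, not the authors' pipeline: the abstract does not describe the matching procedure, so exact substring matching stands in for it here, and the function name `classify_probe`, the gene names, and all sequences are hypothetical. A probe whose sequence is absent from every transcript of its intended gene is labeled mistargeted; a probe that matches its target but also transcripts of other genes is labeled nonspecific.

```python
from collections import Counter


def classify_probe(probe_seq, target_gene, transcripts):
    """Label one probe as 'mistargeted', 'nonspecific', or 'normal'.

    transcripts maps gene name -> list of transcript sequences.
    Assumption: exact substring matching stands in for whatever
    alignment criteria a real analysis would use.
    """
    # Collect every gene with at least one transcript containing the probe.
    hit_genes = {
        gene
        for gene, seqs in transcripts.items()
        if any(probe_seq in seq for seq in seqs)
    }
    if target_gene not in hit_genes:
        return "mistargeted"   # probe misses its intended transcript
    if len(hit_genes) > 1:
        return "nonspecific"   # probe could cross-hybridize to other genes
    return "normal"


# Hypothetical toy data, invented for illustration only.
transcripts = {
    "GENE_A": ["ATGGCGTACGTTAGCCTAGGATCC"],
    "GENE_B": ["TTTTGCGTACGTTAGCAAAA"],
    "GENE_C": ["CCCCGGGGAAAATTTT"],
}
probes = [
    ("GCGTACGTTAGC", "GENE_A"),  # also present in GENE_B
    ("CTAGGATCC", "GENE_A"),     # present only in GENE_A
    ("GGGGAAAA", "GENE_B"),      # absent from GENE_B
]

labels = [classify_probe(seq, gene, transcripts) for seq, gene in probes]
tally = Counter(labels)
for (seq, gene), label in zip(probes, labels):
    print(f"{seq} -> {gene}: {label}")
```

In this sketch the mistargeted check takes precedence over the nonspecific one, which is a design choice of the example; a real analysis might track both properties separately for each probe.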

Original language: English (US)
Pages (from-to): 297-308
Number of pages: 12
Journal: Genomics
Volume: 85
Issue number: 3
DOI: 10.1016/j.ygeno.2004.11.004
State: Published - Mar 2005
Externally published: Yes

Fingerprint

  • Oligonucleotide Array Sequence Analysis
  • Computer Simulation
  • Oligonucleotide Probes
  • Expressed Sequence Tags
  • Genes
  • Noise
  • Genome
  • Gene Expression
  • Messenger RNA

Keywords

  • Affymetrix
  • Cross-hybridization
  • Expression
  • Mistargeted
  • mRNA
  • Probe

ASJC Scopus subject areas

  • Genetics

Cite this

Detecting false expression signals in high-density oligonucleotide arrays by an in silico approach. / Zhang, Jinghui; Finney, Richard P.; Clifford, Robert J.; Derr, Leslie K.; Buetow, Kenneth.

In: Genomics, Vol. 85, No. 3, 03.2005, p. 297-308.

@article{141dc8508ab9431b99547cb2849f5e7a,
title = "Detecting false expression signals in high-density oligonucleotide arrays by an in silico approach",
abstract = "High-density oligonucleotide arrays have become a popular assay for concurrent measurement of mRNA expression at the genome scale. Much effort has been devoted to the development of statistical analysis tools aimed at reducing experimental noise and normalizing experimental variation in gene expression analysis. However, these investigations do not detect or catalog systematic problems associated with specific oligonucleotide probes. Here, we present an investigation of problematic probes that yield consistent but inaccurate signals across multiple experiments. By evaluating data integrity among gene, probe sequence, and genomic structure we identified a total of 20,696 (10.5{\%}) nonspecific probes that could cross-hybridize to multiple genes and a total of 18,363 (9.3{\%}) probes that miss the target transcript sequences on the Affymetrix GeneChip U95A/Av2 array. The numbers of nonspecific and mistargeted probes on the U133A array are 29,405 (12.1{\%}) and 19,717 (8.0{\%}), respectively. The poor performance of the mistargeted probes was confirmed in two GeneChip experiments, in which these probes showed a 20-30{\%} decrease in detecting present signals compared with normal probes. Comparison of qualitative expression signals obtained from SAGE and EST data with those from GeneChip arrays showed that the consistency of the two platforms is 30{\%} lower in problematic probes than in normal probes. A Web application was developed to apply our results for improving the accuracy of expression analysis.",
keywords = "Affymetrix, Cross-hybridization, Expression, Mistargeted, mRNA, Probe",
author = "Zhang, Jinghui and Finney, {Richard P.} and Clifford, {Robert J.} and Derr, {Leslie K.} and Buetow, Kenneth",
year = "2005",
month = "3",
doi = "10.1016/j.ygeno.2004.11.004",
language = "English (US)",
volume = "85",
pages = "297--308",
journal = "Genomics",
issn = "0888-7543",
publisher = "Academic Press Inc.",
number = "3",
}

TY - JOUR

T1 - Detecting false expression signals in high-density oligonucleotide arrays by an in silico approach

AU - Zhang, Jinghui

AU - Finney, Richard P.

AU - Clifford, Robert J.

AU - Derr, Leslie K.

AU - Buetow, Kenneth

PY - 2005/3

Y1 - 2005/3

N2 - High-density oligonucleotide arrays have become a popular assay for concurrent measurement of mRNA expression at the genome scale. Much effort has been devoted to the development of statistical analysis tools aimed at reducing experimental noise and normalizing experimental variation in gene expression analysis. However, these investigations do not detect or catalog systematic problems associated with specific oligonucleotide probes. Here, we present an investigation of problematic probes that yield consistent but inaccurate signals across multiple experiments. By evaluating data integrity among gene, probe sequence, and genomic structure we identified a total of 20,696 (10.5%) nonspecific probes that could cross-hybridize to multiple genes and a total of 18,363 (9.3%) probes that miss the target transcript sequences on the Affymetrix GeneChip U95A/Av2 array. The numbers of nonspecific and mistargeted probes on the U133A array are 29,405 (12.1%) and 19,717 (8.0%), respectively. The poor performance of the mistargeted probes was confirmed in two GeneChip experiments, in which these probes showed a 20-30% decrease in detecting present signals compared with normal probes. Comparison of qualitative expression signals obtained from SAGE and EST data with those from GeneChip arrays showed that the consistency of the two platforms is 30% lower in problematic probes than in normal probes. A Web application was developed to apply our results for improving the accuracy of expression analysis.

KW - Affymetrix

KW - Cross-hybridization

KW - Expression

KW - Mistargeted

KW - mRNA

KW - Probe

UR - http://www.scopus.com/inward/record.url?scp=13844256710&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=13844256710&partnerID=8YFLogxK

U2 - 10.1016/j.ygeno.2004.11.004

DO - 10.1016/j.ygeno.2004.11.004

M3 - Article

VL - 85

SP - 297

EP - 308

JO - Genomics

JF - Genomics

SN - 0888-7543

IS - 3

ER -