Integrating domain knowledge with statistical and data mining methods for high-density genomic SNP disease association analysis

Valentin Dinu, Hongyu Zhao, Perry L. Miller

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Genome-wide association studies can help identify multi-gene contributions to disease. As the number of high-density genomic markers tested increases, however, so does the number of loci associated with disease by chance. Performing a brute-force test for the interaction of four or more high-density genomic loci is infeasible given current computational limitations, so heuristics must be employed to limit the number of statistical tests performed. In this paper we explore the use of biological domain knowledge to supplement statistical analysis and data mining methods to identify genes and pathways associated with disease. We describe Pathway/SNP, a software application designed to help evaluate the association between pathways and disease. Pathway/SNP integrates domain knowledge (SNP, gene and pathway annotation from multiple sources) with statistical and data mining algorithms into a tool that can be used to explore the etiology of complex diseases.
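The scale of the brute-force problem mentioned in the abstract can be illustrated with a short calculation. The sketch below counts how many distinct marker subsets an exhaustive interaction test would have to examine at each interaction order. The marker count of 500,000 is an assumption chosen only for illustration, and the function name `interaction_test_count` is hypothetical; neither comes from the paper.

```python
from math import comb


def interaction_test_count(n_markers: int, order: int) -> int:
    """Return the number of distinct marker subsets of size `order`.

    An exhaustive interaction scan would need one test per subset.
    """
    return comb(n_markers, order)


# Assumed marker count for illustration only (not taken from the paper).
N_MARKERS = 500_000

if __name__ == "__main__":
    for order in range(1, 5):
        count = interaction_test_count(N_MARKERS, order)
        # Scientific notation keeps the rapidly growing counts readable.
        print(f"order {order}: {count:.3e} tests")
```

Under this assumption the count grows by several orders of magnitude with each added locus, which is why the number of tests has to be limited by heuristics, such as restricting tests to markers that share a pathway, rather than enumerated exhaustively.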

Original language: English (US)
Pages (from-to): 750-760
Number of pages: 11
Journal: Journal of Biomedical Informatics
Volume: 40
Issue number: 6
State: Published - Dec 2007

Keywords

  • Data integration
  • Data mining
  • False discovery rate (FDR)
  • Genome-wide association (GWA)
  • Pathway-based disease association
  • Single nucleotide polymorphisms (SNP)

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics
