Towards integrative gene prioritization in Alzheimer's disease

Jang H. Lee, Graciela H. Gonzalez

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

4 Citations (Scopus)

Abstract

Many methods have been proposed to help uncover the genes that underlie the pathology of different diseases. Some are purely statistical and yield a largely undifferentiated set of genes that are differentially expressed (or co-expressed), while others seek to prioritize the resulting genes by comparing them against specific known targets. Most recent approaches either use a single data or knowledge source, or combine the independent predictions from each source. However, multiple kinds of heterogeneous sources are potentially relevant for gene prioritization, each subject to a different level of noise, each of varying reliability, and each bearing information not carried by the others. We therefore claim that an ideal prioritization method should discern among these sources in a truly integrative fashion that captures the subtleties of each, rather than simply combining them. Integrating multiple data types for gene prioritization is thus more challenging than working with a single data type. We propose a novel, general, and flexible formulation for multi-source data integration in gene prioritization that exploits the complementary nature of different data and knowledge sources to make the most of the information content of the aggregate data. Protein-protein interactions and Gene Ontology annotations were used as knowledge sources, together with assay-specific gene expression and genome-wide association data. To validate the proposed method, leave-one-out testing was performed using a known set of Alzheimer's disease genes. We show that the proposed method performs better than the best multi-source gene prioritization systems currently published.
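
The leave-one-out testing mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common protocol in which each known disease gene is held out in turn, the remaining known genes act as seeds, and the held-out gene is ranked against the other candidates. The abstract does not give the authors' formulation, so the scorer here (`toy_score`), the interaction map (`TOY_NEIGHBOURS`), and the gene names are hypothetical placeholders, not the published method.

```python
from typing import Callable, Dict, Set

# A scorer takes a candidate gene and the seed set and returns a relevance score.
Scorer = Callable[[str, Set[str]], float]


def leave_one_out_ranks(
    known_genes: Set[str], candidates: Set[str], score: Scorer
) -> Dict[str, int]:
    """Hold out each known gene in turn and return its 1-based rank."""
    ranks: Dict[str, int] = {}
    for held_out in sorted(known_genes):
        seeds = known_genes - {held_out}
        # The held-out gene competes against every candidate that is not a seed.
        pool = (candidates - seeds) | {held_out}
        # Sort by descending score; break ties by gene name for determinism.
        ordered = sorted(pool, key=lambda g: (-score(g, seeds), g))
        ranks[held_out] = ordered.index(held_out) + 1
    return ranks


# Hypothetical interaction map, invented purely for illustration.
TOY_NEIGHBOURS: Dict[str, Set[str]] = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2"},
    "G4": {"G5"},
    "G5": {"G4"},
}


def toy_score(gene: str, seeds: Set[str]) -> float:
    """Stand-in scorer: fraction of seed genes that neighbour the candidate."""
    if not seeds:
        return 0.0
    return len(TOY_NEIGHBOURS.get(gene, set()) & seeds) / len(seeds)


ranks = leave_one_out_ranks({"G1", "G2", "G3"}, set(TOY_NEIGHBOURS), toy_score)
print(ranks)
```

A low rank for each held-out gene suggests that the scorer recovers known disease genes from the remaining seeds. Any real evaluation would replace `toy_score` with the prioritization method under test.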

Original language: English (US)
Title of host publication: Pacific Symposium on Biocomputing 2011, PSB 2011
Pages: 4-13
Number of pages: 10
State: Published - 2011
Event: 16th Pacific Symposium on Biocomputing, PSB 2011 - Kohala Coast, HI, United States
Duration: Jan 3 2011 to Jan 7 2011

Other

Other: 16th Pacific Symposium on Biocomputing, PSB 2011
Country: United States
City: Kohala Coast, HI
Period: 1/3/11 to 1/7/11

Fingerprint

Alzheimer Disease
Genes
Information Storage and Retrieval
Molecular Sequence Annotation
Bearings (structural)
Gene Ontology
Proteins
Data integration
Noise
Pathology
Gene expression
Genome
Ontology
Assays
Gene Expression
Association reactions
Testing

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Biomedical Engineering
  • Medicine (all)

Cite this

Lee, J. H., & Gonzalez, G. H. (2011). Towards integrative gene prioritization in Alzheimer's disease. In Pacific Symposium on Biocomputing 2011, PSB 2011 (pp. 4-13)

@inproceedings{a095e95f8cf7472fb6a350a30978e2ae,
title = "Towards integrative gene prioritization in Alzheimer's disease",
author = "Lee, {Jang H.} and Gonzalez, {Graciela H.}",
year = "2011",
language = "English (US)",
isbn = "9814335053",
pages = "4--13",
booktitle = "Pacific Symposium on Biocomputing 2011, PSB 2011",

}

TY - GEN

T1 - Towards integrative gene prioritization in Alzheimer's disease

AU - Lee, Jang H.

AU - Gonzalez, Graciela H.

PY - 2011

Y1 - 2011

N2 - Many methods have been proposed for facilitating the uncovering of genes that underlie the pathology of different diseases. Some are purely statistical, resulting in a (mostly) undifferentiated set of genes that are differentially expressed (or co-expressed), while others seek to prioritize the resulting set of genes through comparison against specific known targets. Most of the recent approaches use either single data or knowledge sources, or combine the independent predictions from each source. However, given that multiple kinds of heterogeneous sources are potentially relevant for gene prioritization, each subject to different levels of noise and of varying reliability, each source bearing information not carried by another, we claim that an ideal prioritization method should provide ways to discern amongst them in a true integrative fashion that captures the subtleties of each, rather than using a simple combination of sources. Integration of multiple data for gene prioritization is thus more challenging than its single data type counterpart. What we propose is a novel, general, and flexible formulation that enables multi-source data integration for gene prioritization that maximizes the complementary nature of different data and knowledge sources in order to make the most use of the information content of aggregate data. Protein-protein interactions and Gene Ontology annotations were used as knowledge sources, together with assay-specific gene expression and genome-wide association data. Leave-one-out testing was performed using a known set of Alzheimer's Disease genes to validate our proposed method. We show that our proposed method performs better than the best multi-source gene prioritization systems currently published.

UR - http://www.scopus.com/inward/record.url?scp=84863142782&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863142782&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9814335053

SN - 9789814335058

SP - 4

EP - 13

BT - Pacific Symposium on Biocomputing 2011, PSB 2011

ER -