Abstract
Identifying logic relationships between proteins is essential for understanding their function within cells. Previous studies have inferred protein logic relationships using pairwise and triplet logic analysis of phylogenetic profiles. Other computational methods use pairwise analysis of Rosetta Stone data to infer protein functional linkages. (Proteins that share the same metabolic pathway or a common structural complex are said to be functionally linked.) This paper describes a Bayesian modeling framework that combines phylogenetic profile data, via a likelihood, with Rosetta Stone data, via a prior probability. Based on the proposed framework, a general method is developed for jointly learning high-order logic relationships among proteins whose presence or absence can be identified by logic functions. The method is applied to analyze protein triplets and quartets on phylogenetic profile and Rosetta Stone data sets with 140 clusters of orthologous genes (COGs). The biological meaning of the top 30 significant triplets is further verified using the KEGG and NCBI databases. Over 50% of the discovered relationships with high significance scores could not be inferred using phylogenetic profile or Rosetta Stone data alone. The statistical analysis in this paper shows that all significant quartets have p-values ≤ 5.71E-04. Many of them assign putative functional roles to uncharacterized proteins.
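As a rough illustration of the idea of scoring a logic relationship with a likelihood and a prior, the following minimal Python sketch ranks a few two-input logic functions for a protein triplet. It is not the paper's method: the profiles, the noise model with error rate `eps`, the set of functions in `LOGIC`, and the `prior` weights (a hypothetical stand-in for Rosetta Stone evidence) are all assumptions made for this example, since the abstract does not specify them.

```python
import math

# Hypothetical presence/absence profiles across 8 genomes (1 = present).
profiles = {
    "A": [1, 1, 0, 0, 1, 0, 1, 0],
    "B": [1, 0, 1, 0, 1, 1, 0, 0],
    "C": [1, 0, 0, 0, 1, 0, 0, 0],
}

# A small, assumed set of two-input logic functions.
LOGIC = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def log_likelihood(a, b, c, f, eps=0.05):
    """Toy noise model (assumed): in each genome, c matches f(a, b)
    with probability 1 - eps and mismatches with probability eps."""
    ll = 0.0
    for x, y, z in zip(a, b, c):
        ll += math.log(1 - eps) if f(x, y) == z else math.log(eps)
    return ll


def posterior(a, b, c, prior):
    """Normalized posterior over the logic functions: likelihood times prior."""
    scores = {
        name: math.exp(log_likelihood(a, b, c, f)) * prior[name]
        for name, f in LOGIC.items()
    }
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}


# Uniform prior here; in a real analysis the prior would encode external evidence.
prior = {"AND": 1 / 3, "OR": 1 / 3, "XOR": 1 / 3}
post = posterior(profiles["A"], profiles["B"], profiles["C"], prior)
print(max(post, key=post.get))
```

With these made-up profiles, C is present exactly where both A and B are present, so the AND relationship receives the highest posterior weight. Changing `prior` shows how external evidence could shift the ranking when the profiles alone are ambiguous.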
Original language | English (US) |
---|---|
Pages (from-to) | 2427-2435 |
Number of pages | 9 |
Journal | IEEE Transactions on Signal Processing |
Volume | 54 |
Issue number | 6 II |
State | Published - Jun 2006 |
Keywords
- Phylogenetic profiles
- Protein logic relationships
- Rosetta Stone method
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering