Joint learning of logic relationships for studying protein function using phylogenetic profiles and the Rosetta Stone method

Xin Zhang, Seungchan Kim, Tie Wang, Chitta Baral

Research output: Contribution to journal › Article

10 Scopus citations


Identifying logic relationships between proteins is essential for understanding their function within cells. Previous studies have inferred protein logic relationships using pairwise and triplet logic analysis of phylogenetic profiles. Other computational methods use pairwise analysis of Rosetta Stone data to infer protein functional linkages. (Proteins that share the same metabolic pathway or a common structural complex are said to be functionally linked.) This paper describes a Bayesian modeling framework that combines phylogenetic profile data, through a likelihood, with Rosetta Stone data, through a prior probability. Based on the proposed framework, a general method is developed for jointly learning high-order logic relationships among proteins whose presence or absence can be identified by logic functions. The method is applied to analyze protein triplets and quartets on phylogenetic profile and Rosetta Stone data sets with 140 clusters of orthologous genes (COGs). The biological meaning of the top 30 significant triplets is further verified using the KEGG and NCBI databases. Over 50% of the discovered relationships associated with high significance scores could not be inferred using phylogenetic profile or Rosetta Stone data alone. The statistical analysis in this paper shows that all significant quartets have p-values ≤ 5.71E-04. Many of them assign putative functional roles to uncharacterized proteins.
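The idea of scoring a logic relationship by combining a profile-based likelihood with a prior can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary phylogenetic profiles (1 = protein present in a genome), an independent per-genome error rate `eps`, and a hypothetical `log_prior` value standing in for evidence derived from Rosetta Stone data. All function and parameter names are illustrative.

```python
from itertools import product
import math

# All 16 Boolean functions of two inputs, each stored as a truth table
# mapping (a, b) -> predicted presence of the third protein.
INPUTS = list(product([0, 1], repeat=2))
LOGIC_FUNCTIONS = [dict(zip(INPUTS, outputs))
                   for outputs in product([0, 1], repeat=4)]


def log_likelihood(a, b, c, table, eps=0.05):
    """Log-probability of profile c under c = f(a, b) with flip noise eps.

    The independent-noise model and the value of eps are assumptions
    made for this sketch only.
    """
    total = 0.0
    for ai, bi, ci in zip(a, b, c):
        total += math.log(1 - eps if table[(ai, bi)] == ci else eps)
    return total


def best_logic_function(a, b, c, log_prior=0.0, eps=0.05):
    """Return (unnormalized log posterior, truth table) of the best fit.

    log_prior is a hypothetical stand-in for a prior on the triplet,
    for example one derived from Rosetta Stone evidence.
    """
    scored = [(log_likelihood(a, b, c, table, eps) + log_prior, table)
              for table in LOGIC_FUNCTIONS]
    return max(scored, key=lambda item: item[0])


# Toy profiles over eight genomes (made-up data): c is present exactly
# when both a and b are present.
a = [1, 1, 0, 0, 1, 0, 1, 0]
b = [1, 0, 1, 0, 1, 1, 0, 0]
c = [1, 0, 0, 0, 1, 0, 0, 0]

score, table = best_logic_function(a, b, c, log_prior=math.log(0.5))
```

On these toy profiles the best-scoring truth table is the AND function. In this sketch the prior shifts the score of a triplet as a whole, so it matters when ranking different triplets against each other rather than when choosing a function within one triplet.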

Original language: English (US)
Pages (from-to): 2427-2435
Number of pages: 9
Journal: IEEE Transactions on Signal Processing
Issue number: 6 II
Publication status: Published - Jun 2006



  • Phylogenetic profiles
  • Protein logic relationships
  • Rosetta Stone method

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
