A new formulation of protein evolutionary models that account for structural constraints

Andrew J. Bordner, Hans Mittelmann

Research output: Contribution to journalArticlepeer-review

18 Scopus citations

Abstract

Despite the importance of a thermodynamically stable structure with a conserved fold for protein function, almost all evolutionary models neglect site-site correlations that arise from physical interactions between neighboring amino acid sites. This is mainly due to the difficulty in formulating a computationally tractable model since rate matrices can no longer be used. Here, we introduce a general framework, based on factor graphs, for constructing probabilistic models of protein evolution with site interdependence. Conveniently, efficient approximate inference algorithms, such as Belief Propagation, can be used to calculate likelihoods for these models. We fit an amino acid substitution model of this type that accounts for both solvent accessibility and site-site correlations. Comparisons of the new model with rate matrix models and alternative structure-dependent models demonstrate that it better fits the sequence data. We also examine evolution within a family of homohexameric enzymes and find that site-site correlations between most contacting subunits contribute to a higher likelihood. In addition, we show that the new substitution model has a similar mathematical form to the one introduced in Rodrigue et al. (Rodrigue N, Lartillot N, Bryant D, Philippe H. 2005. Site interdependence attributed to tertiary structure in amino acid sequence evolution. Gene 347:207-217), although with different parameter interpretations and values. We also perform a statistical analysis of the effects of amino acids at neighboring sites on substitution probabilities and find a significant perturbation of most probabilities, further supporting the significant role of site-site interactions in protein evolution and motivating the development of new evolutionary models similar to the one described here. Finally, we discuss possible extensions and applications of the new substitution model.

Original languageEnglish (US)
Pages (from-to)736-749
Number of pages14
JournalMolecular biology and evolution
Volume31
Issue number3
DOIs
StatePublished - Mar 2014

Keywords

  • approximate inference algorithms
  • factor graphs
  • protein evolution
  • protein structure
  • site-site correlations
  • substitution models

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics

Fingerprint

Dive into the research topics of 'A new formulation of protein evolutionary models that account for structural constraints'. Together they form a unique fingerprint.

Cite this