Abstract
Protein structure prediction without templates (i.e., ab initio folding) is one of the most challenging problems in structural biology. In particular, conformation sampling is a major bottleneck in ab initio folding. This article presents CRFSampler, an extensible protein conformation sampler built on conditional random fields (CRFs), a probabilistic graphical model. Using a discriminative learning method, CRFSampler can automatically learn more than ten thousand parameters quantifying the relationship among primary sequence, secondary structure, and (pseudo) backbone angles. Using only compactness and self-avoiding constraints, CRFSampler can efficiently generate protein-like conformations from primary sequence and predicted secondary structure. CRFSampler is also flexible: a variety of model topologies and feature sets can be defined to model the sequence-structure relationship without worrying about parameter estimation. Our experimental results demonstrate that, using a simple set of features, CRFSampler can generate decoys of much higher quality than those generated by the most recent HMM-based model.
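To make the idea of sampling from a chain-structured graphical model concrete, the following is a minimal sketch of forward filtering with backward sampling on a toy linear-chain model. It is not the authors' implementation: the function name `sample_chain`, the potential tables `node_pot` and `edge_pot`, and the toy values are all hypothetical. In a setting like the one the abstract describes, the label indices would stand in for discretized backbone-angle states, and the potentials would come from learned parameters rather than hand-written numbers.

```python
import random

def sample_chain(node_pot, edge_pot, rng):
    """Draw one label sequence from a toy linear-chain model.

    node_pot[t][k]: non-negative score of label k at position t (hypothetical).
    edge_pot[j][k]: non-negative score of label j followed by label k (hypothetical).
    Assumes at least one label sequence has a positive total score.
    """
    n, m = len(node_pot), len(edge_pot)
    # Forward pass: alpha[t][k] is proportional to the summed score of
    # all prefixes that end with label k at position t.
    alpha = [list(node_pot[0])]
    for t in range(1, n):
        prev = alpha[-1]
        row = [node_pot[t][k] * sum(prev[j] * edge_pot[j][k] for j in range(m))
               for k in range(m)]
        total = sum(row)
        alpha.append([v / total for v in row])  # rescale to avoid underflow
    # Backward pass: sample the last label, then each earlier label
    # conditioned on the label already chosen for its successor.
    labels = [0] * n
    labels[-1] = rng.choices(range(m), weights=alpha[-1])[0]
    for t in range(n - 2, -1, -1):
        nxt = labels[t + 1]
        weights = [alpha[t][j] * edge_pot[j][nxt] for j in range(m)]
        labels[t] = rng.choices(range(m), weights=weights)[0]
    return labels

# Toy usage: three labels, five positions, made-up potentials.
toy_node = [[1.0, 2.0, 0.5] for _ in range(5)]
toy_edge = [[2.0, 1.0, 0.5], [1.0, 2.0, 1.0], [0.5, 1.0, 2.0]]
toy_sample = sample_chain(toy_node, toy_edge, random.Random(0))
```

Each call returns one sequence drawn in proportion to its total score, so repeated calls yield a diverse set of candidates. A full sampler would additionally need to map the sampled states to three-dimensional coordinates and filter them with constraints such as compactness and self-avoidance, which this sketch does not attempt.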
Original language | English (US) |
---|---|
Pages (from-to) | 228-240 |
Number of pages | 13 |
Journal | Proteins: Structure, Function and Genetics |
Volume | 73 |
Issue number | 1 |
DOIs | |
State | Published - Oct 2008 |
Externally published | Yes |
Keywords
- Conditional random fields (CRFs)
- Discriminative learning
- Protein conformation sampling
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology