Discriminative learning for protein conformation sampling

Feng Zhao, Shuaicheng Li, Beckett Sterner, Jinbo Xu

Research output: Contribution to journal, Article

28 Citations (Scopus)

Abstract

Protein structure prediction without templates (i.e., ab initio folding) is one of the most challenging problems in structural biology, and conformation sampling is a major bottleneck of ab initio folding. This article presents CRFSampler, an extensible protein conformation sampler built on Conditional Random Fields (CRFs), a class of probabilistic graphical models. Using a discriminative learning method, CRFSampler automatically learns more than ten thousand parameters quantifying the relationship among primary sequence, secondary structure, and (pseudo) backbone angles. Using only compactness and self-avoiding constraints, CRFSampler can efficiently generate protein-like conformations from primary sequence and predicted secondary structure. CRFSampler is also flexible: a variety of model topologies and feature sets can be defined to model the sequence-structure relationship without worrying about parameter estimation. Our experimental results demonstrate that, using a simple set of features, CRFSampler can generate decoys of much higher quality than the most recent hidden Markov model (HMM).
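
The sampling idea in the abstract can be illustrated with a minimal sketch of drawing a label sequence from a linear-chain CRF by forward filtering and backward sampling. This is not the authors' implementation. It assumes, purely for illustration, that backbone angles are discretized into a small number of states, that `node_scores[i][s]` already holds the summed weighted features linking the sequence and secondary structure at position i to state s, and that `edge_scores[t][s]` scores a transition from state t to state s. The names `node_scores`, `edge_scores`, and `sample_labels` are hypothetical.

```python
import math
import random


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sample_categorical(log_weights, rng):
    """Draw index i with probability proportional to exp(log_weights[i])."""
    total = logsumexp(log_weights)
    r = rng.random()
    acc = 0.0
    for i, lw in enumerate(log_weights):
        acc += math.exp(lw - total)
        if r < acc:
            return i
    return len(log_weights) - 1  # guard against floating-point round-off


def sample_labels(node_scores, edge_scores, rng):
    """Sample one state sequence from a linear-chain CRF.

    The model assumed here is
    P(y | x) proportional to exp(sum_i node[i][y_i] + sum_i edge[y_{i-1}][y_i]).
    """
    n = len(node_scores)
    k = len(edge_scores)
    # Forward pass: alpha[i][s] is the log of the total weight of all
    # prefixes that end in state s at position i.
    alpha = [list(node_scores[0])]
    for i in range(1, n):
        prev = alpha[-1]
        alpha.append([
            node_scores[i][s]
            + logsumexp([prev[t] + edge_scores[t][s] for t in range(k)])
            for s in range(k)
        ])
    # Backward pass: sample the last state, then each earlier state
    # conditioned on the state already chosen to its right.
    labels = [0] * n
    labels[n - 1] = sample_categorical(alpha[n - 1], rng)
    for i in range(n - 2, -1, -1):
        nxt = labels[i + 1]
        weights = [alpha[i][s] + edge_scores[s][nxt] for s in range(k)]
        labels[i] = sample_categorical(weights, rng)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy scores: 5 positions, 3 hypothetical angle states.
    nodes = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
    edges = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
    print(sample_labels(nodes, edges, rng))
```

Each call returns one independent draw from the model's distribution over state sequences. In a sampler of the kind the abstract describes, such draws would then be turned into three-dimensional conformations and filtered by constraints such as compactness and self-avoidance, which this sketch does not attempt.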

Original language: English (US)
Pages (from-to): 228-240
Number of pages: 13
Journal: Proteins: Structure, Function and Bioinformatics
Volume: 73
Issue number: 1
DOI: 10.1002/prot.22057
State: Published - Oct 2008
Externally published: Yes

Fingerprint

Protein Conformation
Conformations
Learning
Sampling
Statistical Models
Proteins
Parameter estimation
Topology

Keywords

  • Conditional random fields (CRFs)
  • Discriminative learning
  • Protein conformation sampling

ASJC Scopus subject areas

  • Genetics
  • Structural Biology
  • Biochemistry

Cite this

Discriminative learning for protein conformation sampling. / Zhao, Feng; Li, Shuaicheng; Sterner, Beckett; Xu, Jinbo.

In: Proteins: Structure, Function and Bioinformatics, Vol. 73, No. 1, 10.2008, p. 228-240.

Zhao, Feng ; Li, Shuaicheng ; Sterner, Beckett ; Xu, Jinbo. / Discriminative learning for protein conformation sampling. In: Proteins: Structure, Function and Bioinformatics. 2008 ; Vol. 73, No. 1. pp. 228-240.
@article{43b47cbeab3f4d85a394861c1c37bf56,
title = "Discriminative learning for protein conformation sampling",
keywords = "Conditional random fields (CRFs), Discriminative learning, Protein conformation sampling",
author = "Feng Zhao and Shuaicheng Li and Beckett Sterner and Jinbo Xu",
year = "2008",
month = "10",
doi = "10.1002/prot.22057",
language = "English (US)",
volume = "73",
pages = "228--240",
journal = "Proteins: Structure, Function and Bioinformatics",
issn = "0887-3585",
publisher = "Wiley-Liss Inc.",
number = "1",

}

TY - JOUR

T1 - Discriminative learning for protein conformation sampling

AU - Zhao, Feng

AU - Li, Shuaicheng

AU - Sterner, Beckett

AU - Xu, Jinbo

PY - 2008/10

Y1 - 2008/10

KW - Conditional random fields (CRFs)

KW - Discriminative learning

KW - Protein conformation sampling

UR - http://www.scopus.com/inward/record.url?scp=50849085804&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=50849085804&partnerID=8YFLogxK

U2 - 10.1002/prot.22057

DO - 10.1002/prot.22057

M3 - Article

C2 - 18412258

AN - SCOPUS:50849085804

VL - 73

SP - 228

EP - 240

JO - Proteins: Structure, Function and Bioinformatics

JF - Proteins: Structure, Function and Bioinformatics

SN - 0887-3585

IS - 1

ER -