Multiple structure alignment and consensus identification for proteins

Ivaylo Ilinkin; Jieping Ye; Ravi Janardan

doi:10.1186/1471-2105-11-71

Multiple structure alignment and consensus identification for proteins

Ivaylo Ilinkin, Jieping Ye, Ravi Janardan

Computing and Augmented Intelligence, School of (IAFSE-SCAI)

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Background: An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein which captures common substructures present in the given proteins. The algorithm represents each protein as a sequence of triples of coordinates of the alpha-carbon atoms along the backbone. It then computes iteratively a sequence of transformation matrices (i.e., translations and rotations) to align the proteins in space and generate the consensus. The algorithm is a heuristic in that it computes an approximation to the optimal alignment that minimizes the sum of the pairwise distances between the consensus and the transformed proteins.Results: Experimental results show that the algorithm converges quite rapidly and generates consensus structures that are visually similar to the input proteins. A comparison with other coordinate-based alignment algorithms (MAMMOTH and MATT) shows that the proposed algorithm is competitive in terms of speed and the sizes of the conserved regions discovered in an extensive benchmark dataset derived from the HOMSTRAD and SABmark databases.The algorithm has been implemented in C++ and can be downloaded from the project's web page. Alternatively, the algorithm can be used via a web server which makes it possible to align protein structures by uploading files from local disk or by downloading protein data from the RCSB Protein Data Bank.Conclusions: An algorithm is presented to compute a multiple structure alignment for a set of proteins, together with their consensus structure. Experimental results show its effectiveness in terms of the quality of the alignment and computational cost.

Original language	English (US)
Article number	71
Journal	BMC bioinformatics
Volume	11
DOIs	https://doi.org/10.1186/1471-2105-11-71
State	Published - Feb 2 2010

ASJC Scopus subject areas

Structural Biology
Biochemistry
Molecular Biology
Computer Science Applications
Applied Mathematics

Access to Document

10.1186/1471-2105-11-71

Cite this

@article{4d05e774a84d404ca591a1db87b03ebf,

title = "Multiple structure alignment and consensus identification for proteins",

abstract = "Background: An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein which captures common substructures present in the given proteins. The algorithm represents each protein as a sequence of triples of coordinates of the alpha-carbon atoms along the backbone. It then computes iteratively a sequence of transformation matrices (i.e., translations and rotations) to align the proteins in space and generate the consensus. The algorithm is a heuristic in that it computes an approximation to the optimal alignment that minimizes the sum of the pairwise distances between the consensus and the transformed proteins.Results: Experimental results show that the algorithm converges quite rapidly and generates consensus structures that are visually similar to the input proteins. A comparison with other coordinate-based alignment algorithms (MAMMOTH and MATT) shows that the proposed algorithm is competitive in terms of speed and the sizes of the conserved regions discovered in an extensive benchmark dataset derived from the HOMSTRAD and SABmark databases.The algorithm has been implemented in C++ and can be downloaded from the project's web page. Alternatively, the algorithm can be used via a web server which makes it possible to align protein structures by uploading files from local disk or by downloading protein data from the RCSB Protein Data Bank.Conclusions: An algorithm is presented to compute a multiple structure alignment for a set of proteins, together with their consensus structure. Experimental results show its effectiveness in terms of the quality of the alignment and computational cost.",

author = "Ivaylo Ilinkin and Jieping Ye and Ravi Janardan",

note = "Funding Information: Research of JY and RJ was sponsored, in part, by the Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement number DAAD19-01-2-0014, the content of which does not necessarily reflect the position or the policy of the government, and no official endorsement should be inferred. Adam Isom contributed to the implementation of the web-based tool. The authors thank the MAMMOTH and MATT teams for sharing their source code.",

year = "2010",

month = feb,

day = "2",

doi = "10.1186/1471-2105-11-71",

language = "English (US)",

volume = "11",

journal = "BMC bioinformatics",

issn = "1471-2105",

publisher = "BioMed Central",

}

TY - JOUR

T1 - Multiple structure alignment and consensus identification for proteins

AU - Ilinkin, Ivaylo

AU - Ye, Jieping

AU - Janardan, Ravi

N1 - Funding Information: Research of JY and RJ was sponsored, in part, by the Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement number DAAD19-01-2-0014, the content of which does not necessarily reflect the position or the policy of the government, and no official endorsement should be inferred. Adam Isom contributed to the implementation of the web-based tool. The authors thank the MAMMOTH and MATT teams for sharing their source code.

PY - 2010/2/2

Y1 - 2010/2/2

N2 - Background: An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein which captures common substructures present in the given proteins. The algorithm represents each protein as a sequence of triples of coordinates of the alpha-carbon atoms along the backbone. It then computes iteratively a sequence of transformation matrices (i.e., translations and rotations) to align the proteins in space and generate the consensus. The algorithm is a heuristic in that it computes an approximation to the optimal alignment that minimizes the sum of the pairwise distances between the consensus and the transformed proteins.Results: Experimental results show that the algorithm converges quite rapidly and generates consensus structures that are visually similar to the input proteins. A comparison with other coordinate-based alignment algorithms (MAMMOTH and MATT) shows that the proposed algorithm is competitive in terms of speed and the sizes of the conserved regions discovered in an extensive benchmark dataset derived from the HOMSTRAD and SABmark databases.The algorithm has been implemented in C++ and can be downloaded from the project's web page. Alternatively, the algorithm can be used via a web server which makes it possible to align protein structures by uploading files from local disk or by downloading protein data from the RCSB Protein Data Bank.Conclusions: An algorithm is presented to compute a multiple structure alignment for a set of proteins, together with their consensus structure. Experimental results show its effectiveness in terms of the quality of the alignment and computational cost.

AB - Background: An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein which captures common substructures present in the given proteins. The algorithm represents each protein as a sequence of triples of coordinates of the alpha-carbon atoms along the backbone. It then computes iteratively a sequence of transformation matrices (i.e., translations and rotations) to align the proteins in space and generate the consensus. The algorithm is a heuristic in that it computes an approximation to the optimal alignment that minimizes the sum of the pairwise distances between the consensus and the transformed proteins.Results: Experimental results show that the algorithm converges quite rapidly and generates consensus structures that are visually similar to the input proteins. A comparison with other coordinate-based alignment algorithms (MAMMOTH and MATT) shows that the proposed algorithm is competitive in terms of speed and the sizes of the conserved regions discovered in an extensive benchmark dataset derived from the HOMSTRAD and SABmark databases.The algorithm has been implemented in C++ and can be downloaded from the project's web page. Alternatively, the algorithm can be used via a web server which makes it possible to align protein structures by uploading files from local disk or by downloading protein data from the RCSB Protein Data Bank.Conclusions: An algorithm is presented to compute a multiple structure alignment for a set of proteins, together with their consensus structure. Experimental results show its effectiveness in terms of the quality of the alignment and computational cost.

UR - http://www.scopus.com/inward/record.url?scp=77649142047&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77649142047&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-11-71

DO - 10.1186/1471-2105-11-71

M3 - Article

C2 - 20122279

AN - SCOPUS:77649142047

SN - 1471-2105

VL - 11

JO - BMC bioinformatics

JF - BMC bioinformatics

M1 - 71

ER -

Multiple structure alignment and consensus identification for proteins

Abstract

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this