TY - JOUR
T1 - Inferring property selection pressure from positional residue conservation
AU - Hoberman, Rose
AU - Klein-Seetharaman, Judith
AU - Rosenfeld, Roni
N1 - Funding Information: This research was supported by National Science Foundation Information Technology Research grant NSF0225656 and by the American Society for Engineering Education (ASEE) through a National Defense Science & Engineering Graduate (NDSEG) fellowship. We are grateful to Bill Bruno for his detailed and helpful comments and to Larry Wasserman for numerous helpful discussions.
PY - 2004
Y1 - 2004
N2 - In this study, we attempt to understand and explain positional selection pressure in terms of underlying physical and chemical properties. We propose a set of constraining assumptions about how these pressures behave, then describe a procedure for analysing and explaining the distribution of residues at a particular position in a multiple sequence alignment. In contrast to previous approaches, our model takes into account both amino acid frequencies and a large number of physical-chemical properties. By analysing each property separately, it is possible to identify positions where distinct conservation patterns are present. In addition, the model can easily incorporate sequence weights that adjust for bias in the sample sequences. Finally, a test of statistical significance is provided for our conservation measure. The applicability of this method is demonstrated on two HIV-1 proteins: Nef and Env. The tools, data and results presented in this article are available at http://flan.blm.cs.cmu.edu.
AB - In this study, we attempt to understand and explain positional selection pressure in terms of underlying physical and chemical properties. We propose a set of constraining assumptions about how these pressures behave, then describe a procedure for analysing and explaining the distribution of residues at a particular position in a multiple sequence alignment. In contrast to previous approaches, our model takes into account both amino acid frequencies and a large number of physical-chemical properties. By analysing each property separately, it is possible to identify positions where distinct conservation patterns are present. In addition, the model can easily incorporate sequence weights that adjust for bias in the sample sequences. Finally, a test of statistical significance is provided for our conservation measure. The applicability of this method is demonstrated on two HIV-1 proteins: Nef and Env. The tools, data and results presented in this article are available at http://flan.blm.cs.cmu.edu.
UR - http://www.scopus.com/inward/record.url?scp=22744431915&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=22744431915&partnerID=8YFLogxK
U2 - 10.2165/00822942-200403020-00011
DO - 10.2165/00822942-200403020-00011
M3 - Article
C2 - 15693742
AN - SCOPUS:22744431915
SN - 1175-5636
VL - 3
SP - 167
EP - 179
JO - Applied Bioinformatics
JF - Applied Bioinformatics
IS - 2-3
ER -