Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations

Sudhir Kumar, Michael P. Suleski, Glenn J. Markov, Simon Lawrence, Antonio Marco, Alan J. Filipski

Research output: Contribution to journal › Article

40 Citations (Scopus)

Abstract

As the cost of DNA sequencing drops, we are moving beyond one genome per species to one genome per individual to improve prevention, diagnosis, and treatment of disease by using personal genotypes. Computational methods are frequently applied to predict impairment of gene function by nonsynonymous mutations in individual genomes and single nucleotide polymorphisms (nSNPs) in populations. These computational tools are, however, known to fail 15%-40% of the time. We find that accurate discrimination between benign and deleterious mutations is strongly influenced by the long-term (among species) history of positions that harbor those mutations. Successful prediction of known disease-associated mutations (DAMs) is much higher for evolutionarily conserved positions and for original-mutant amino acid pairs that are rarely seen among species. Prediction accuracies for nSNPs show opposite patterns, forecasting impediments to building diagnostic tools aiming to simultaneously reduce both false-positive and false-negative errors. The relative allele frequencies of mutations diagnosed as benign and damaging are predicted by positional evolutionary rates. These allele frequencies are modulated by the relative preponderance of the mutant allele in the set of amino acids found at homologous sites in other species (evolutionarily permissible alleles [EPAs]). The nSNPs found in EPAs are biochemically less severe than those missing from EPAs across all allele frequency categories. Therefore, it is important to consider position evolutionary rates and EPAs when interpreting the consequences and population frequencies of human mutations. The impending sequencing of thousands of human and many more vertebrate genomes will lead to more accurate classifiers needed in real-world applications.
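The abstract defines EPAs as the set of amino acids found at homologous sites in other species, and refers to the relative preponderance of the mutant allele within that set. As a rough illustration only, the Python sketch below computes both quantities for a single alignment column. The function names, the convention that the human residue comes first, the gap characters, and the example column are all assumptions made for this sketch; it is not the authors' pipeline.

```python
from collections import Counter


def epa_profile(column, human_index=0, gap_chars="-X?"):
    """Relative frequency of each residue seen at one aligned position
    in the non-human species (hypothetical helper, not from the paper)."""
    others = [
        aa for i, aa in enumerate(column)
        if i != human_index and aa not in gap_chars
    ]
    counts = Counter(others)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: n / total for aa, n in counts.items()}


def classify_mutant(column, mutant, human_index=0):
    """Return (is_epa, preponderance) for a mutant residue, where
    preponderance is its share among the other species' residues."""
    profile = epa_profile(column, human_index)
    return mutant in profile, profile.get(mutant, 0.0)


# Hypothetical alignment column: human residue first, then nine other species.
column = "RRRKRKR-RQ"
print(classify_mutant(column, "K"))  # K is seen in other species
print(classify_mutant(column, "W"))  # W is never seen at this position
```

In this sketch a mutant counts as an EPA if it appears at least once among the other species; a real analysis would also need a curated multiple-species alignment and a position-specific evolutionary rate, which are outside the scope of this example.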

Original language: English (US)
Pages (from-to): 1562-1569
Number of pages: 8
Journal: Genome Research
Volume: 19
Issue number: 9
DOI: 10.1101/gr.091991.109
State: Published - Sep 2009

Fingerprint

Alleles
Amino Acids
Mutation
Gene Frequency
Genome
Population
Mutation Rate
DNA Sequence Analysis
Single Nucleotide Polymorphism
Vertebrates
Genotype
Costs and Cost Analysis
Genes

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations. / Kumar, Sudhir; Suleski, Michael P.; Markov, Glenn J.; Lawrence, Simon; Marco, Antonio; Filipski, Alan J.

In: Genome Research, Vol. 19, No. 9, 09.2009, p. 1562-1569.

Kumar, Sudhir ; Suleski, Michael P. ; Markov, Glenn J. ; Lawrence, Simon ; Marco, Antonio ; Filipski, Alan J. / Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations. In: Genome Research. 2009 ; Vol. 19, No. 9. pp. 1562-1569.
@article{abd97c2365e34864826f72088da974ea,
title = "Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations",
abstract = "As the cost of DNA sequencing drops, we are moving beyond one genome per species to one genome per individual to improve prevention, diagnosis, and treatment of disease by using personal genotypes. Computational methods are frequently applied to predict impairment of gene function by nonsynonymous mutations in individual genomes and single nucleotide polymorphisms (nSNPs) in populations. These computational tools are, however, known to fail 15{\%}-40{\%} of the time. We find that accurate discrimination between benign and deleterious mutations is strongly influenced by the long-term (among species) history of positions that harbor those mutations. Successful prediction of known disease-associated mutations (DAMs) is much higher for evolutionarily conserved positions and for original-mutant amino acid pairs that are rarely seen among species. Prediction accuracies for nSNPs show opposite patterns, forecasting impediments to building diagnostic tools aiming to simultaneously reduce both false-positive and false-negative errors. The relative allele frequencies of mutations diagnosed as benign and damaging are predicted by positional evolutionary rates. These allele frequencies are modulated by the relative preponderance of the mutant allele in the set of amino acids found at homologous sites in other species (evolutionarily permissible alleles [EPAs]). The nSNPs found in EPAs are biochemically less severe than those missing from EPAs across all allele frequency categories. Therefore, it is important to consider position evolutionary rates and EPAs when interpreting the consequences and population frequencies of human mutations. The impending sequencing of thousands of human and many more vertebrate genomes will lead to more accurate classifiers needed in real-world applications.",
author = "Sudhir Kumar and Suleski, {Michael P.} and Markov, {Glenn J.} and Simon Lawrence and Antonio Marco and Filipski, {Alan J.}",
year = "2009",
month = "9",
doi = "10.1101/gr.091991.109",
language = "English (US)",
volume = "19",
pages = "1562--1569",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "9",

}

TY - JOUR

T1 - Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations

AU - Kumar, Sudhir

AU - Suleski, Michael P.

AU - Markov, Glenn J.

AU - Lawrence, Simon

AU - Marco, Antonio

AU - Filipski, Alan J.

PY - 2009/9

Y1 - 2009/9

AB - As the cost of DNA sequencing drops, we are moving beyond one genome per species to one genome per individual to improve prevention, diagnosis, and treatment of disease by using personal genotypes. Computational methods are frequently applied to predict impairment of gene function by nonsynonymous mutations in individual genomes and single nucleotide polymorphisms (nSNPs) in populations. These computational tools are, however, known to fail 15%-40% of the time. We find that accurate discrimination between benign and deleterious mutations is strongly influenced by the long-term (among species) history of positions that harbor those mutations. Successful prediction of known disease-associated mutations (DAMs) is much higher for evolutionarily conserved positions and for original-mutant amino acid pairs that are rarely seen among species. Prediction accuracies for nSNPs show opposite patterns, forecasting impediments to building diagnostic tools aiming to simultaneously reduce both false-positive and false-negative errors. The relative allele frequencies of mutations diagnosed as benign and damaging are predicted by positional evolutionary rates. These allele frequencies are modulated by the relative preponderance of the mutant allele in the set of amino acids found at homologous sites in other species (evolutionarily permissible alleles [EPAs]). The nSNPs found in EPAs are biochemically less severe than those missing from EPAs across all allele frequency categories. Therefore, it is important to consider position evolutionary rates and EPAs when interpreting the consequences and population frequencies of human mutations. The impending sequencing of thousands of human and many more vertebrate genomes will lead to more accurate classifiers needed in real-world applications.

UR - http://www.scopus.com/inward/record.url?scp=69749093207&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=69749093207&partnerID=8YFLogxK

U2 - 10.1101/gr.091991.109

DO - 10.1101/gr.091991.109

M3 - Article

C2 - 19546171

AN - SCOPUS:69749093207

VL - 19

SP - 1562

EP - 1569

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 9

ER -