Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level

Yoshiyuki Suzuki, Takashi Gojobori, Sudhir Kumar

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

In detecting natural selection operating at the amino acid sequence level by comparing the rates of synonymous (rS) and nonsynonymous (rN) substitutions, the rates of synonymous and nonsynonymous mutations are assumed to be approximately the same. In reality, however, these rates may not be the same if different proportions of synonymous and nonsynonymous sites overlap with CpG dinucleotides, which are known to be hypermutable in some organisms. Here, we develop the evolutionary pathway methods for comparing rS and rN at multiple codon sites (all-sites analysis) and at single codon sites (single-site analysis) that take into account the hypermutability at CpG dinucleotides in estimating the number of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN). Computer simulations show that the direction and magnitude of the bias in the estimation of dN/dS caused by the hypermutability of CpGs are determined by both the number of CpGs and the relative proportions of synonymous and nonsynonymous sites overlapping with CpGs. This bias is greatly reduced when using the methods we propose to account for the hypermutability of CpG dinucleotides. In an all-sites analysis of protamine 1 genes from primates, dN/dS > 1 was observed for many pairs if the hypermutability was ignored. However, dN/dS becomes ≤1 for most of these pairs when the CpG sites are assumed to be hypermutable. Therefore, statistical indications of positive selection in some sequences or individual codons may be caused by mutation rate differences in synonymous and nonsynonymous sites.
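The abstract's central point, that synonymous and nonsynonymous sites may overlap with CpG dinucleotides in different proportions, can be illustrated with a small Python sketch. This is not the evolutionary pathway method of the paper. It is a simplified site count in the style of Nei and Gojobori, assuming the standard genetic code and treating changes to stop codons as nonsynonymous. The function name `cpg_site_overlap` and the returned keys are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def cpg_site_overlap(seq):
    """Return the fraction of synonymous and nonsynonymous sites in CpGs.

    Each nucleotide contributes s synonymous sites and 1 - s nonsynonymous
    sites, where s is the fraction of its three possible single-base changes
    that leave the encoded amino acid unchanged (assumption: changes to stop
    codons count as nonsynonymous). Raises KeyError on non-ACGT characters.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")

    total = {"syn": 0.0, "non": 0.0}
    in_cpg = {"syn": 0.0, "non": 0.0}

    for i in range(len(seq)):
        start = i - i % 3
        codon = seq[start:start + 3]
        pos = i % 3
        aa = CODON_TABLE[codon]

        # Fraction of the three alternative bases giving a synonymous codon.
        syn = sum(
            CODON_TABLE[codon[:pos] + b + codon[pos + 1:]] == aa
            for b in BASES
            if b != codon[pos]
        ) / 3.0
        non = 1.0 - syn

        # A nucleotide overlaps a CpG if it is the C or the G of a "CG" pair,
        # including pairs that span a codon boundary.
        overlaps = seq[i:i + 2] == "CG" or (i > 0 and seq[i - 1:i + 1] == "CG")

        total["syn"] += syn
        total["non"] += non
        if overlaps:
            in_cpg["syn"] += syn
            in_cpg["non"] += non

    # Guard against sequences with no synonymous (or nonsynonymous) sites.
    return {k: (in_cpg[k] / total[k] if total[k] else 0.0) for k in total}
```

For the toy sequence `ACGTTT`, this gives 0.75 of the synonymous sites and about 0.21 of the nonsynonymous sites overlapping a CpG. An imbalance of this kind is what could make the synonymous mutation rate exceed the nonsynonymous one when CpGs are hypermutable.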

Original language: English (US)
Pages (from-to): 2275-2284
Number of pages: 10
Journal: Molecular Biology and Evolution
Volume: 26
Issue number: 10
DOI: 10.1093/molbev/msp133
State: Published - Oct 2009

Keywords

  • CpG dinucleotide
  • Hypermutability
  • Natural selection
  • Nonsynonymous substitution
  • Synonymous substitution

ASJC Scopus subject areas

  • Genetics
  • Molecular Biology
  • Ecology, Evolution, Behavior and Systematics

Cite this

Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level. / Suzuki, Yoshiyuki; Gojobori, Takashi; Kumar, Sudhir.

In: Molecular Biology and Evolution, Vol. 26, No. 10, 10.2009, p. 2275-2284.

@article{2e142ec62c984de3b84680c1af0b10d0,
title = "Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level",
abstract = "In detecting natural selection operating at the amino acid sequence level by comparing the rates of synonymous (rS) and nonsynonymous (rN) substitutions, the rates of synonymous and nonsynonymous mutations are assumed to be approximately the same. In reality, however, these rates may not be the same if different proportions of synonymous and nonsynonymous sites overlap with CpG dinucleotides, which are known to be hypermutable in some organisms. Here, we develop the evolutionary pathway methods for comparing rS and rN at multiple codon sites (all-sites analysis) and at single codon sites (single-site analysis) that take into account the hypermutability at CpG dinucleotides in estimating the number of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN). Computer simulations show that the direction and magnitude of the bias in the estimation of dN/dS caused by the hypermutability of CpGs are determined by both the number of CpGs and the relative proportions of synonymous and nonsynonymous sites overlapping with CpGs. This bias is greatly reduced when using the methods we propose to account for the hypermutability of CpG dinucleotides. In an all-sites analysis of protamine 1 genes from primates, dN/dS > 1 was observed for many pairs if the hypermutability was ignored. However, dN/dS becomes ≤1 for most of these pairs when the CpG sites are assumed to be hypermutable. Therefore, statistical indications of positive selection in some sequences or individual codons may be caused by mutation rate differences in synonymous and nonsynonymous sites.",
keywords = "CpG dinucleotide, Hypermutability, Natural selection, Nonsynonymous substitution, Synonymous substitution",
author = "Yoshiyuki Suzuki and Takashi Gojobori and Sudhir Kumar",
year = "2009",
month = "10",
doi = "10.1093/molbev/msp133",
language = "English (US)",
volume = "26",
pages = "2275--2284",
journal = "Molecular Biology and Evolution",
issn = "0737-4038",
publisher = "Oxford University Press",
number = "10"
}

TY - JOUR
T1 - Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level
AU - Suzuki, Yoshiyuki
AU - Gojobori, Takashi
AU - Kumar, Sudhir
PY - 2009/10
Y1 - 2009/10
AB - In detecting natural selection operating at the amino acid sequence level by comparing the rates of synonymous (rS) and nonsynonymous (rN) substitutions, the rates of synonymous and nonsynonymous mutations are assumed to be approximately the same. In reality, however, these rates may not be the same if different proportions of synonymous and nonsynonymous sites overlap with CpG dinucleotides, which are known to be hypermutable in some organisms. Here, we develop the evolutionary pathway methods for comparing rS and rN at multiple codon sites (all-sites analysis) and at single codon sites (single-site analysis) that take into account the hypermutability at CpG dinucleotides in estimating the number of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN). Computer simulations show that the direction and magnitude of the bias in the estimation of dN/dS caused by the hypermutability of CpGs are determined by both the number of CpGs and the relative proportions of synonymous and nonsynonymous sites overlapping with CpGs. This bias is greatly reduced when using the methods we propose to account for the hypermutability of CpG dinucleotides. In an all-sites analysis of protamine 1 genes from primates, dN/dS > 1 was observed for many pairs if the hypermutability was ignored. However, dN/dS becomes ≤1 for most of these pairs when the CpG sites are assumed to be hypermutable. Therefore, statistical indications of positive selection in some sequences or individual codons may be caused by mutation rate differences in synonymous and nonsynonymous sites.

KW - CpG dinucleotide
KW - Hypermutability
KW - Natural selection
KW - Nonsynonymous substitution
KW - Synonymous substitution
UR - http://www.scopus.com/inward/record.url?scp=70349916085&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349916085&partnerID=8YFLogxK
U2 - 10.1093/molbev/msp133
DO - 10.1093/molbev/msp133
M3 - Article
C2 - 19581348
AN - SCOPUS:70349916085
VL - 26
SP - 2275
EP - 2284
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
SN - 0737-4038
IS - 10
ER -