Estimation of nucleotide diversity, disequilibrium coefficients, and mutation rates from high-coverage genome-sequencing projects

Research output: Contribution to journal › Article

75 Citations (Scopus)

Abstract

Recent advances in sequencing strategies have made it feasible to rapidly obtain high-coverage genomic profiles of single individuals, and soon it will be economically feasible to do so with hundreds to thousands of individuals per population. While offering unprecedented power for the acquisition of population-genetic parameters, these new methods also introduce a number of challenges, most notably the need to account for the binomial sampling of parental alleles at individual nucleotide sites and to eliminate bias from various sources of sequence errors. To minimize the effects of both problems, methods are developed for generating nearly unbiased and minimum-sampling-variance estimates of a number of key parameters, including the average nucleotide heterozygosity and its variance among sites, the pattern of decomposition of linkage disequilibrium with physical distance, and the rate and molecular spectrum of spontaneously arising mutations. These methods provide a general platform for the efficient utilization of data from population-genomic surveys, while also providing guidance for the optimal design of such studies.
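
The binomial sampling of parental alleles mentioned in the abstract can be illustrated with a minimal sketch. This is not the estimator developed in the paper. It assumes a diploid heterozygous site, reads drawn independently from either parental allele with probability 1/2, and no sequencing error. The function names are hypothetical.

```python
from math import comb


def read_count_pmf(coverage: int, k: int) -> float:
    """Binomial probability that k of `coverage` reads come from one
    particular parental allele (assumed sampling probability 1/2)."""
    if coverage < 1 or not 0 <= k <= coverage:
        raise ValueError("need coverage >= 1 and 0 <= k <= coverage")
    return comb(coverage, k) * 0.5 ** coverage


def prob_het_looks_homozygous(coverage: int) -> float:
    """Probability that every read at a heterozygous site comes from the
    same parental allele, so the site appears homozygous."""
    # Either all reads from the first allele or all from the second.
    return read_count_pmf(coverage, 0) + read_count_pmf(coverage, coverage)
```

Under these assumptions the probability of missing a heterozygote halves with each additional read, which is why low-coverage sites bias heterozygosity estimates downward unless the sampling is modeled explicitly.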

Original language: English (US)
Pages (from-to): 2409-2419
Number of pages: 11
Journal: Molecular Biology and Evolution
Volume: 25
Issue number: 11
DOIs: 10.1093/molbev/msn185
State: Published - Nov 1 2008
Externally published: Yes

Keywords

  • Genome scans
  • Heterozygosity
  • Linkage disequilibrium
  • Maximum likelihood estimation
  • Mutation rate
  • Mutation spectrum
  • Nucleotide diversity

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics

Cite this

@article{c193db4b50af499c98a19fdd671978cd,
title = "Estimation of nucleotide diversity, disequilibrium coefficients, and mutation rates from high-coverage genome-sequencing projects",
abstract = "Recent advances in sequencing strategies have made it feasible to rapidly obtain high-coverage genomic profiles of single individuals, and soon it will be economically feasible to do so with hundreds to thousands of individuals per population. While offering unprecedented power for the acquisition of population-genetic parameters, these new methods also introduce a number of challenges, most notably the need to account for the binomial sampling of parental alleles at individual nucleotide sites and to eliminate bias from various sources of sequence errors. To minimize the effects of both problems, methods are developed for generating nearly unbiased and minimum-sampling-variance estimates of a number of key parameters, including the average nucleotide heterozygosity and its variance among sites, the pattern of decomposition of linkage disequilibrium with physical distance, and the rate and molecular spectrum of spontaneously arising mutations. These methods provide a general platform for the efficient utilization of data from population-genomic surveys, while also providing guidance for the optimal design of such studies.",
keywords = "Genome scans, Heterozygosity, Linkage disequilibrium, Maximum likelihood estimation, Mutation rate, Mutation spectrum, Nucleotide diversity",
author = "Michael Lynch",
year = "2008",
month = "11",
day = "1",
doi = "10.1093/molbev/msn185",
language = "English (US)",
volume = "25",
pages = "2409--2419",
journal = "Molecular Biology and Evolution",
issn = "0737-4038",
publisher = "Oxford University Press",
number = "11",

}

TY - JOUR

T1 - Estimation of nucleotide diversity, disequilibrium coefficients, and mutation rates from high-coverage genome-sequencing projects

AU - Lynch, Michael

PY - 2008/11/1

Y1 - 2008/11/1

N2 - Recent advances in sequencing strategies have made it feasible to rapidly obtain high-coverage genomic profiles of single individuals, and soon it will be economically feasible to do so with hundreds to thousands of individuals per population. While offering unprecedented power for the acquisition of population-genetic parameters, these new methods also introduce a number of challenges, most notably the need to account for the binomial sampling of parental alleles at individual nucleotide sites and to eliminate bias from various sources of sequence errors. To minimize the effects of both problems, methods are developed for generating nearly unbiased and minimum-sampling-variance estimates of a number of key parameters, including the average nucleotide heterozygosity and its variance among sites, the pattern of decomposition of linkage disequilibrium with physical distance, and the rate and molecular spectrum of spontaneously arising mutations. These methods provide a general platform for the efficient utilization of data from population-genomic surveys, while also providing guidance for the optimal design of such studies.

KW - Genome scans

KW - Heterozygosity

KW - Linkage disequilibrium

KW - Maximum likelihood estimation

KW - Mutation rate

KW - Mutation spectrum

KW - Nucleotide diversity

UR - http://www.scopus.com/inward/record.url?scp=54149110986&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=54149110986&partnerID=8YFLogxK

U2 - 10.1093/molbev/msn185

DO - 10.1093/molbev/msn185

M3 - Article

VL - 25

SP - 2409

EP - 2419

JO - Molecular Biology and Evolution

JF - Molecular Biology and Evolution

SN - 0737-4038

IS - 11

ER -