Genome-wide estimation of linkage disequilibrium from population-level high-throughput sequencing data

Takahiro Maruki, Michael Lynch

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Rapidly improving sequencing technologies provide unprecedented opportunities for analyzing genome-wide patterns of polymorphisms. In particular, they have great potential for linkage-disequilibrium analyses on both global and local genetic scales, which will substantially improve our ability to derive evolutionary inferences. However, there are some difficulties with analyzing high-throughput sequencing data, including high error rates associated with base reads and complications from the random sampling of sequenced chromosomes in diploid organisms. To overcome these difficulties, we developed a maximum-likelihood estimator of linkage disequilibrium for use with error-prone sampling data. Computer simulations indicate that the estimator is nearly unbiased with a sampling variance at high coverage asymptotically approaching the value expected when all relevant information is accurately estimated. The estimator does not require phasing of haplotypes and enables the estimation of linkage disequilibrium even when all individual reads cover just single polymorphic sites.

Original language: English (US)
Pages (from-to): 1303-1313
Number of pages: 11
Journal: Genetics
Volume: 197
Issue number: 4
DOIs: 10.1534/genetics.114.165514
State: Published - Jan 1 2014
Externally published: Yes

Fingerprint

Linkage Disequilibrium
Genome
Population
Selection Bias
Diploidy
Computer Simulation
Haplotypes
Chromosomes
Technology

Keywords

  • Linkage disequilibrium
  • Population genomics

ASJC Scopus subject areas

  • Genetics

Cite this

Genome-wide estimation of linkage disequilibrium from population-level high-throughput sequencing data. / Maruki, Takahiro; Lynch, Michael. In: Genetics, Vol. 197, No. 4, 01.01.2014, p. 1303-1313.

@article{0830897ed01d4ba7a7d3d33e82047429,
title = "Genome-wide estimation of linkage disequilibrium from population-level high-throughput sequencing data",
abstract = "Rapidly improving sequencing technologies provide unprecedented opportunities for analyzing genome-wide patterns of polymorphisms. In particular, they have great potential for linkage-disequilibrium analyses on both global and local genetic scales, which will substantially improve our ability to derive evolutionary inferences. However, there are some difficulties with analyzing high-throughput sequencing data, including high error rates associated with base reads and complications from the random sampling of sequenced chromosomes in diploid organisms. To overcome these difficulties, we developed a maximum-likelihood estimator of linkage disequilibrium for use with error-prone sampling data. Computer simulations indicate that the estimator is nearly unbiased with a sampling variance at high coverage asymptotically approaching the value expected when all relevant information is accurately estimated. The estimator does not require phasing of haplotypes and enables the estimation of linkage disequilibrium even when all individual reads cover just single polymorphic sites.",
keywords = "Linkage disequilibrium, Population genomics",
author = "Takahiro Maruki and Michael Lynch",
year = "2014",
month = "1",
day = "1",
doi = "10.1534/genetics.114.165514",
language = "English (US)",
volume = "197",
pages = "1303--1313",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "4",
}

TY - JOUR

T1 - Genome-wide estimation of linkage disequilibrium from population-level high-throughput sequencing data

AU - Maruki, Takahiro

AU - Lynch, Michael

PY - 2014/1/1

Y1 - 2014/1/1

AB - Rapidly improving sequencing technologies provide unprecedented opportunities for analyzing genome-wide patterns of polymorphisms. In particular, they have great potential for linkage-disequilibrium analyses on both global and local genetic scales, which will substantially improve our ability to derive evolutionary inferences. However, there are some difficulties with analyzing high-throughput sequencing data, including high error rates associated with base reads and complications from the random sampling of sequenced chromosomes in diploid organisms. To overcome these difficulties, we developed a maximum-likelihood estimator of linkage disequilibrium for use with error-prone sampling data. Computer simulations indicate that the estimator is nearly unbiased with a sampling variance at high coverage asymptotically approaching the value expected when all relevant information is accurately estimated. The estimator does not require phasing of haplotypes and enables the estimation of linkage disequilibrium even when all individual reads cover just single polymorphic sites.

KW - Linkage disequilibrium

KW - Population genomics

UR - http://www.scopus.com/inward/record.url?scp=84905656768&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905656768&partnerID=8YFLogxK

U2 - 10.1534/genetics.114.165514

DO - 10.1534/genetics.114.165514

M3 - Article

VL - 197

SP - 1303

EP - 1313

JO - Genetics

JF - Genetics

SN - 0016-6731

IS - 4

ER -