Genotype-frequency estimation from high-throughput sequencing data

Takahiro Maruki, Michael Lynch

Research output: Contribution to journal (Article)

12 Citations (Scopus)

Abstract

Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling and yields essentially unbiased estimates with minimal sampling variances with moderately high depths of coverage regardless of a mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy-Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.
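To make the idea in the abstract concrete, the following is a minimal sketch of a maximum-likelihood fit of genotype frequencies from per-individual read counts, followed by a likelihood-ratio test of Hardy-Weinberg equilibrium. It is not the authors' implementation (see the Package-GFE repository for that). Everything here is an illustrative assumption: the two-allele error model, the parameterization by allele frequency p and inbreeding coefficient f, the grid search, the function names, and the made-up read counts.

```python
import math

def genotype_freqs(p, f):
    """Genotype frequencies from major-allele frequency p and inbreeding coefficient f."""
    q = 1.0 - p
    return {
        "MM": p * p + f * p * q,
        "Mm": 2.0 * p * q * (1.0 - f),
        "mm": q * q + f * p * q,
    }

def read_prob(genotype, error):
    """P(a read shows allele M | genotype) under a simplified two-allele error model."""
    return {"MM": 1.0 - error, "Mm": 0.5, "mm": error}[genotype]

def log_likelihood(counts, p, f, error):
    """Log-likelihood of (major, minor) read counts, summing over unknown genotypes.

    The binomial coefficient is omitted because it does not depend on p or f.
    """
    freqs = genotype_freqs(p, f)
    total = 0.0
    for n_major, n_minor in counts:
        site = 0.0
        for genotype, freq in freqs.items():
            theta = read_prob(genotype, error)
            site += freq * theta ** n_major * (1.0 - theta) ** n_minor
        total += math.log(site)
    return total

def fit(counts, error, f_values):
    """Grid search over p in 0.01..0.99 and the supplied f values."""
    best = (-math.inf, None, None)
    for i in range(1, 100):
        p = i / 100.0
        for f in f_values:
            # Skip (p, f) pairs that imply a negative genotype frequency.
            if min(genotype_freqs(p, f).values()) < 0.0:
                continue
            ll = log_likelihood(counts, p, f, error)
            if ll > best[0]:
                best = (ll, p, f)
    return best

# Hypothetical data: (major-allele reads, minor-allele reads) for 20 individuals.
counts = [(10, 0)] * 12 + [(5, 5)] * 2 + [(0, 10)] * 6
error = 0.01  # assumed per-read error rate

f_grid = [j / 20.0 for j in range(-20, 21)]
ll_full, p_hat, f_hat = fit(counts, error, f_grid)   # genotype frequencies unconstrained
ll_hwe, p_hwe, _ = fit(counts, error, [0.0])         # constrained to Hardy-Weinberg (f = 0)

# Likelihood-ratio statistic; compare with a chi-square distribution with 1 d.f.
lrt = 2.0 * (ll_full - ll_hwe)
print(p_hat, f_hat, lrt)
```

A grid search keeps the sketch dependency-free; a real analysis would more likely use closed-form or numerical optimization and an error rate estimated from the data.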

Original language: English (US)
Pages (from-to): 473-486
Number of pages: 14
Journal: Genetics
Volume: 201
Issue number: 2
DOI: 10.1534/genetics.115.179077
State: Published - Oct 1 2015
Externally published: Yes

Fingerprint

Genotype
Metagenomics
Gene Frequency
Computer Simulation
Uncertainty
Software
Chromosomes
Technology
Population

Keywords

  • Genotype frequency
  • Hardy-Weinberg test
  • Inbreeding coefficient
  • Polymorphism detection
  • Population genomics

ASJC Scopus subject areas

  • Genetics

Cite this

Genotype-frequency estimation from high-throughput sequencing data. / Maruki, Takahiro; Lynch, Michael.

In: Genetics, Vol. 201, No. 2, 01.10.2015, p. 473-486.

Maruki, Takahiro ; Lynch, Michael. / Genotype-frequency estimation from high-throughput sequencing data. In: Genetics. 2015 ; Vol. 201, No. 2. pp. 473-486.
@article{7befff23869c4d2e8429c23a6ea1aeaa,
title = "Genotype-frequency estimation from high-throughput sequencing data",
abstract = "Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling and yields essentially unbiased estimates with minimal sampling variances with moderately high depths of coverage regardless of a mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy-Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.",
keywords = "Genotype frequency, Hardy-Weinberg test, Inbreeding coefficient, Polymorphism detection, Population genomics",
author = "Takahiro Maruki and Michael Lynch",
year = "2015",
month = "10",
day = "1",
doi = "10.1534/genetics.115.179077",
language = "English (US)",
volume = "201",
pages = "473--486",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "2",
}

TY - JOUR
T1 - Genotype-frequency estimation from high-throughput sequencing data
AU - Maruki, Takahiro
AU - Lynch, Michael
PY - 2015/10/1
Y1 - 2015/10/1
AB - Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling and yields essentially unbiased estimates with minimal sampling variances with moderately high depths of coverage regardless of a mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy-Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.

KW - Genotype frequency
KW - Hardy-Weinberg test
KW - Inbreeding coefficient
KW - Polymorphism detection
KW - Population genomics
UR - http://www.scopus.com/inward/record.url?scp=84943606990&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84943606990&partnerID=8YFLogxK
U2 - 10.1534/genetics.115.179077
DO - 10.1534/genetics.115.179077
M3 - Article
VL - 201
SP - 473
EP - 486
JO - Genetics
JF - Genetics
SN - 0016-6731
IS - 2
ER -