SpartaABC: A web server to simulate sequences with indel parameters inferred using an approximate Bayesian computation algorithm

Haim Ashkenazy, Eli Levy Karin, Zach Mertens, Reed Cartwright, Tal Pupko

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Many analyses for the detection of biological phenomena rely on a multiple sequence alignment as input. The results of such analyses are often further studied through parametric bootstrap procedures, using sequence simulators. One of the problems with conducting such simulation studies is that users currently have no means to decide which insertion and deletion (indel) parameters to choose, so that the resulting sequences mimic biological data. Here, we present SpartaABC, a web server that aims to solve this issue. SpartaABC implements an approximate-Bayesian-computation rejection algorithm to infer indel parameters from sequence data. It does so by extracting summary statistics from the input. It then performs numerous sequence simulations under randomly sampled indel parameters. By computing a distance between the summary statistics extracted from the input and each simulation, SpartaABC retains only parameters behind simulations close to the real data. As output, SpartaABC provides point estimates and approximate posterior distributions of the indel parameters. In addition, SpartaABC allows simulating sequences with the inferred indel parameters. To this end, the sequence simulators, Dawg 2.0 and INDELible were integrated. Using SpartaABC we demonstrate the differences in indel dynamics among three protein-coding genes across mammalian orthologs.
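The rejection scheme described in the abstract (sample parameters, simulate, compare summary statistics, keep the closest simulations) can be illustrated with a toy sketch. This is not SpartaABC's implementation: the one-parameter indel model, the uniform prior, the single summary statistic, and every function name below are assumptions chosen only to keep the example short and runnable.

```python
import random
import statistics


def simulate_indel_count(rate, n_sites, rng):
    """Hypothetical toy simulator: each site starts an indel with probability `rate`."""
    return sum(1 for _ in range(n_sites) if rng.random() < rate)


def summary_statistic(indel_count, n_sites):
    """Single summary statistic (assumed for illustration): indels per site."""
    return indel_count / n_sites


def abc_rejection(observed_stat, n_sites, n_sims, n_keep, rng, prior_max=0.2):
    """Return the `n_keep` sampled rates whose simulations lie closest to the data."""
    scored = []
    for _ in range(n_sims):
        rate = rng.uniform(0.0, prior_max)  # draw a parameter from the (assumed) uniform prior
        simulated = simulate_indel_count(rate, n_sites, rng)
        simulated_stat = summary_statistic(simulated, n_sites)
        distance = abs(simulated_stat - observed_stat)  # distance between summary statistics
        scored.append((distance, rate))
    scored.sort()  # smallest distance first
    return [rate for _, rate in scored[:n_keep]]


if __name__ == "__main__":
    rng = random.Random(0)
    n_sites = 1000
    # Stand-in for real input data: simulate once under a known rate of 0.05.
    observed = summary_statistic(simulate_indel_count(0.05, n_sites, rng), n_sites)
    accepted = abc_rejection(observed, n_sites, n_sims=2000, n_keep=100, rng=rng)
    print("point estimate:", statistics.mean(accepted))
    print("approximate posterior range:", min(accepted), max(accepted))
```

The accepted rates serve as a sample from the approximate posterior, and their mean is one possible point estimate. A realistic analysis would use several summary statistics of the alignment and a full sequence simulator in place of the toy model.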

Original language: English (US)
Pages (from-to): W453-W457
Journal: Nucleic Acids Research
Volume: 45
Issue number: W1
DOI: 10.1093/nar/gkx322
State: Published - Jul 3 2017

Fingerprint

Sequence Deletion
Insertional Mutagenesis
Biological Phenomena
Sequence Alignment
Proteins

ASJC Scopus subject areas

  • Genetics

Cite this

SpartaABC: A web server to simulate sequences with indel parameters inferred using an approximate Bayesian computation algorithm. / Ashkenazy, Haim; Levy Karin, Eli; Mertens, Zach; Cartwright, Reed; Pupko, Tal.

In: Nucleic Acids Research, Vol. 45, No. W1, 03.07.2017, p. W453-W457.

@article{d6901f04c01449129c9ecfdbda283057,
title = "SpartaABC: A web server to simulate sequences with indel parameters inferred using an approximate Bayesian computation algorithm",
abstract = "Many analyses for the detection of biological phenomena rely on a multiple sequence alignment as input. The results of such analyses are often further studied through parametric bootstrap procedures, using sequence simulators. One of the problems with conducting such simulation studies is that users currently have no means to decide which insertion and deletion (indel) parameters to choose, so that the resulting sequences mimic biological data. Here, we present SpartaABC, a web server that aims to solve this issue. SpartaABC implements an approximate-Bayesian-computation rejection algorithm to infer indel parameters from sequence data. It does so by extracting summary statistics from the input. It then performs numerous sequence simulations under randomly sampled indel parameters. By computing a distance between the summary statistics extracted from the input and each simulation, SpartaABC retains only parameters behind simulations close to the real data. As output, SpartaABC provides point estimates and approximate posterior distributions of the indel parameters. In addition, SpartaABC allows simulating sequences with the inferred indel parameters. To this end, the sequence simulators, Dawg 2.0 and INDELible were integrated. Using SpartaABC we demonstrate the differences in indel dynamics among three protein-coding genes across mammalian orthologs.",
author = "Haim Ashkenazy and {Levy Karin}, Eli and Zach Mertens and Reed Cartwright and Tal Pupko",
year = "2017",
month = "7",
day = "3",
doi = "10.1093/nar/gkx322",
language = "English (US)",
volume = "45",
pages = "W453--W457",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "W1",

}

TY - JOUR

T1 - SpartaABC

T2 - A web server to simulate sequences with indel parameters inferred using an approximate Bayesian computation algorithm

AU - Ashkenazy, Haim

AU - Levy Karin, Eli

AU - Mertens, Zach

AU - Cartwright, Reed

AU - Pupko, Tal

PY - 2017/7/3

Y1 - 2017/7/3

AB - Many analyses for the detection of biological phenomena rely on a multiple sequence alignment as input. The results of such analyses are often further studied through parametric bootstrap procedures, using sequence simulators. One of the problems with conducting such simulation studies is that users currently have no means to decide which insertion and deletion (indel) parameters to choose, so that the resulting sequences mimic biological data. Here, we present SpartaABC, a web server that aims to solve this issue. SpartaABC implements an approximate-Bayesian-computation rejection algorithm to infer indel parameters from sequence data. It does so by extracting summary statistics from the input. It then performs numerous sequence simulations under randomly sampled indel parameters. By computing a distance between the summary statistics extracted from the input and each simulation, SpartaABC retains only parameters behind simulations close to the real data. As output, SpartaABC provides point estimates and approximate posterior distributions of the indel parameters. In addition, SpartaABC allows simulating sequences with the inferred indel parameters. To this end, the sequence simulators, Dawg 2.0 and INDELible were integrated. Using SpartaABC we demonstrate the differences in indel dynamics among three protein-coding genes across mammalian orthologs.

UR - http://www.scopus.com/inward/record.url?scp=85023175336&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85023175336&partnerID=8YFLogxK

U2 - 10.1093/nar/gkx322

DO - 10.1093/nar/gkx322

M3 - Article

C2 - 28460062

AN - SCOPUS:85023175336

VL - 45

SP - W453

EP - W457

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - W1

ER -