Evolutionary distance estimation and fidelity of pair wise sequence alignment

Michael S. Rosenberg

Research output: Contribution to journal › Article › peer-review

35 Scopus citations

Abstract

Background: Evolutionary distances are a critical measure in comparative genomics and molecular evolutionary biology. A simulation study was used to examine the effect of alignment accuracy of DNA sequences on evolutionary distance estimation.

Results: Under the studied conditions, distance estimation was relatively unaffected by alignment error (50% or more of the sites incorrectly aligned) as long as 50% or more of the sites were identical among the sequences (observed P-distance < 0.5). Beyond this threshold, the alignment procedure artificially inflates the apparent sequence identity, skewing distance estimates, and creating alignments that are essentially indistinguishable from random data. This general result was independent of substitution model, sequence length, and insertion and deletion size and rate.

Conclusion: Examination of the estimated sequence identity may yield some guidance as to the accuracy of the alignment. Inaccurate alignments are expected to have large effects on analyses dependent on site specificity, but analyses that depend on evolutionary distance may be somewhat robust to alignment error as long as fewer than half of the sites have diverged.
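To make the quantities in the abstract concrete, the following is a minimal Python sketch of the observed P-distance (the proportion of aligned sites that differ) and one model-based correction of it. The Jukes-Cantor correction is used here only as a familiar example of a substitution model; the abstract does not say which models the study used. The function names and the choice to skip gapped columns are assumptions made for this illustration, not the author's procedure.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites among aligned columns without gaps.

    Assumption for this sketch: columns containing a gap in either
    sequence are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3).

    The correction is undefined once p reaches 0.75, so infinity is
    returned in that case.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor_distance(p))
```

In terms of the abstract, an alignment whose `p_distance` is at or above 0.5 falls past the threshold at which the reported distance estimates became unreliable.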

Original language: English (US)
Article number: 102
Journal: BMC Bioinformatics
Volume: 6
State: Published - Apr 19 2005

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
