### Abstract

The rapid expansion of sequence data and the development of statistical approaches that embrace varying evolutionary rates among lineages have encouraged many more investigators to use DNA and protein data to time species divergences. Here, we report results from a systematic evaluation, by means of computer simulation, of the performance of two frequently used relaxed-clock methods for estimating these times and their credibility intervals (CrIs). These relaxed-clock methods allow rates to vary in a phylogeny randomly over lineages (e.g., BEAST software) and in an autocorrelated fashion (e.g., MultiDivTime software). We applied these methods to sequence data sets simulated using naturally derived parameters (evolutionary rates, sequence lengths, and base substitution patterns), assuming that clock calibrations are known without error. We find that the estimated times are, on average, close to the true times as long as the assumed model of lineage rate changes matches the actual model. The 95% CrIs also contain the true time for ≥95% of the simulated data sets. However, the use of an incorrect lineage rate model reduces this frequency to 83%, indicating that the relaxed-clock methods are not robust to violations of the underlying lineage rate model. Because these rate models are rarely known a priori and are difficult to detect empirically, we suggest building composite CrIs using CrIs produced from MultiDivTime and BEAST analyses. These composite CrIs are found to contain the true time for ≥97% of data sets. Our analyses also verify the usefulness of the common practice of interpreting the congruence of times inferred from different methods as a reflection of the accuracy of time estimates. Overall, our results show that simple strategies can be used to enhance our ability to estimate times and their CrIs when using the relaxed-clock methods.
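The composite-CrI idea can be illustrated with a minimal sketch. The abstract does not state how the two intervals are combined, so the rule used here (smallest lower bound to largest upper bound) is an assumption, and the names `CrI`, `composite_cri`, and `coverage` are hypothetical. The numbers are made up for illustration and are not results from the study.

```python
from typing import NamedTuple, Sequence


class CrI(NamedTuple):
    """A credibility interval for a node age (units as used in the analysis)."""
    lower: float
    upper: float


def composite_cri(a: CrI, b: CrI) -> CrI:
    """Span both intervals: smallest lower bound to largest upper bound.

    This combination rule is an assumption, not taken from the abstract.
    """
    return CrI(min(a.lower, b.lower), max(a.upper, b.upper))


def coverage(intervals: Sequence[CrI], true_times: Sequence[float]) -> float:
    """Fraction of intervals that contain the corresponding true time."""
    if len(intervals) != len(true_times):
        raise ValueError("need one true time per interval")
    if not intervals:
        raise ValueError("need at least one interval")
    hits = sum(ci.lower <= t <= ci.upper for ci, t in zip(intervals, true_times))
    return hits / len(intervals)


# Made-up intervals for two nodes, one list per rate model (illustration only).
autocorrelated = [CrI(80.0, 95.0), CrI(40.0, 55.0)]
uncorrelated = [CrI(90.0, 110.0), CrI(42.0, 50.0)]
true_times = [100.0, 45.0]

composite = [composite_cri(a, b) for a, b in zip(autocorrelated, uncorrelated)]
print(coverage(autocorrelated, true_times), coverage(composite, true_times))
```

Under this rule a composite interval is never narrower than either input, so its coverage can only stay the same or rise, at the cost of a wider interval.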

Original language | English (US) |
---|---|
Pages (from-to) | 1289-1300 |
Number of pages | 12 |
Journal | Molecular Biology and Evolution |
Volume | 27 |
Issue number | 6 |
DOIs | https://doi.org/10.1093/molbev/msq014 |
State | Published - 2010 |

### Keywords

- Credibility intervals
- Divergence times
- Lineage rate models
- Molecular clocks
- Simulations

### ASJC Scopus subject areas

- Genetics
- Molecular Biology
- Ecology, Evolution, Behavior and Systematics

### Cite this

Battistuzzi, F. U., Filipski, A., Hedges, S. B., & Kumar, S. (2010). Performance of relaxed-clock methods in estimating evolutionary divergence times and their credibility intervals. *Molecular Biology and Evolution*, *27*(6), 1289-1300. https://doi.org/10.1093/molbev/msq014

Research output: Contribution to journal › Article

TY - JOUR

T1 - Performance of relaxed-clock methods in estimating evolutionary divergence times and their credibility intervals

AU - Battistuzzi, Fabia U.

AU - Filipski, Alan

AU - Hedges, S. Blair

AU - Kumar, Sudhir

PY - 2010

Y1 - 2010

KW - Credibility intervals

KW - Divergence times

KW - Lineage rate models

KW - Molecular clocks

KW - Simulations

UR - http://www.scopus.com/inward/record.url?scp=77951997517&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77951997517&partnerID=8YFLogxK

U2 - 10.1093/molbev/msq014

DO - 10.1093/molbev/msq014

M3 - Article

VL - 27

SP - 1289

EP - 1300

JO - Molecular Biology and Evolution

JF - Molecular Biology and Evolution

SN - 0737-4038

IS - 6

ER -