TY - JOUR
T1 - The impact of frequently neglected model violations on bacterial recombination rate estimation
T2 - a case study in Mycobacterium canettii and Mycobacterium tuberculosis
AU - Sabin, Susanna
AU - Morales-Arce, Ana Y.
AU - Pfeifer, Susanne P.
AU - Jensen, Jeffrey D.
N1 - Funding Information:
Funding for this project was provided by the Center for Evolution and Medicine at Arizona State University, and National Institutes of Health grants R01GM135899 and R35GM139383 to JDJ. SPP is supported by a US National Science Foundation CAREER grant (DEB-2045343).
Funding Information:
Computation was performed using the Agave research computing infrastructure at Arizona State University, and the Open Science Grid which is supported by the National Science Foundation and the U.S. Department of Energy’s Office of Science.
Publisher Copyright:
© The Author(s) 2022.
PY - 2022/5
Y1 - 2022/5
N2 - Mycobacterium canettii is a causative agent of tuberculosis in humans, along with the members of the Mycobacterium tuberculosis complex. Frequently used as an outgroup to the M. tuberculosis complex in phylogenetic analyses, M. canettii is thought to offer the best proxy for the progenitor species that gave rise to the complex. Here, we leverage whole-genome sequencing data and biologically relevant population genomic models to compare the evolutionary dynamics driving variation in the recombining M. canettii with that in the nonrecombining M. tuberculosis complex, and discuss differences in observed genomic diversity in the light of expected levels of Hill–Robertson interference. In doing so, we highlight the methodological challenges of estimating recombination rates through traditional population genetic approaches using sequences called from populations of microorganisms and evaluate the likely mis-inference that arises owing to a neglect of common model violations including purifying selection, background selection, progeny skew, and population size change. In addition, we compare performance when full within-host polymorphism data are utilized, versus the more common approach of basing analyses on within-host consensus sequences.
KW - Hill–Robertson interference
KW - LDhat
KW - Mycobacterium canettii
KW - Mycobacterium tuberculosis
KW - genetic hitchhiking
KW - population genomics
KW - progeny skew
KW - recombination rate estimation
UR - http://www.scopus.com/inward/record.url?scp=85129998435&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129998435&partnerID=8YFLogxK
U2 - 10.1093/g3journal/jkac055
DO - 10.1093/g3journal/jkac055
M3 - Article
C2 - 35253851
AN - SCOPUS:85129998435
SN - 2160-1836
VL - 12
JO - G3 (Bethesda, Md.)
JF - G3 (Bethesda, Md.)
IS - 5
M1 - jkac055
ER -