Cohesins and dockerins are two groups of structural modules critical to the assembly of cellulosomes, the protein complexes for biomass degradation in cellulolytic microorganisms. To understand their sequence diversity and evolution, we re-mined five well-studied cellulosome-producing genomes in the order Clostridiales for both domain modules and performed a comparative study. We found many putatively new cohesins and dockerins (designated NEW) in addition to those identified previously (PUB). We showed that a significant number of NEW domain-containing proteins bear signatures of being cellulosome-related: they contain CAZyme (carbohydrate-active enzyme) domains or other cellulosome-related domains, are differentially expressed during cellulose degradation, and have GC content and codon adaptation index similar to those of PUB domains. Evolutionary analyses suggest that NEW domains are more divergent in sequence than PUB domains, and that PUB domains evolve more rapidly than CAZyme domains. Phylogenetic and functional domain analyses revealed that duplications play a major role in the expansion of dockerins and cohesins. For example, orthologous proteins of closely related strains have different numbers of cohesins, suggesting that duplications occurred in a genome-specific manner after the strains diverged. The high abundance of cohesin- and dockerin-containing proteins in Ruminococcus flavefaciens FD-1 and Acetivibrio cellulolyticus CD2, relative to Clostridium thermocellum, may be a result of gene duplication and horizontal gene transfer under relaxed purifying selection in environments with high species diversity. Overall, this study suggests that dockerins and cohesins evolve with a high rate of domain birth and death in cellulolytic microorganisms.
ASJC Scopus subject areas
- Renewable Energy, Sustainability and the Environment
- Agronomy and Crop Science
- Energy (miscellaneous)