Effect of LSU and ITS genetic markers and reference databases on analyses of fungal communities

Chao Xue, Yuewen Hao, Xiaowei Pu, Christopher Penton, Qiong Wang, Mengxin Zhao, Bangzhou Zhang, Wei Ran, Qiwei Huang, Qirong Shen, James M. Tiedje

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

The effect of genetic markers and reference databases on analyses of fungal communities was estimated using fungal large subunit (LSU) and internal transcribed spacer (ITS) amplicon datasets from rhizosphere samples collected in consecutive years from three candidate biofuel crops: corn (Zea mays), switchgrass (Panicum virgatum), and miscanthus (Miscanthus × giganteus). These two marker genes were selected to contrast possible differences in biological conclusions. In addition, two ITS schemes based on two ITS reference databases were used to assess differences due to reference database composition. A taxonomy-supervised method was invoked using the Ribosomal Database Project naïve Bayesian classifier that accesses all three databases. The UNITE classification scheme had the highest number of classified taxa in the raw classification result; however, it also had the highest proportion of unknown taxa (sequences that were classified to “unclassified,” “unidentified,” incertae sedis or, in the case of Warcup, to matches containing two unique names). After removal of these unknown taxa, LSU had the highest classification rate, followed by Warcup and UNITE. As expected, the communities resolved using the two ITS databases, based on the same sequences, were relatively more similar to each other than to those from the lower-coverage LSU classification scheme. The choice of marker gene, or even the same reads classified against different databases, revealed different community patterns due to database coverage; e.g., the relative abundances of the most abundant groups, such as Mortierella, Fusarium, and Phoma, changed, or these groups were detected in only one or two of the classification schemes. No marked difference in fungal beta diversity was identified among the three methods. Differentiation between the three biofuel crops and between the drought and normal rainfall years was apparent, regardless of method.
Though classification rates, taxonomic conflicts, and coverage differences within the high-abundance fungal groups varied according to classification scheme, there was no overall impact on beta diversity among the three methods.

Original language: English (US)
Journal: Biology and Fertility of Soils
DOI: 10.1007/s00374-018-1331-4
State: Accepted/In press - Jan 1 2018

Fingerprint

fungal communities
internal transcribed spacers
Databases
genetic markers
Panicum
Panicum virgatum
energy crops
Biofuels
biofuel
Zea mays
Mortierella
Miscanthus giganteus
Miscanthus
crop
Phoma
effect
gene
Rhizosphere

Keywords

  • Fungal database
  • ITS
  • LSU
  • UNITE
  • Warcup

ASJC Scopus subject areas

  • Microbiology
  • Agronomy and Crop Science
  • Soil Science

Cite this

Effect of LSU and ITS genetic markers and reference databases on analyses of fungal communities. / Xue, Chao; Hao, Yuewen; Pu, Xiaowei; Penton, Christopher; Wang, Qiong; Zhao, Mengxin; Zhang, Bangzhou; Ran, Wei; Huang, Qiwei; Shen, Qirong; Tiedje, James M.

In: Biology and Fertility of Soils, 01.01.2018.

@article{99bc29e2ce264746a588d4cfbe1da247,
title = "Effect of LSU and ITS genetic markers and reference databases on analyses of fungal communities",
abstract = "The effect of genetic markers and reference databases on analyses of fungal communities was estimated using fungal large subunit (LSU) and internal transcribed spacer (ITS) amplicon datasets from rhizosphere samples collected in consecutive years from three candidate biofuel crops: corn (Zea mays), switchgrass (Panicum virgatum), and miscanthus (Miscanthus × giganteus). These two marker genes were selected to contrast possible differences in biological conclusions. In addition, two ITS schemes based on two ITS reference databases were used to assess differences due to reference database composition. A taxonomy-supervised method was invoked using the Ribosomal Database Project na{\"i}ve Bayesian classifier that accesses all three databases. The UNITE classification scheme had the highest number of classified taxa in the raw classification result; however, it also had the highest proportion of unknown taxa (sequences that were classified to “unclassified,” “unidentified,” incertae sedis or, in the case of Warcup, to matches containing two unique names). After removal of these unknown taxa, LSU had the highest classification rate, followed by Warcup and UNITE. As expected, the communities resolved using the two ITS databases, based on the same sequences, were relatively more similar to each other than to those from the lower-coverage LSU classification scheme. The choice of marker gene, or even the same reads classified against different databases, revealed different community patterns due to database coverage; e.g., the relative abundances of the most abundant groups, such as Mortierella, Fusarium, and Phoma, changed, or these groups were detected in only one or two of the classification schemes. No marked difference in fungal beta diversity was identified among the three methods. Differentiation between the three biofuel crops and between the drought and normal rainfall years was apparent, regardless of method.
Though classification rates, taxonomic conflicts, and coverage differences within the high-abundance fungal groups varied according to classification scheme, there was no overall impact on beta diversity among the three methods.",
keywords = "Fungal database, ITS, LSU, UNITE, Warcup",
author = "Chao Xue and Yuewen Hao and Xiaowei Pu and Christopher Penton and Qiong Wang and Mengxin Zhao and Bangzhou Zhang and Wei Ran and Qiwei Huang and Qirong Shen and Tiedje, {James M.}",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/s00374-018-1331-4",
language = "English (US)",
journal = "Biology and Fertility of Soils",
issn = "0178-2762",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - Effect of LSU and ITS genetic markers and reference databases on analyses of fungal communities

AU - Xue, Chao

AU - Hao, Yuewen

AU - Pu, Xiaowei

AU - Penton, Christopher

AU - Wang, Qiong

AU - Zhao, Mengxin

AU - Zhang, Bangzhou

AU - Ran, Wei

AU - Huang, Qiwei

AU - Shen, Qirong

AU - Tiedje, James M.

PY - 2018/1/1

Y1 - 2018/1/1

N2 - The effect of genetic markers and reference databases on analyses of fungal communities was estimated using fungal large subunit (LSU) and internal transcribed spacer (ITS) amplicon datasets from rhizosphere samples collected in consecutive years from three candidate biofuel crops: corn (Zea mays), switchgrass (Panicum virgatum), and miscanthus (Miscanthus × giganteus). These two marker genes were selected to contrast possible differences in biological conclusions. In addition, two ITS schemes based on two ITS reference databases were used to assess differences due to reference database composition. A taxonomy-supervised method was invoked using the Ribosomal Database Project naïve Bayesian classifier that accesses all three databases. The UNITE classification scheme had the highest number of classified taxa in the raw classification result; however, it also had the highest proportion of unknown taxa (sequences that were classified to “unclassified,” “unidentified,” incertae sedis or, in the case of Warcup, to matches containing two unique names). After removal of these unknown taxa, LSU had the highest classification rate, followed by Warcup and UNITE. As expected, the communities resolved using the two ITS databases, based on the same sequences, were relatively more similar to each other than to those from the lower-coverage LSU classification scheme. The choice of marker gene, or even the same reads classified against different databases, revealed different community patterns due to database coverage; e.g., the relative abundances of the most abundant groups, such as Mortierella, Fusarium, and Phoma, changed, or these groups were detected in only one or two of the classification schemes. No marked difference in fungal beta diversity was identified among the three methods. Differentiation between the three biofuel crops and between the drought and normal rainfall years was apparent, regardless of method.
Though classification rates, taxonomic conflicts, and coverage differences within the high-abundance fungal groups varied according to classification scheme, there was no overall impact on beta diversity among the three methods.

KW - Fungal database

KW - ITS

KW - LSU

KW - UNITE

KW - Warcup

UR - http://www.scopus.com/inward/record.url?scp=85057007693&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057007693&partnerID=8YFLogxK

U2 - 10.1007/s00374-018-1331-4

DO - 10.1007/s00374-018-1331-4

M3 - Article

AN - SCOPUS:85057007693

JO - Biology and Fertility of Soils

JF - Biology and Fertility of Soils

SN - 0178-2762

ER -