TY - JOUR
T1 - Genome reannotation of the lizard Anolis carolinensis based on 14 adult and embryonic deep transcriptomes
AU - Eckalbar, Walter L.
AU - Hutchins, Elizabeth D.
AU - Markov, Glenn J.
AU - Allen, April N.
AU - Corneveaux, Jason J.
AU - Lindblad-Toh, Kerstin
AU - Di Palma, Federica
AU - Alföldi, Jessica
AU - Huentelman, Matthew J.
AU - Kusumi, Kenro
N1 - Funding Information:
This work was supported by grants to K.K. (R21 RR031305 from the National Center for Research Resources and the Office of Research Infrastructure Programs (ORIP) of the National Institutes of Health; 1113 from the Arizona Biomedical Research Commission). This research was also supported by computational allocation from the Extreme Science and Engineering Discovery Environment (XSEDE) and the Arizona State University Advanced Computing Center (A2C2). Sequencing of strand-specific libraries was funded by NHGRI. We would like to thank the Broad Institute Genomics Platform for sequencing the strand-specific libraries. The authors would like to thank Dale DeNardo, Rebecca Fisher, Joshua Gibson, Tonia Hsieh, Rob Kulathinal, Alan Rawls, and Jeanne Wilson-Rawls for reviewing this manuscript.
PY - 2013/1/23
Y1 - 2013/1/23
N2 - Background: The green anole lizard, Anolis carolinensis, is a key species for both laboratory and field-based studies of evolutionary genetics, development, neurobiology, physiology, behavior, and ecology. As the first non-avian reptilian genome sequenced, A. carolinensis is also a prime reptilian model for comparison with other vertebrate genomes. The public databases of Ensembl and NCBI have provided a first generation gene annotation of the anole genome that relies primarily on sequence conservation with related species. A second generation annotation based on tissue-specific transcriptomes would provide a valuable resource for molecular studies. Results: Here we provide an annotation of the A. carolinensis genome based on de novo assembly of deep transcriptomes of 14 adult and embryonic tissues. This revised annotation describes 59,373 transcripts, compared to 16,533 and 18,939 currently for Ensembl and NCBI, and 22,962 predicted protein-coding genes. A key improvement in this revised annotation is coverage of untranslated region (UTR) sequences, with 79% and 59% of transcripts containing 5' and 3' UTRs, respectively. Gaps in genome sequence from the current A. carolinensis build (Anocar2.0) are highlighted by our identification of 16,542 unmapped transcripts, representing 6,695 orthologues, with less than 70% genomic coverage. Conclusions: Incorporation of tissue-specific transcriptome sequence into the A. carolinensis genome annotation has markedly improved its utility for comparative and functional studies. Increased UTR coverage allows for more accurate predicted protein sequence and regulatory analysis. This revised annotation also provides an atlas of gene expression specific to adult and embryonic tissues.
AB - Background: The green anole lizard, Anolis carolinensis, is a key species for both laboratory and field-based studies of evolutionary genetics, development, neurobiology, physiology, behavior, and ecology. As the first non-avian reptilian genome sequenced, A. carolinensis is also a prime reptilian model for comparison with other vertebrate genomes. The public databases of Ensembl and NCBI have provided a first generation gene annotation of the anole genome that relies primarily on sequence conservation with related species. A second generation annotation based on tissue-specific transcriptomes would provide a valuable resource for molecular studies. Results: Here we provide an annotation of the A. carolinensis genome based on de novo assembly of deep transcriptomes of 14 adult and embryonic tissues. This revised annotation describes 59,373 transcripts, compared to 16,533 and 18,939 currently for Ensembl and NCBI, and 22,962 predicted protein-coding genes. A key improvement in this revised annotation is coverage of untranslated region (UTR) sequences, with 79% and 59% of transcripts containing 5' and 3' UTRs, respectively. Gaps in genome sequence from the current A. carolinensis build (Anocar2.0) are highlighted by our identification of 16,542 unmapped transcripts, representing 6,695 orthologues, with less than 70% genomic coverage. Conclusions: Incorporation of tissue-specific transcriptome sequence into the A. carolinensis genome annotation has markedly improved its utility for comparative and functional studies. Increased UTR coverage allows for more accurate predicted protein sequence and regulatory analysis. This revised annotation also provides an atlas of gene expression specific to adult and embryonic tissues.
KW - Annotation
KW - Anolis carolinensis
KW - Embryo
KW - Gene
KW - Genome
KW - Lizard
KW - RNA-Seq
KW - Tissue-specific
KW - Transcriptome
KW - Vertebrate
UR - http://www.scopus.com/inward/record.url?scp=84872516192&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84872516192&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-14-49
DO - 10.1186/1471-2164-14-49
M3 - Article
C2 - 23343042
AN - SCOPUS:84872516192
SN - 1471-2164
VL - 14
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 49
ER -