Abstract
Background: Eukaryotic genes can produce multiple transcript isoforms through alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. The 3′-UTR often serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but model 3′-UTR APA using only RNA-seq read coverage, which causes false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations.

Methods: APA-Scan uses either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculating the read coverage of the 3′-UTR regions of genes; (ii) identifying potential APA sites and evaluating the significance of the events between two biological conditions; (iii) plotting user-specified events with the 3′-UTR annotation and the read coverage on the 3′-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user’s manual are freely available at https://github.com/compbiolabucf/APA-Scan.

Results: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared with both baselines. The performance of APA-Scan was also validated by 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation.

Conclusion: APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3′-UTR APA events. The pipeline integrates information from both RNA-seq and 3′-end-seq data and can efficiently identify significant events and display them in high-resolution short-read coverage plots.
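Steps (i) and (ii) of the Methods summary can be illustrated with a minimal Python sketch. It is not taken from the APA-Scan source code. It assumes a per-base coverage vector for one 3′-UTR, a known candidate cleavage site, and a Pearson chi-squared test on a 2×2 table of short and long isoform abundances; the abstract does not name the statistical test, so that choice is an assumption. All function names (`mean_coverage`, `short_long_abundance`, `chi2_2x2`) and the toy coverage values are hypothetical.

```python
import math


def mean_coverage(coverage, start, end):
    """Mean per-base read depth over the half-open interval [start, end)."""
    if end <= start:
        return 0.0
    return sum(coverage[start:end]) / (end - start)


def short_long_abundance(coverage, site):
    """Estimate short and long 3'-UTR isoform abundance at a candidate site.

    Assumption: bases upstream of the site are covered by both isoforms,
    bases downstream only by the long isoform.
    """
    upstream = mean_coverage(coverage, 0, site)
    downstream = mean_coverage(coverage, site, len(coverage))
    long_abundance = downstream
    short_abundance = max(upstream - downstream, 0.0)
    return short_abundance, long_abundance


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 degree of freedom, no continuity
    correction) for the table [[a, b], [c, d]]. Returns (statistic, p)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Toy 3'-UTR of 200 bases with a candidate cleavage site at position 100.
site = 100
condition_a = [40] * 100 + [10] * 100  # mostly short isoform
condition_b = [40] * 100 + [30] * 100  # mostly long isoform

short_a, long_a = short_long_abundance(condition_a, site)
short_b, long_b = short_long_abundance(condition_b, site)
stat, p_value = chi2_2x2(short_a, long_a, short_b, long_b)
print(f"short/long A: {short_a}/{long_a}, B: {short_b}/{long_b}")
print(f"chi2 = {stat:.2f}, p = {p_value:.2e}")
```

Treating mean depths as counts in the contingency table is a simplification made to keep the example short; a real analysis would work from read counts and correct for multiple testing across all candidate sites.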
Original language | English (US) |
---|---|
Article number | 396 |
Journal | BMC Bioinformatics |
Volume | 23 |
DOIs | |
State | Published - Mar 2022 |
Keywords
- 3′-End-seq
- Alternative polyadenylation
- RNA-seq
- Transcriptome
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics
Datasets
- APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data
  Fahmi, N. A. (Creator), Ahmed, K. T. (Creator), Chang, J. (Creator), Nassereddeen, H. (Creator), Fan, D. (Creator), Yong, J. (Creator) & Zhang, W. (Creator), Figshare, 2022
  DOI: 10.6084/m9.figshare.c.6223183.v1, https://springernature.figshare.com/collections/APA-Scan_detection_and_visualization_of_3_-UTR_alternative_polyadenylation_with_RNA-seq_and_3_-end-seq_data/6223183/1
  Dataset
- APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data
  Fahmi, N. A. (Creator), Ahmed, K. T. (Creator), Chang, J. (Creator), Nassereddeen, H. (Creator), Fan, D. (Creator), Yong, J. (Creator) & Zhang, W. (Creator), Figshare, 2022
  DOI: 10.6084/m9.figshare.c.6223183, https://springernature.figshare.com/collections/APA-Scan_detection_and_visualization_of_3_-UTR_alternative_polyadenylation_with_RNA-seq_and_3_-end-seq_data/6223183
  Dataset