TY - JOUR
T1 - Relationship between gene co-expression and sharing of transcription factor binding sites in Drosophila melanogaster
AU - Marco, Antonio
AU - Konikoff, Charlotte
AU - Karr, Timothy L.
AU - Kumar, Sudhir
N1 - Funding: National Institutes of Health grant (to S.K.).
PY - 2009
Y1 - 2009
AB - Motivation: In functional genomics, it is frequently useful to correlate expression levels of genes to identify transcription factor binding sites (TFBS) via the presence of common sequence motifs. The underlying assumption is that co-expressed genes are more likely to contain shared TFBS and, thus, TFBS can be identified computationally. Indeed, gene pairs with a very high expression correlation show a significant excess of shared binding sites in yeast. We have tested this assumption in a more complex organism, Drosophila melanogaster, by using experimentally determined TFBS and microarray expression data. We have also examined the reverse relationship between the expression correlation and the extent of TFBS sharing. Results: Pairs of genes with shared TFBS show, on average, a higher degree of co-expression than those with no common TFBS in Drosophila. However, the reverse does not hold true: gene pairs with high expression correlations do not share significantly larger numbers of TFBS. An exception to this observation exists when comparing expression of genes from the earliest stages of embryonic development. Interestingly, semantic similarity between gene annotations (Biological Process) is much more strongly associated with TFBS sharing than is the expression correlation. We discuss these results in light of reverse engineering approaches to computationally predict regulatory sequences by using comparative genomics.
UR - http://www.scopus.com/inward/record.url?scp=70349887776&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349887776&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btp462
DO - 10.1093/bioinformatics/btp462
M3 - Article
C2 - 19633094
AN - SCOPUS:70349887776
SN - 1367-4803
VL - 25
SP - 2473
EP - 2477
JO - Bioinformatics
JF - Bioinformatics
IS - 19
ER -