TY - JOUR
T1 - High-resolution DNA-binding specificity analysis of yeast transcription factors
AU - Zhu, Cong
AU - Byers, Kelsey J.R.P.
AU - McCord, Rachel Patton
AU - Shi, Zhenwei
AU - Berger, Michael F.
AU - Newburger, Daniel E.
AU - Saulrieta, Katrina
AU - Smith, Zachary
AU - Shah, Mita V.
AU - Radhakrishnan, Mathangi
AU - Philippakis, Anthony A.
AU - Hu, Yanhui
AU - De Masi, Federico
AU - Pacek, Marcin
AU - Rolfs, Andreas
AU - Murthy, Tal
AU - LaBaer, Joshua
AU - Bulyk, Martha L.
PY - 2009/4
Y1 - 2009/4
N2 - Transcription factors (TFs) regulate the expression of genes through sequence-specific interactions with DNA-binding sites. However, despite recent progress in identifying in vivo TF binding sites by microarray readout of chromatin immunoprecipitation (ChIP-chip), nearly half of all known yeast TFs are of unknown DNA-binding specificities, and many additional predicted TFs remain uncharacterized. To address these gaps in our knowledge of yeast TFs and their cis regulatory sequences, we have determined high-resolution binding profiles for 89 known and predicted yeast TFs, over more than 2.3 million gapped and ungapped 8-bp sequences ("k-mers"). We report 50 new or significantly different direct DNA-binding site motifs for yeast DNA-binding proteins and motifs for eight proteins for which only a consensus sequence was previously known; in total, this corresponds to over a 50% increase in the number of yeast DNA-binding proteins with experimentally determined DNA-binding specificities. Among other novel regulators, we discovered proteins that bind the PAC (Polymerase A and C) motif (GATGAG) and regulate ribosomal RNA (rRNA) transcription and processing, core cellular processes that are constituent to ribosome biogenesis. In contrast to earlier data types, these comprehensive k-mer binding data permit us to consider the regulatory potential of genomic sequence at the individual word level. These k-mer data allowed us to reannotate in vivo TF binding targets as direct or indirect and to examine TFs' potential effects on gene expression in ∼1700 environmental and cellular conditions. These approaches could be adapted to identify TFs and cis regulatory elements in higher eukaryotes.
AB - Transcription factors (TFs) regulate the expression of genes through sequence-specific interactions with DNA-binding sites. However, despite recent progress in identifying in vivo TF binding sites by microarray readout of chromatin immunoprecipitation (ChIP-chip), nearly half of all known yeast TFs are of unknown DNA-binding specificities, and many additional predicted TFs remain uncharacterized. To address these gaps in our knowledge of yeast TFs and their cis regulatory sequences, we have determined high-resolution binding profiles for 89 known and predicted yeast TFs, over more than 2.3 million gapped and ungapped 8-bp sequences ("k-mers"). We report 50 new or significantly different direct DNA-binding site motifs for yeast DNA-binding proteins and motifs for eight proteins for which only a consensus sequence was previously known; in total, this corresponds to over a 50% increase in the number of yeast DNA-binding proteins with experimentally determined DNA-binding specificities. Among other novel regulators, we discovered proteins that bind the PAC (Polymerase A and C) motif (GATGAG) and regulate ribosomal RNA (rRNA) transcription and processing, core cellular processes that are constituent to ribosome biogenesis. In contrast to earlier data types, these comprehensive k-mer binding data permit us to consider the regulatory potential of genomic sequence at the individual word level. These k-mer data allowed us to reannotate in vivo TF binding targets as direct or indirect and to examine TFs' potential effects on gene expression in ∼1700 environmental and cellular conditions. These approaches could be adapted to identify TFs and cis regulatory elements in higher eukaryotes.
UR - http://www.scopus.com/inward/record.url?scp=63849315606&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=63849315606&partnerID=8YFLogxK
U2 - 10.1101/gr.090233.108
DO - 10.1101/gr.090233.108
M3 - Article
C2 - 19158363
AN - SCOPUS:63849315606
SN - 1088-9051
VL - 19
SP - 556
EP - 566
JO - Genome Research
JF - Genome Research
IS - 4
ER -