Hierarchical mechanisms build the DNA-binding specificity of FUSE binding protein
Edited by Sankar Adhya, National Institutes of Health, Bethesda, MD, and approved September 30, 2008 (received for review April 3, 2008)

Abstract
The far upstream element (FUSE) binding protein (FBP), a single-stranded nucleic acid binding protein, is recruited to the c-myc promoter after melting of FUSE by transcriptionally generated dynamic supercoils. Via interactions with TFIIH and FBP-interacting repressor (FIR), FBP modulates c-myc transcription. Here, we investigate the contributions of FBP's 4 K Homology (KH) domains to sequence selectivity. EMSA and missing contact point analysis revealed that FBP contacts 4 separate patches spanning a large segment of FUSE. A SELEX procedure using paired KH-domains defined the preferred subsequences for each KH domain. Unexpectedly, there was also a strong selection for the noncontacted residues between these subsequences, showing that the contact points must be optimally presented in a backbone that minimizes secondary structure. Strategic mutation of contact points defined in this study disabled FUSE activity in vivo. Because the biological specificity of FBP is tuned at several layers: (i) accessibility of the site; (ii) supercoil-driven melting; (iii) presentation of unhindered bases for recognition; and (iv) modular interaction of KH-domains with cognate bases, the FBP-FIR system and sequence-specific, single-strand DNA binding proteins in general are likely to prove versatile tools for adjusting gene expression.
Single-stranded DNA results from several processes. Telomeres require single-stranded overhangs for telomerase and other telomere maintenance factors (1). ssDNA is also transiently produced by replication, DNA repair, recombination, and transcription. During transcription, because template is threaded through RNA polymerase, dynamic negative supercoils that melt DNA at susceptible upstream sites are generated (2, 3). Sequence-specific and nonspecific protein binding to ssDNA is important for the physiological regulation of genetic processes. Because single-strand binding proteins are often composed of loosely articulated nucleic acid binding modules, and because of the flexibility of ssDNA, target site selection by these proteins must be a dynamic process.
Whereas some ssDNA-binding proteins, e.g., SSB in prokaryotes or Replication Protein A in eukaryotes, have little or no sequence specificity (4), other ssDNA-binding proteins are sequence-specific. Sequence-specificity is often conferred by hnRNP K-Homology (KH) domains, RNA Recognition Motifs (RRMs), or oligosaccharide/oligonucleotide binding (OB) folds (5–7). FUSE Binding Protein (FBP) is a transcription factor that binds ssDNA through 4 KH domains (8).
FBP binds to the Far Upstream Element (FUSE), an A/T-rich element located 1.7 kb upstream of the c-myc P2 promoter (9–11), and modulates its transcription. A molecular machine composed of FUSE, FBP, the FBP Interacting Repressor (FIR), and TFIIH integrates multiple inputs (9) to set the output of the proto-oncogene. Upon c-myc activation, SWI/SNF displaces a FUSE-masking nucleosome (9); with the onset of transcription, incipient torsional strain melts FUSE, allowing FBP to bind the “bottom” strand of FUSE, presumably through all 4 of the protein's KH domains (9, 12), and then activate transcription through TFIIH. Next, FIR, recruited by FBP-FUSE, represses transcription, delimiting the pulse of c-myc expression (9). Declining supercoiling rapidly releases FBP from binding with FUSE, but FIR dissociates much more slowly (9, 13).
In vivo and in vitro binding studies have identified the span along the bottom strand of FUSE that binds FBP and have indicated the orientation of the bound protein (11, 14). Although Braddock et al. used NMR to solve the solution structure of an FBP fragment, KH-domains 3 + 4, bound to a short segment of FUSE (15), the sequence contacts and structural context for DNA recognition by full length FBP across the full FUSE region remain unexplored, and less is known about the interactions of FBP and FIR with other genomic targets.
It is not known how the flexibility of ssDNA and the peptides articulating the KH domains influence DNA binding and site selection by FBP, and it is not known whether multiple KH domains engage DNA stably and concurrently or whether there is dynamic positive or negative interplay between them (5, 15). It is also not known whether FIR binding to the FBP-DNA complex changes FBP's interactions with ssDNA as has been hypothesized (16).
These experiments show how full-length FBP contacts FUSE in 4 blocks spanning a region of 30–32 bases, and that FIR modifies this pattern. A SELEX assay adapted for single-stranded DNA yielded “T(T/C)GT” as the optimal binding sequence for KH domains 2 through 4, with KH1 preferring “(T/G)TG(T/C).” Based on this information, 2 “T” to “A” mutations were engineered that disabled FUSE-mediated transcription activation in vivo. Combined with in silico analysis of supercoil-driven melting, these data allow analysis of genomic FBP binding and function, and provide a framework for the study of other ssDNA-binding proteins.
Results
FBP and Its KH-Domains Do Not Read Sequence Stringently.
Previous studies with FUSE, i.e., the 3D NMR structure of FBP(KH3+KH4) bound to a FUSE subsequence (15), assumed that DNA binding by FBP fragments mirrors the specificity of the full-length protein. To compare DNA binding by full-length FBP versus FBP(KH3+KH4), a series of 27-mer oligonucleotide probes (Fig. 1A) scanning through FUSE were tested for protein-binding by electrophoretic mobility shift assay (EMSA) (Fig. 1B). FBP(KH3+KH4) bound a subset of the probes bound by the full-length protein (probe 2 binds the latter but not the former) (Fig. 1C). Surprisingly, although probes 1 and 2 each included sequences recognized by KH3 and KH4 (5) in the reported structure, they shifted less well than probes 3–5, which included only the KH3-binding segment and lacked the KH4-binding sequences (Fig. 1B). Moreover, probe 6, lacking the contacts seen in the structure, still bound efficiently with both proteins. With the same panel of subsequences, FBP(KH2+KH3) showed binding to a much more restricted subset of probes than full-length FBP; this subset was distinct from, but overlapped with, the probes bound by FBP(KH3+KH4) (Fig. 1B). Straightforward rules explaining FBP target recognition could not be inferred from these data.
Fig. 1. EMSA scanning of FBP binding with FUSE oligonucleotides. (A) The eight 27-mer single-stranded probes used in this gel-mobility shift assay are aligned with the FUSE region upstream of the c-myc P2 promoter. The KH3 element (5′-ATTTTT-3′) and the KH4 element (5′-TATTCC-3′) reported from the NMR study (5) are shown as boldface italic. (B) A total of 0.5 pmol of each FUSE probe was incubated with 5 pmol of each of affinity-purified GST-FBP(KH2+KH3) or GST-FBP(KH3+KH4), or with 1.25 pmol of full-length FBP. Binding reactions lacking protein are shown at Left. (C) A diagram showing how FBP's KH domains are situated in the full-length protein.
These results raised several issues concerning target site selection by FBP. Sequence specificity might be slight, so that the principal determinant of FBP selectivity in vivo is the accessibility, size, and persistence of single-stranded zones. Alternatively, each KH domain might have a hierarchy of preferred sequences, so that minimizing the sum of the binding free energies of the individual KH domains sets the net affinity for each probe. In the latter case, each KH domain might accept a suboptimal sequence to allow another KH domain to settle upon an energetically compensatory site. In this situation it is important to illuminate the spectrum of acceptable KH-binding sequences and to define the rules governing their spacing and arrangement. In another scenario, binding sites are defined by protein-backbone contacts with alternative DNA structures; FBP-binding would be indirectly sequence-specific insofar as the sequence defines the alternative structure. Thus, we sought to tease apart the roles of sequence, arrangement, and DNA conformation in FBP-DNA interactions.
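The second scenario, in which the KH domains jointly minimize binding free energy, can be written compactly. The expression below is only a sketch in our own notation, not a model taken from the study: r denotes a binding register, s_i(r) the subsequence presented to KH domain i in that register, and the sum assumes the four domains contribute independently (no linker strain, spacer, or cooperativity terms).

```latex
\Delta G_{\mathrm{net}}(r) \approx \sum_{i=1}^{4} \Delta G_i\bigl(s_i(r)\bigr),
\qquad
r^{*} = \arg\min_{r}\, \Delta G_{\mathrm{net}}(r),
\qquad
K_d \approx \exp\!\left(\frac{\Delta G_{\mathrm{net}}(r^{*})}{RT}\right)
```

Under this assumed additivity, a suboptimal contact by one domain (a less negative term) can be offset by a more favorable contact elsewhere, which is the compensation described above.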
Selection of FBP-Binding Consensus Sequences.
SELEX (Systematic Evolution of Ligands by Exponential Enrichment) is an unbiased method to elucidate the sequence preferences of dsDNA- and RNA-binding proteins (17, 18). As diagrammed in supporting information (SI) Fig. S1, we adapted SELEX for ssDNA-binding proteins such as FBP. To provide a platform to amplify selected sequences while minimizing interference from PCR primer binding sites, 22 random nucleotides were sandwiched between imperfect hairpins designed to fold under selection conditions but to bind primer during PCR (19). Hairpin formation occludes KH-domain binding. Comparison of the electrophoretic mobility of similar length oligonucleotides with and without these imperfect hairpins confirmed hairpin folding during EMSA (Fig. S2).
Prior experiments in vivo and in vitro have suggested that FBP contacts a 40-nucleotide segment of FUSE. Because it is impossible to synthesize a pool of all possible 40-mers, we elected not to use full-length FBP for selection. SELEX using contiguous pairs of KH domains (1 + 2, 2 + 3, and 3 + 4) yielded the results shown in Fig. 2. Each KH-pair enriched a highly nonrandom set of sequences. The tetrads “TTGT” and “TCGT” were highly over-represented and are shown in red and blue, respectively. TTGTs were especially highly represented at the 5′-end of sequences selected by (KH2+KH3). Selection by (KH1+KH2), the weakest binding pair of domains (20), enriched “GTGT” and “GTGC” (both green). Adjacent to the randomized segment of the selection template, rare mutations in the primer/primer binding sequences likely represented oligonucleotide synthesis or PCR errors.
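The claim that a complete 40-mer pool cannot be synthesized follows from simple counting. The sketch below is our own illustration (function names are hypothetical): it compares the number of distinct sequences in a 22-nt and a 40-nt randomized region and converts each to the moles of DNA needed to hold a single copy of every sequence.

```python
# Back-of-the-envelope size of a fully randomized ssDNA pool.
# Illustrative only; helper names are not from the study.
AVOGADRO = 6.02214076e23  # molecules per mole


def pool_size(n_random: int) -> int:
    """Number of distinct sequences for n_random fully degenerate positions."""
    return 4 ** n_random


def moles_for_single_coverage(n_random: int) -> float:
    """Moles of DNA needed to contain one molecule of every possible sequence."""
    return pool_size(n_random) / AVOGADRO


if __name__ == "__main__":
    for n in (22, 40):
        print(f"{n} nt: {pool_size(n):.3e} sequences, "
              f"{moles_for_single_coverage(n):.3e} mol for single coverage")
```

A 22-nt random region corresponds to roughly 29 pmol for single coverage, whereas a 40-nt region would require about 2 mol of oligonucleotide, far beyond any practical synthesis scale.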
Fig. 2. SELEX-based enrichment of binding sites in ssDNA. (Upper) Diagram of method. (Lower) Sequences selected by paired KH domains. (KH3+KH4) Sequences of the input degenerate oligonucleotide and “winners” after 7 rounds of binding to GST-FBP(KH3+KH4). (KH2+KH3) Winners after 6 rounds of binding to GST-FBP(KH2+KH3). (KH1+KH2) Winners after 4 rounds of binding to GST-FBP(KH1+KH2). Mutations in the nonrandomized flanking sequence are shown in lowercase; deletions are indicated with dashes. Some output sequences were longer than the 22 bases randomized. Selected tetrads TTGT are red, TCGT are blue, GTGT and GTGC are green ([KH1+KH2] only); overlapping tetrads are purple or yellow.
Analysis of the (KH3+KH4) and (KH2+KH3) selected sequences with the program “Consensus” (21) revealed a weak preference for As at a fifth position expanding the tetrad to a quintet. The SELEX data alone did not assign unique optimal binding sites to individual KH domains. However, the enrichment of TGTG or GTGT only with (KH1+KH2), but not with (KH2+KH3) or (KH3+KH4), suggests that KH1 binds (g)TGTG(t). The sequences TTGT and TCGT likely bind well with KH-domains 2, 3, and 4.
All of the selected sequences were highly enriched for T's (Table 1). This prevalence is not explained by the composition of the consensus tetrads; excluding the tetrads, the residual sequences are still T-rich. Nor is the excess due to FBP binding stably to strings of T's; FBP does not bind to oligo(dT) (data not shown). Embedding the KH-cognate elements in T-rich DNA may create a molecular environment conducive to recognition (22).
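The two tallies described above, tetrad frequency and the base composition that remains once the tetrads are excluded, can be sketched as follows. This is a minimal illustration under our own assumptions: the function names are hypothetical and the two input sequences are made-up toy data, not SELEX output from the study.

```python
from collections import Counter


def count_tetrads(seqs, tetrads):
    """Count occurrences (overlaps allowed) of each listed tetrad across all sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 3):
            window = s[i:i + 4]
            if window in tetrads:
                counts[window] += 1
    return counts


def residual_composition(seqs, tetrads):
    """Base frequencies after masking every position covered by a listed tetrad."""
    residual = Counter()
    for s in seqs:
        s = s.upper()
        covered = [False] * len(s)
        for i in range(len(s) - 3):
            if s[i:i + 4] in tetrads:
                for j in range(i, i + 4):
                    covered[j] = True
        residual.update(b for b, c in zip(s, covered) if not c)
    total = sum(residual.values())
    return {b: n / total for b, n in residual.items()} if total else {}


if __name__ == "__main__":
    toy = ["TTTGTATTTCGTT", "ATTGTTTTT"]  # made-up sequences for illustration
    tetrads = {"TTGT", "TCGT"}
    print(count_tetrads(toy, tetrads))
    print(residual_composition(toy, tetrads))
```

Masking the tetrads before computing composition is what separates "T-rich because the consensus contains T" from "T-rich in the surrounding sequence as well".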
Table 1. Nucleotide and tetranucleotide frequency in SELEX output
Spacing Between KH Domain-Binding Sequences Tunes FBP Binding.
SELEX did not reveal the preferred spacing between KH domain-binding tetrads. Based on the flexibility seen in the solution structure of FBP(KH3+KH4) and the lower rigidity of ssDNA than dsDNA, there was no a priori prediction for the spacer lengths separating the selected sequences. To determine the effect of spacer-length, a series of oligonucleotides (Fig. 3A) with variable separation between consensus elements was tested by competition against a 29-base FUSE probe for binding FBP fragments (SI Methods). Because A was by far the least enriched base in the SELEX assays (Table 1), KH-binding sequences were embedded in oligo-dA to maintain the overall length while minimizing confounding interactions with the spacer or flanking sequences.
Fig. 3. Determination of the optimal spacing between KH-domains by EMSA. (A) Sequence of spacer oligonucleotides. (B and C) A standard oligonucleotide was end-labeled and gel-shifted by an excess of (B) GST-FBP(KH3+KH4) (compare lanes 1 and 2) or (C) GST-FBP(KH2+KH3). Unlabeled competitor oligonucleotides (17 μM) in excess relative to the protein (≈40 nM) were added to most binding reactions (lanes 3–16) before protein. “Spacer4” (lane 8) competed most effectively among the spacer oligonucleotides (lanes 6–14). All spacer oligonucleotides competed for protein binding more effectively than control oligonucleotides containing a single (lanes 4, 5, 15, and 16) or no (lane 3) consensus element. Shifted bands were quantified by ImageQuant software, version 1.2, and charted in Fig. S3.
The results of this assay were similar using (KH3+KH4) (Fig. 3B) or (KH2+KH3) (Fig. 3C). Competition occurred over a range of spacer lengths, but was most efficient at 4 or 5 bases (Fig. 3B, lanes 9 and 10; also Fig. 3C, lanes 8 and 9; Fig. S3). Oligonucleotides with single KH-binding sequences competed inefficiently (Fig. 3B, lanes 4, 5, 15, and 16; Fig. 3C, lanes 4, 5, 14, and 15), arguing that high affinity protein binding required both embedded SELEX sequences. Surprisingly, the no-spacer oligonucleotide (Spacer0) still competed efficiently with the probe for binding (Fig. 3B, lane 6). The structures of DNA-engaged FBP(KH3+KH4) and other KH-domain-nucleic acid complexes do not allow 2 KH-domains to simultaneously engage immediately juxtaposed pentameric sequences; therefore, either Spacer0 must toggle between KH-domains by sliding, transiting through out-of-register intermediates, or some unanticipated conformational change must allow simultaneous occupancy by both KH-domains. When embedded in T's, pairs of KH-binding sequences bound FBP more strongly and were relatively insensitive to spacing (data not shown). These results show that spacer composition and length tune FBP binding.
FBP Is Positioned at FUSE Through SELEX-Like Sequences.
Together the SELEX results and EMSA analysis of FUSE subsequences suggest that FBP KH domains 2–4 each interact with a similar set of partially degenerate pentamers, and KH1 recognizes a different set of pentamers. The local nucleotide composition and spacer distances modify the net binding affinity.
FBP may bind in many registers to a series of energetically degenerate, overlapping and nested binding sites. To determine whether FBP has a preferred register within the FUSE region, and to identify which FUSE-bases were important to bind full-length FBP, a missing-base interference assay was developed similar to previous techniques using depurinated/depyrimidinated probes to identify contacted bases (23). This assay (Fig. S4) used a single end-labeled 87-mer FUSE oligonucleotide with random abasic sites occurring on average once per molecule (SI Methods).
Oligodeoxynucleotides competent to bind FBP were recovered from protein-DNA complexes separated by EMSA from unbound/incompetent probe (Fig. 4A Left), cleaved at abasic sites, and compared on sequencing gels (Fig. 4B). The FBP-bound probe (lanes 2 and 3) showed 2 regions of strong interference bracketing 2 regions of lesser interference. The 4 regions of interference match the number of KH domains in FBP. These regions were approximately evenly spaced, and the total span of all contacts was 30–32 bases. If each KH-domain bound 5 bases (as suggested by SELEX), and each region is separated by 4 or 5 bases (optimal), then the span of a FUSE footprint (4 pentamers and 3 spacers) would be 32–35 nt. Note that both regions of strong interference included the triplet “TGT” present in all of the SELEX consensus sequences; moreover, the most upstream contacts occurred within the sequence “TGTGT” highly enriched by KH1 in SELEX reactions. The weaker regions lacking TGT were poorly delimited, as if the internal KH-domains (KH2+KH3) were more mobile on the DNA than the bracketing domains (KH1, KH4).
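The 32–35 nt estimate in the preceding paragraph is plain arithmetic over the number of contact sites, the length of each site, and the spacer length. A minimal sketch (the function and its parameter names are ours, chosen for illustration):

```python
def footprint_span(n_sites: int = 4, site_len: int = 5, spacer_len: int = 4) -> int:
    """Total bases spanned by n_sites contact blocks separated by equal spacers."""
    return n_sites * site_len + (n_sites - 1) * spacer_len


if __name__ == "__main__":
    # Four pentamers with three spacers of 4 or 5 bases each.
    spans = [footprint_span(spacer_len=s) for s in (4, 5)]
    print(spans)
```

With 4-base spacers the span is 20 + 12 = 32 nt, and with 5-base spacers it is 20 + 15 = 35 nt, slightly longer than the observed 30–32 bases of contact.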
Fig. 4. Missing-base interference footprinting. (A) Preparative gel shift of FBP-FUSE (Left) (3 μmol of FBP; ≈0.1 μmol of probe), and FIR-FUSE or FIR-FBP-FUSE (Right) (0.5 μmol of FIR ± 29 nmol of FBP; ≈0.1 μmol of probe). (Left) A longer exposure than shown at Right. Probe was recovered from all bands and processed through the remainder of the assay (SI Methods). (B) Interference footprint of FBP on FUSE. Compare samples that bound FBP (lane 3) with free (lane 4) and input (lane 2) DNA. Regions with strong interference (thick bars) and weak interference (thin bars) are shown on both the gel and the sequence (5′ to 3′ starting at the bottom). No other interfering regions were seen on a 20% gel (data not shown). (C) Interference footprints of FIR alone (annotated in red, lane 3) and FIR+FBP (green, lane 6) on FUSE 87-mer; FBP unsupershifted by FIR is also shown (blue, lane 5). The EMSAs for lanes 2–3 and lanes 5–7 used the same input probe (lane 4). Reinforcing (“hypersensitive”) sites are marked with bullets. The same sample was run on 7% (Left) and 20% (Right) denaturing PA gels. (D) Comparison of footprints. Strongly footprinted regions from B are shown in gray.
Based on contact points, SELEX and spacing analyses, we surmise the optimal sequence for full-length FBP to be: TTGTa(N)4/5TYGTa(N)4/5TYGTa(N)4/5KTGY (Y = T or C; K = T or G).
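To make the proposed consensus concrete, it can be translated into a regular expression and used to scan a sequence. The sketch below is our own illustration, not a tool from the study. It assumes the input string is the strand that FBP would bind, written 5′ to 3′; it treats the lowercase "a" as optional by default (any base) and as required when `strict_a=True`; and the test sequence is synthetic. All names are hypothetical.

```python
import re


def consensus_regex(strict_a: bool = False) -> "re.Pattern[str]":
    """Regex for TTGTa(N)4/5TYGTa(N)4/5TYGTa(N)4/5KTGY (Y = T/C, K = T/G)."""
    a = "A" if strict_a else "[ACGT]"  # lowercase 'a' is only weakly preferred
    spacer = "[ACGT]{4,5}"             # 4 or 5 unconstrained bases
    core = (f"TTGT{a}{spacer}"
            f"T[CT]GT{a}{spacer}"
            f"T[CT]GT{a}{spacer}"
            f"[GT]TG[CT]")
    # Lookahead so that candidate sites starting at every position are examined.
    return re.compile(f"(?=({core}))")


def find_sites(seq: str, strict_a: bool = False):
    """Return (start, matched_sequence) for each start position with a match."""
    pattern = consensus_regex(strict_a)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Synthetic example: consensus blocks embedded in oligo-dA spacers.
    synthetic = ("CC" + "TTGTA" + "AAAA" + "TCGTA" + "AAAAA"
                 + "TTGTA" + "AAAA" + "GTGC" + "GG")
    print(find_sites(synthetic))
    # A T-to-A change in the first block abolishes the match.
    print(find_sites(synthetic.replace("TTGTA", "TAGTA", 1)))
```

An exact-match regex is deliberately strict. Because natural sites such as c-myc FUSE deviate from the consensus, a scoring scheme that tolerates mismatches would be needed for genomic searches; the regex only shows how the consensus notation maps onto sequence.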
FIR Modifies FBP's Interaction with FUSE.
FIR represses transcription after binding with FBP and FUSE (9), although in vitro FIR binds with FUSE weakly. FBP augmented FIR binding (Fig. S5), presumably through their previously described protein–protein interaction (20, 24). Missing-base interference identified nucleic acid contact points distinguishing FBP-FUSE, FIR-FUSE, and FBP-FIR-FUSE complexes (Fig. 4 A Right and C). Fig. 4C, lane 3 (both gels) shows the interference pattern of FIR alone; this footprint enveloped FBP's and consisted of 3 regions of interference, 1 entirely 5′ of FBP's footprint (Fig. 4D). Most of the contacted bases were Ts. Interestingly, “reinforcement” was seen at several purines (bullets). The increased stability of FIR-FUSE complexes missing these purines indicates steric hindrance with these bases; alternatively, FUSE may form a FIR-nonbinding structure that requires these purines to fold.
FBP-FIR-FUSE (ternary) complexes were separated by EMSA from the FBP-FUSE (binary) complexes (Fig. 4A Right). Surprisingly, the footprint of the ternary complex (Fig. 4C, lane 6) was contracted relative to the FIR-alone footprint (Fig. 4C, lane 3). Although the upstream, i.e., 3′, FBP+FIR boundary was extended slightly, the strong downstream FIR contacts were entirely lost. Apparently, FBP provoked either a major reorganization or repositioning of FIR on FUSE. Equally surprising, the contacts within the binary complexes formed in the presence of FIR (Fig. 4C, lane 5) were distinguishable from those of the binary complexes formed in its absence (Fig. 4B). These differences were not easily explained by combination or superimposition of the FBP-FIR-FUSE and FBP-FUSE (in the absence of FIR) footprints. We speculate that dissociation of FIR from an FBP-FIR-FUSE ternary complex yielded a metastable FBP-FUSE complex that slowly relaxed to the FIR-independent, FBP-FUSE ground state.
Site Selectivity of Full-Length FBP on Full-Length FUSE.
Tight binding of FBP to FUSE might reflect the summation of binding at multiple degenerate sites, or alternatively, binding at 1 or 2 predominant sites. In the case of multiple, equivalent sites, the FBP-contact point analysis would reveal the overlap of all registers, and (i) maximal overlap would be expected in the central region of FUSE, (ii) shortening the probe would decrease the number of contributing frames at the ends and reduce overall binding, (iii) single point mutations would leave most frames competent and marginally reduce overall binding, and (iv) mutations within the central portion of the FUSE region where more degenerate binding sites overlap would attenuate binding more than mutations at the fringes of the melted zone, where there would be fewer sites because of the dsDNA-ssDNA boundary. If, however, FBP binds mainly at a single site, then (i) maximum protection would occur within the predominant site at critical bases to bind KH-motifs, (ii) shortening the probe would have little effect unless encroaching on that site, and (iii) point mutations of the critical bases within the predominant site should dramatically lessen binding.
A complex formed between FBP and an 87-mer FUSE probe including the entire melting segment was challenged with a 47-mer competitor that included all of the bases contacted by FBP or by cold 87-mer (SI Methods). Because the 87-mer and 47-mer were similarly effective competitors (data not shown), the longer probe must lack additional good FBP binding sites. These data show that although FBP binds with most short overlapping probes from FUSE by EMSA, when presented with the entire region, FBP settles on a single primary site.
In Vivo Test of FUSE Contacts and an Optimal FBP Binding Sequence.
To test the in vivo relevance of the predominant FBP binding site defined in vitro, episomes with wild-type or mutant FUSE elements sandwiched between divergently transcribing metallothionein promoters were used to drive EGFP expression. This arrangement ensures sufficient transcriptional supercoiling to melt FUSE upon the addition of Zn2+. The wild-type FUSE episome, pMT2(FUSE), has been demonstrated to bind FBP and FIR in vivo and to program a pulse of reporter expression (9). The mutant episome, pMT2(mutFUSE) (Fig. 5A), contained 2 T-to-A FUSE mutations, 1 in each of the 2 major contacted regions collinear with KH1 and KH4 described above. This double mutation decreased FBP-binding by >15-fold by EMSA (Fig. 5B, compare lane 2 to lane 8). The T-to-A transversions preserve AT-content, and the melting profiles of wild-type and mutant FUSE were calculated to be indistinguishable using the highly reliable Stress-Induced Duplex Destabilization (SIDD) algorithm (Fig. S6) (25).
Fig. 5. Double point mutation in FUSE reduces FBP binding in vitro and FUSE function in vivo. (A) Schematic of pMT2 (+FUSE) with pMT2(mutFUSE) mutation sites indicated (partial FUSE sequence is shown). In pMT2(-FUSE), bacterial sequence replaces FUSE to maintain the length. (B) In vitro competition for FBP binding by (wt)FUSE 87-mer and (mut)FUSE 87-mer oligonucleotides. A small amount of end-labeled wtFUSE87-mer was gel-shifted by an excess of FBP (compare lanes 1 and 11). Included in the binding reactions were increasing amounts: 1× (> [FBP]), 3×, 10×, or 30× unlabeled wtFUSE87-mer (lanes 2–5); or 1×, 3×, 10×, 30×, or 100× unlabeled mut-FUSE87-mer (lanes 6–10). Comparing the quantified (ImageQuant) shifts in lanes 2 and 8, wtFUSE binds FBP ≈15-fold more strongly than mut-FUSE87-mer. (C) Histograms of flow cytometry testing FUSE function. EGFP expression was measured 10 h after induction with 60 μM Zn2+ (x axis). The histogram for uninduced pMT2(FUSE) is overlaid in green.
Raji cells were transfected with pMT2(-FUSE), pMT2(+FUSE), or pMT2(mutFUSE) and transcription from the divergent metallothionein (MT) promoters induced with Zn2+ torsionally strained the upstream DNA (2). MT-IIA promoter-driven EGFP was monitored by flow cytometry (Fig. 5C). As reported, cells with pMT2(+FUSE) expressed much more EGFP than those with pMT2(-FUSE) (9). EGFP-expression from pMT2(mutFUSE) was indistinguishable from pMT2(-FUSE), showing that mutation of 2 key FBP contacts disabled the FUSE-effect.
To test the function of an optimal FBP-binding sequence, FUSE was swapped for a perfect consensus in the pMT2 episomes (SI Methods) and transfected into Raji cells. The reporter genes were expressed even without induction (Table S1), and were further induced with low levels of Zn2+. The increased potency of the consensus was expected because of tighter binding by FBP and because the consensus, according to SIDD, melts at lower levels of supercoiling than FUSE.
Discussion
Upon c-myc induction, FBP is recruited to supercoil-destabilized FUSE, boosts transcription, and recruits FIR. After FIR represses transcription, the level of dynamic supercoiling falls, FUSE reanneals, and FBP is released; thus, a pulse of expression is generated (9, 14, 24, 26). If such pulse generation operates at other genes, revealing the chemical and structural principles that govern FBP binding at FUSE may facilitate prediction of other FBP targets.
How the net specificity of FBP's 4 KH domains is set has not been completely determined. The NMR structure of a pair of FBP's KH domains (KH3+KH4) bound to short FUSE oligonucleotides shows a disordered linker between KH3 and KH4 that was predicted to permit variable spacing between bound sequences because of the flexibility of both the protein and the ssDNA (5). The sensitivity of FBP binding to ssDNA to both the length and base composition of the spacers is loose enough to allow FUSE-like elements to overlap or interdigitate with other regulatory sequences in a variety of contexts. Thus, the functional properties of FBP binding sites are likely to be molded by the nature and arrangement of nearby cis-elements and by the local chromatin environment.
The 4 short regions of contact between FBP and FUSE identified with missing-base interference match the number of KH domains in FBP. The outer, strongly footprinted regions each contained the sequence TGT, and SELEX yielded TGT as the core of the ideal binding sites for each of FBP's KH domains. The 2 interior, weakly footprinted regions likely indicate binding by KH2 and KH3 in multiple registers. The outermost regions bind KH1 and KH4 to “bookend” the binding site and fix the location of FBP on fully melted FUSE (Fig. S7). The region of FUSE bound by KH1 may be particularly important for FBP-FIR function, because this is the primary locale distinguishing the FBP-FIR-FUSE and FBP-FUSE footprints. FIR-FBP-FUSE ternary complexes contact more DNA than FBP-FUSE, but less than FIR-FUSE. Crichlow et al. (16) found that FIR binds to a short FUSE oligonucleotide as a dimer, and proposed a model in which FIR first binds to FBP-FUSE and then evicts FBP by looping or spooling of the DNA. The extended contacts seen with FIR alone, but not with FBP+FIR, are compatible with such a structural reorganization.
Ts were highly selected by the paired KH domains even outside of the consensus tetrads. Two nonexclusive hypotheses may explain this enrichment: first, the runs of T's may allow weak binding in multiple registers insufficient to be detected by EMSA (oligo-“dT” is not shifted in EMSA); second, T-richness may allow specific base recognition by FBP by minimizing interfering secondary DNA structures (poly-dT is the least structured of any homopolymeric ssDNA) (22).
That c-myc FUSE deviates from a perfect consensus, but is bound by FBP during activated transcription in vivo, shows that binding sites need not conform to the optimal sequence to recruit FBP. Indeed, no perfect FBP binding site is found in the human genome, where a 26/32 match would be expected to occur approximately once. Because the c-myc FUSE is a 19/32 match, we expect that there may be sites in the genome that bind FBP much more strongly. Such a higher-affinity FBP-binding site (in vitro and in vivo) that is also a stronger positive element than FUSE has been discovered in the USP29 promoter and is a 22/32 match (J.L. and D.L., unpublished data). Such tight binding sites in vivo may contribute to the slower mobility of FBP in vivo compared with other transcription regulators (27). Future studies will determine where FBP binds most stably within the genome, and what functions might be associated with these sites.
Deviations of FUSE from the ideal FBP-binding sequence may tune the level of supercoiling required to recruit, hold, or release FBP. Unlike highly sequence-specific proteins, for which binding is essentially a binary decision (bind/not bind), FBP interacts with a wide range of sequences across a wide range of affinities. A FUSE-like sequence in a torsionally susceptible region becomes dynamically and progressively destabilized as supercoiling increases. Insertion of an FBP KH-domain into a breathing segment of DNA would prop that segment open and promote further melting that would engage the next KH domain, and so forth. Thus, FBP function at a binding site would be modified by the match of the sequence with the consensus for each of the KH domains, by the base composition of the flanking and interdigitating sequences, and by the melting potential of the entire region. In principle, variation of these 3 parameters would tune a broad spectrum of FBP-responsiveness, as demonstrated here using reporters driven by FUSE and by the optimal FBP binding site.
FUSE may be the prototype for a class of promoter elements directing an FBP-driven pulse of transcription. Genomic discovery of FUSE-like promoter elements, and other FBP-binding sites, will be aided by knowledge of the consensus sequence and spacing constraints detailed here. FBP binding requires DNA melting. SIDD predictions offer one approach to find supercoil-destabilized sequences (28). These sequences may then be screened for consensus sequences to predict where FBP may bind. Accurate predictions will require definition of topological boundaries and knowledge of chromatin structure. Potential FBP-binding sites near promoters strong enough to generate dynamic supercoils may be especially good candidates for evaluation by chromatin-immunoprecipitation and/or footprinting methods.
Materials and Methods
Proteins.
GST (Glutathione S-Transferase)-tagged FBP fragments from bacteria were bound to Glutathione-Sepharose 4B (Amersham Pharmacia), washed with ATP at 37 °C, and eluted with glutathione. Full-length (His-tagged) FBP and FIR were expressed by baculovirus in Sf9 cells and purified by (Ni2+) chromatography.
Electrophoretic Mobility Shift Assay (EMSA).
Binding reactions (10–12% glycerol; 20 mM Tris·HCl, pH 8.0; 1 mM EDTA; 100 mM KCl; 0.5 mg/ml BSA; 20–100 μg/ml polyd(I-C); 1 mM DTT) were incubated at room temperature, unless otherwise indicated, and loaded onto native 6% or 8% polyacrylamide gels prerun for 30 min. Analytical gels were dried and exposed to film. Bands were scanned, and then quantified using IQ Mac v1.2 (ImageQuant) software.
SELEX.
The “imperfect hairpin template” (Fig. 2 and SI Methods) was PCR-amplified using primers Biotin-5′-GCAGTCCCGTTTCGCGAGTGC-3′ and 5′-ACGGTACCCCGTAGCGT-3′ with annealing at 65 °C and cleaned up by phenol/chloroform extraction, ethanol precipitation, and gel-purification. Recovered DNA was labeled with T4 Polynucleotide Kinase (NEB) and immobilized on streptavidin-Dynabeads (Dynal). After magnetic separation/washing, the unbiotinylated strand was eluted with 0.15 M NaOH.
Preparative EMSA (6% polyacrylamide gel; 100 ng/μl polyd(I-C) in binding reactions) used ≈2 ng of GST-tagged pairs of KH domains and enough ssDNA that <5% was shifted, i.e., >1 pmol of input. The ssDNA and other reaction components were heated to 75 °C, then placed on ice before addition of protein. Shifted DNA was recovered from the gel (SI Methods) and reamplified under the same conditions. After 4–7 rounds of selection, unbiotinylated primers were used for the final amplification. These products were phosphorylated, ligated into short chains, phosphorylated again, ligated into HincII-digested pUC19, and transformed into E. coli DH5α. The MCS and included inserts were sequenced.
Flow Cytometric Comparison of Wild-Type and Mutant FUSE Function.
FUSEmut sequence was substituted for FUSE in the creation of pMT2+ (9). Raji cells were transfected with this pMT2(mutFUSE) plasmid or its wtFUSE or (−)FUSE counterparts by electroporation (Amaxa Biosystems electroporator and Cell Line Nucleofector Kit V, using the manufacturer's instructions and settings for Raji cells). Cells were cultured and selected in RPMI medium 1640 + 10% FBS + 125 μg/ml HygroGold (Invitrogen). Metallothionein promoters were induced with 60 μM ZnSO4, and fluorescence was measured 10 h after induction using a FACScan (Becton Dickinson) flow cytometer.
Acknowledgments
We thank the Levens Laboratory, in particular Z. Nie for assistance; G. Stormo for discussions of secondary structure minimization in T-rich DNA; M. Clore for helpful discussions of protein-DNA flexibility; and B. Lewis, D. Singer and K. Zhao for critical comments. This work was supported by the Intramural Research Program of the National Institutes of Health, National Cancer Institute, Center for Cancer Research.
Footnotes
- ¹To whom correspondence should be addressed. E-mail: levens@helix.nih.gov
- Author contributions: L.R.B., H.-J.C., F.K., J.L., and D.L. designed research; L.R.B., H.-J.C., S.S., and F.K. performed research; L.R.B., H.-J.C., S.S., and J.L. contributed new reagents/analytic tools; L.R.B., H.-J.C., and D.L. analyzed data; and L.R.B. and D.L. wrote the paper.
- The authors declare no conflict of interest.
- This article is a PNAS Direct Submission.
- This article contains supporting information online at www.pnas.org/cgi/content/full/0803279105/DCSupplemental.
References
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- Johnston SD,
- Lew JE,
- Berman J
- ↵
- Mitton-Fry RM,
- Anderson EM,
- Hughes TR,
- Lundblad V,
- Wuttke DS
- ↵
- Duncan R,
- Collins I,
- Tomonaga T,
- Zhang T,
- Levens D
- ↵
- ↵
- ↵
- Michelotti GA,
- et al.
- ↵
- Duncan R,
- et al.
- ↵
- ↵
- ↵
- ↵
- ↵
- Blackwell TK,
- Weintraub H
- ↵
- ↵
- Zuker M,
- Mathews D,
- Turner D
- ↵
- Chung HJ,
- et al.
- ↵
- Lim HA
- Hertz GZ,
- Stormo GD
- ↵
- Ts'o POP,
- Rapaport SA,
- Bollum FJ
- ↵
- Brunelle A,
- Schleif RF
- ↵
- ↵
- Bi CP,
- Benham CJ
- ↵
- ↵
- Phair RD,
- et al.
- ↵
- Sheridan SD,
- Benham CJ,
- Hatfield GW