Interactions between RNA polymerase and the core recognition element are a determinant of transcription start site selection
Edited by Lucia B. Rothman-Denes, The University of Chicago, Chicago, IL, and approved April 7, 2016 (received for review February 26, 2016)

Significance
For all cellular RNA polymerases, the position of the transcription start site (TSS) relative to core promoter elements is variable. Furthermore, environmental conditions and regulatory factors that affect TSS selection have profound effects on levels of gene expression. Thus, identifying determinants of TSS selection is important for understanding gene expression control. Here we identify a previously undocumented determinant for TSS selection by Escherichia coli RNA polymerase. We show that sequence-specific protein–DNA interactions between RNA polymerase core enzyme and a sequence element in unwound promoter DNA, the core recognition element, modulate TSS selection.
Abstract
During transcription initiation, RNA polymerase (RNAP) holoenzyme unwinds ∼13 bp of promoter DNA, forming an RNAP-promoter open complex (RPo) containing a single-stranded transcription bubble, and selects a template-strand nucleotide to serve as the transcription start site (TSS). In RPo, RNAP core enzyme makes sequence-specific protein–DNA interactions with the downstream part of the nontemplate strand of the transcription bubble (“core recognition element,” CRE). Here, we investigated whether sequence-specific RNAP–CRE interactions affect TSS selection. To do this, we used two next-generation sequencing-based approaches to compare the TSS profile of WT RNAP to that of an RNAP derivative defective in sequence-specific RNAP–CRE interactions. First, using massively systematic transcript end readout, MASTER, we assessed effects of RNAP–CRE interactions on TSS selection in vitro and in vivo for a library of 4^7 (∼16,000) consensus promoters containing different TSS region sequences, and we observed that the TSS profile of the RNAP derivative defective in RNAP–CRE interactions differed from that of WT RNAP, in a manner that correlated with the presence of consensus CRE sequences in the TSS region. Second, using 5′ merodiploid native-elongating-transcript sequencing, 5′ mNET-seq, we assessed effects of RNAP–CRE interactions at natural promoters in Escherichia coli, and we identified 39 promoters at which RNAP–CRE interactions determine TSS selection. Our findings establish that RNAP–CRE interactions are a functional determinant of TSS selection. We propose that RNAP–CRE interactions modulate the position of the downstream end of the transcription bubble in RPo, and thereby modulate TSS selection, which involves transcription bubble expansion or transcription bubble contraction (scrunching or antiscrunching).
- RNA polymerase
- transcription start site selection
- promoter
- transcription bubble
- transcription initiation
Transcription initiation consists of a number of biochemical steps leading to formation of a phosphodiester bond between a nucleoside triphosphate (NTP) bound in the RNA polymerase (RNAP) active-center initiating NTP binding site (i site) and an NTP bound in the RNAP active-center extending NTP binding site (i+1 site) (1–3). For bacterial RNAP, promoter-specific initiation requires the RNAP core enzyme (subunit composition α2ββ'ω) to associate with a σ factor, forming the RNAP holoenzyme (subunit composition α2ββ'ωσ). The σ factor contains determinants for sequence-specific protein–DNA interactions with four core promoter elements: the −35 element, the extended −10 element, the −10 element, and the discriminator element (4).
During transcription initiation, RNAP holoenzyme unwinds promoter DNA to form an RNAP-promoter open complex (RPo) containing an unwound, single-stranded “transcription bubble.” The process of promoter unwinding begins within the promoter −10 element and propagates downstream, enabling single-stranded nucleotides at the downstream end of the transcription bubble template strand to occupy the RNAP active center i and i+1 sites (Fig. 1A) (1–3). In particular, in RPo, the second-most downstream nucleotide of the transcription bubble template strand occupies the active center i site and serves as the transcription start site (TSS), and the downstream-most nucleotide of the transcription bubble template strand occupies the active center i+1 site. We designate the template-strand nucleotide at the TSS position as TSST (Fig. 1, base in pink) and the template-strand nucleotide at the next base pair as TSS+1T (Fig. 1, base in red).
Fig. 1. Analysis of effects of sequence-specific RNAP–CRE interactions by MASTER (11). (A) RPo for TSS at position 7. Gray, RNAP; yellow, σ; blue boxes, −10 element nucleotides; purple boxes, discriminator nucleotides; black boxes, DNA nucleotides (non–template-strand nucleotides above template-strand nucleotides; nucleotides downstream of −10 element numbered); pink box, TSST; red box, TSS+1T; i and i+1, RNAP active-center initiating NTP binding site and extending NTP binding site; red “G,” GCRE. (B, Upper) DNA fragment carrying the MASTER template library lacCONS-N7. Promoter −35 and −10 elements are indicated. Randomized nucleotides are green and 15-nt barcode sequence in the transcribed region is yellow. (Lower Right) 5′ RNA-seq analysis of RNA products generated from the MASTER-N7 template library in vitro. The sequence of the barcode is used to assign the RNA product to an N7 region and the sequence of the 5′ end is used to define the TSS. (Lower Left) Structural organization of downstream end of transcription bubble in RPo for promoter containing GCRE formed with WT RNAP (Upper) or RNAP derivative carrying the βD446A substitution (Lower). Black “βD446,” RNAP β-subunit residue that makes sequence-specific favorable interaction with GCRE. Black “βA446,” RNAP β-subunit residue in mutant RNAP defective in sequence-specific interaction with GCRE. Other rendering and colors as in A.
The position of the TSS relative to the position of the promoter −10 element is variable (5–11). TSS selection preferentially occurs at the position 7-bp downstream of the promoter −10 element, but can occur over a range of at least five positions, encompassing the positions 6-, 7-, 8-, 9-, or 10-bp downstream of the promoter −10 element. Thus, there must be flexibility in the structure of RPo that enables the position of the TSS to vary relative to the position of the −10 element. We previously proposed that variability in TSS selection is mediated by variability in the size of the unwound transcription bubble (Fig. S1A) (11–13). According to this model, RPo generally contains a 13-bp unwound transcription bubble that places the template-strand nucleotide 7-bp downstream of the −10 element in the i site and places the template-strand nucleotide 8-bp downstream of the −10 element in the i+1 site (Fig. 1A and Fig. S1A) (TSS = 7). For TSS selection to occur at positions further downstream, the downstream DNA duplex is unwound, the unwound DNA is pulled into and past the RNAP active center, and the unwound DNA is accommodated as single-stranded DNA bulges within the transcription bubble, yielding a “scrunched” complex (Fig. S1A) (TSS = 8 and TSS = 9). For TSS selection to occur at positions further upstream, the opposite occurs: downstream DNA is rewound, downstream DNA is extruded from the RNAP active center, and the extrusion of DNA from the RNAP active center is accommodated by stretching DNA within the transcription bubble, yielding an “antiscrunched” complex (Fig. S1A) (TSS = 6). According to this model, any protein–DNA or protein–protein interaction that affects the energy landscape for transcription bubble expansion or contraction (scrunching or antiscrunching) in RPo potentially could modulate TSS selection (13, 14).
Fig. S1. Model for TSS selection and hypothesis for effects of RNAP–CRE interactions on TSS selection. (A) Model for TSS selection: changes in TSS selection result from changes in DNA scrunching and antiscrunching in RPo (11–13). TSS = 6 through TSS = 9, RPo engaged in TSS selection at positions 6 through 9 nt downstream of −10 element; gray, RNAP; yellow, σ; blue boxes, −10 element nucleotides; purple boxes, discriminator nucleotides; black boxes, DNA nucleotides (non–template-strand nucleotides above template-strand nucleotides; nucleotides downstream of −10 element numbered); pink box, TSST; red box, TSS+1T; i and i+1, RNAP active-center initiating NTP binding site and extending NTP binding site. Scrunching is indicated by bulged-out nucleotides. Antiscrunching is indicated as a stretched nucleotide–nucleotide linkage. Exact positions and conformations of scrunched nucleotides of the nontemplate and template DNA strands remain to be determined. (B) Hypothesis for effects of RNAP–CRE interactions on TSS selection: RNAP–CRE interactions modulate TSS selection by modulating the extent of scrunching and antiscrunching in RPo. TSS = 6 through TSS = 9, RPo engaged in TSS selection at positions 6 through 9 nt downstream of −10 element at promoters that contain GCRE (i.e., promoters that contain G at the non–template-strand position opposite TSS+1T). Red “G,” GCRE. Black “βD446,” RNAP β-subunit residue that makes sequence-specific favorable interaction with GCRE. Other rendering and colors as in A.
In the structure of RPo, the RNAP core makes direct protein–DNA interactions with the non–template-strand DNA segment at the downstream part of the transcription bubble (15); this DNA segment has been designated the “core recognition element” (CRE; Fig. 1A) (15). RNAP–CRE interactions with the non–template-strand nucleotide at the extreme downstream end of the transcription bubble (i.e., TSS+1NT) are sequence specific, with preference for the base G (GCRE) (Fig. 1, red G) (15).
It has been proposed that sequence-specific RNAP–GCRE interactions facilitate promoter unwinding to form the transcription bubble, stabilize the unwound transcription bubble, and define the downstream end of the transcription bubble (15). According to this proposal, sequence-specific RNAP–GCRE interactions should affect the energy landscape for transcription bubble expansion or contraction (scrunching or antiscrunching) in RPo and therefore potentially could affect TSS selection (Fig. S1B). Here we tested the proposal that sequence-specific RNAP–GCRE interactions affect TSS selection. To do this, we used high-throughput sequencing–based approaches to compare TSS selection by WT RNAP to TSS selection by a mutant RNAP defective in sequence-specific RNAP–GCRE interactions. Our results demonstrate that sequence-specific RNAP–CRE interactions are a determinant of TSS selection.
Results
Sequence-Specific RNAP–CRE Interactions Are a Determinant of TSS Selection in Vitro.
In crystal structures of RNAP–promoter open complexes, residue D446 of the RNAP β subunit makes direct H-bonded interactions with Watson–Crick H-bond–forming atoms of G at GCRE (15). The interactions by βD446 determine specificity at GCRE. Thus, substitution of βD446 by alanine eliminates the ability of RNAP to distinguish A, G, C, and T at the GCRE position (16). Accordingly, an RNAP derivative carrying the βD446A substitution can serve as a reagent to assess the functional significance of sequence-specific RNAP-GCRE interactions (Fig. 1B, Lower Left).
To define the contribution of sequence-specific RNAP–GCRE interactions to TSS selection, we used a high-throughput sequencing–based methodology termed massively systematic transcript end readout (MASTER) (11). MASTER entails the construction of a template library that contains up to 4^10 (∼1,000,000) bar-coded sequences, production of RNA transcripts from the template library in vitro or in vivo, and analysis of transcript ends using high-throughput sequencing (11, 13).
To analyze the effect of disrupting sequence-specific RNAP–GCRE interactions on TSS selection, we used a MASTER template library, lacCONS-N7, that contained 4^7 (∼16,000) sequence variants at positions 4–10 bp downstream of the −10 element of a consensus Escherichia coli σ70-dependent promoter (Fig. 1B, Upper) (11). We performed in vitro transcription experiments with the lacCONS-N7 template library, using, in parallel, WT RNAP (RNAP-βWT) or the RNAP derivative containing the βD446A substitution (RNAP-βD446A). RNA products generated in the transcription reactions were isolated and analyzed using high-throughput sequencing of RNA barcodes and 5′ ends (5′ RNA-seq) to define, for each RNA product, the template that produced the RNA and the TSS position (Fig. 1B, Lower Right). For each sequence variant, we calculated the percentage of reads starting at each position within the randomized TSS region, %TSSY = 100 × (no. reads starting at position Y/total no. reads starting at positions 4–10).
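As a quick check on the library size, the following Python sketch enumerates every 7-nt variant of a randomized region and counts those carrying a given base at one fixed position. The function names and the 0-based indexing into the randomized region are illustrative assumptions and are not taken from the published analysis.

```python
from itertools import product

BASES = "ACGT"

def enumerate_variants(length=7):
    """Return every sequence of the given length over A, C, G, T."""
    return ["".join(p) for p in product(BASES, repeat=length)]

def count_with_base(variants, index, base):
    """Count variants carrying `base` at 0-based `index` of the region."""
    return sum(1 for v in variants if v[index] == base)

variants = enumerate_variants(7)
print(len(variants))                      # 4**7 = 16384
# Index 4 corresponds to position 8 when the region spans positions 4-10.
print(count_with_base(variants, 4, "G"))  # 4**6 = 4096
```

A complete seven-position library therefore has 16,384 members, and one quarter of them (4,096) carry any particular base at a fixed position.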
To determine the effect of disrupting RNAP–GCRE interactions on TSS selection, we considered TSS positions where TSS+1NT is included within the randomized region of the MASTER template library (i.e., TSS positions 6, 7, 8, and 9). We first calculated %TSS values for each of these positions on the basis of the identity of TSS+1NT. Thus, for each TSS position, we averaged the %TSS values for the ∼4,000 templates having A at TSS+1NT, the ∼4,000 templates having C at TSS+1NT, the ∼4,000 templates having G at TSS+1NT, and the ∼4,000 templates having T at TSS+1NT. Next, we calculated the difference in these %TSS values for reactions performed with RNAP-βWT vs. reactions performed with RNAP-βD446A. We observed that, for all four tested TSS positions (positions 6, 7, 8, and 9), the βD446A substitution decreased the %TSS when TSS+1NT was G (1.3–7.3% decreases; Fig. 2A, top row of table). In contrast, for three of the four tested TSS positions (positions 6, 7, and 8), the βD446A substitution did not decrease the %TSS when TSS+1NT was A, C, or T, and, for the fourth position (position 9), the βD446A substitution did not decrease the %TSS, or decreased the %TSS by smaller amounts, when TSS+1NT was A, C, or T (Fig. 2A, bottom three rows of table).
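The grouping and differencing described above can be sketched as follows. This is a minimal illustration, assuming that %TSS values are already available as a dictionary keyed by the 7-nt TSS-region sequence; the data layout, function names, and toy values are hypothetical and do not come from the study.

```python
from collections import defaultdict

REGION_START = 4  # first randomized position downstream of the -10 element

def mean_pct_tss_by_next_base(pct_tss, tss_pos):
    """Average %TSS at tss_pos, grouped by the base at position tss_pos + 1.

    pct_tss maps each 7-nt TSS-region sequence (nontemplate strand,
    positions 4-10) to a dict of {position: %TSS}.
    """
    idx = tss_pos + 1 - REGION_START  # string index of the TSS+1 base
    sums, counts = defaultdict(float), defaultdict(int)
    for seq, by_pos in pct_tss.items():
        base = seq[idx]
        sums[base] += by_pos.get(tss_pos, 0.0)
        counts[base] += 1
    return {b: sums[b] / counts[b] for b in sums}

def wt_minus_mutant(pct_wt, pct_mut, tss_pos):
    """Difference in mean %TSS (WT minus mutant) for each TSS+1 base."""
    wt = mean_pct_tss_by_next_base(pct_wt, tss_pos)
    mut = mean_pct_tss_by_next_base(pct_mut, tss_pos)
    return {b: wt[b] - mut[b] for b in wt if b in mut}

# Toy values (hypothetical) for two templates differing only at position 8.
toy_wt = {"AACGGCA": {7: 60.0}, "AACGACA": {7: 40.0}}
toy_mut = {"AACGGCA": {7: 30.0}, "AACGACA": {7: 40.0}}
print(wt_minus_mutant(toy_wt, toy_mut, 7))  # {'G': 30.0, 'A': 0.0}
```

A positive value means that the mutant RNAP used that TSS position less often than WT RNAP for templates with the given base at TSS+1NT.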
Fig. 2. Effects of disrupting RNAP–GCRE interactions in vitro: analysis by MASTER. (A) Effect of sequence at TSS+1NT on %TSS for RNAP-βWT vs. RNAP-βD446A. Table lists the difference in %TSS (%TSS for RNAP-βWT − %TSS for RNAP-βD446A) at positions 6, 7, 8, or 9 for TSS-regions carrying G, A, C, or T at TSS+1NT. (B) Sequence preferences for TSS+1NT. Sequence logo (33) for TSS+1NT of above-threshold TSS (Left) and TSS that exhibited a large, ≥20%, reduction in %TSS in reactions performed with RNAP-βD446A vs. reactions performed with RNAP-βWT (Right).
We identified 1,230 TSS positions (5.6% of the 21,872 above-threshold TSS positions located 6-, 7-, 8-, or 9-bp downstream of the −10 element) that exhibited large, ≥20%, reductions in %TSS in reactions performed with RNAP-βD446A vs. reactions performed with RNAP-βWT. For these 1,230 TSS positions with large, ≥20%, CRE effects, ∼90% contained G at TSS+1NT (Fig. 2B, top row, Right), whereas, for the total sample of 21,872 TSS positions, there were no detectable sequence preferences at position TSS+1NT (Fig. 2B, top row, Left). Enrichment of G at TSS+1NT for TSS positions with large, ≥20%, CRE effects was observed for TSS positions located 6-, 7-, 8-, or 9-bp downstream of the −10 element (TSS = 6, 7, 8, or 9) (Fig. 2B, bottom four rows). In summary, the overwhelming majority of TSS positions that exhibit large, ≥20%, CRE effects have G at TSS+1NT.
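The selection of TSS positions with large CRE effects, and the fraction of those carrying G at TSS+1NT, can be sketched as below. The record layout and function name are hypothetical. The sketch also assumes that the ≥20% reduction is measured in percentage points of %TSS (WT minus mutant) and that the above-threshold filter is applied to the WT value; both are interpretations of the text rather than confirmed details.

```python
def large_effect_g_fraction(records, threshold=20.0, effect=20.0):
    """Summarize large CRE effects among above-threshold TSS positions.

    records: iterable of (next_base, pct_wt, pct_mut) tuples, one per
    TSS position, where next_base is the nontemplate base at TSS+1.
    Returns (n_above_threshold, n_large_effect, fraction_with_G).
    """
    above = [r for r in records if r[1] >= threshold]
    large = [r for r in above if r[1] - r[2] >= effect]
    if not large:
        return len(above), 0, None
    n_g = sum(1 for r in large if r[0] == "G")
    return len(above), len(large), n_g / len(large)

# Toy records (hypothetical values, not data from the study).
toy = [("G", 60.0, 30.0), ("A", 50.0, 45.0), ("G", 25.0, 0.0),
       ("T", 10.0, 0.0), ("C", 40.0, 15.0)]
n_above, n_large, frac_g = large_effect_g_fraction(toy)
print(n_above, n_large)  # 4 3
```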
To validate the MASTER results, we performed further analyses of two TSS region sequences that exhibited large, ≥20%, CRE effects, contained a TSS at the most common position (position 7), and contained G at TSS+1NT (position 8) (Fig. 3). For each of these two TSS region sequences, we prepared templates containing G, A, C, or T at position TSS+1NT, performed in vitro transcription experiments with RNAP-βWT or RNAP-βD446A, and analyzed RNA products by primer extension. For each of the two sets of constructs, the primer-extension results matched the MASTER results. A large, ∼30%, CRE effect was observed when TSS+1NT was G but not when TSS+1NT was A, C, or T (Fig. 3).
Fig. 3. Effects of disrupting RNAP–GCRE interactions in vitro: analysis by primer extension. (Left) Primer-extension results. RNA products were generated in reactions performed with RNAP-βWT or RNAP-βD446A and placCONS templates carrying TSS region sequences (in green) of AACGNCA (A) or CGCTNAT (B), where N is G, A, C, or T. Bands corresponding to a TSS at position 7 are indicated. (Right) Table lists the difference in %TSS (%TSS for reactions with RNAP-βWT − %TSS for reactions with RNAP-βD446A) at position 7 for templates carrying a G, A, C, or T at position 8 calculated by primer extension or calculated by MASTER.
The results in Figs. 2 and 3 establish that disrupting sequence-specific RNAP–GCRE interactions affects TSS selection in vitro in a manner that correlates with the presence and position of GCRE in the TSS region. We conclude that sequence-specific RNAP–CRE interactions are a determinant of TSS selection in vitro.
Sequence-Specific RNAP–CRE Interactions Are a Determinant of TSS Selection in Vivo.
Analysis of 4^7 (∼16,000) consensus promoter derivatives.
To define the contribution of sequence-specific RNAP–GCRE interactions to TSS selection in vivo, we used merodiploid native-elongating transcript sequencing (mNET-seq) (16). mNET-seq involves selective analysis of transcripts associated with an epitope-tagged RNAP in the presence of a mixed population of epitope-tagged RNAP and untagged RNAP (Fig. 4A). In prior work, we used mNET-seq to determine the effect of sequence-specific RNAP–GCRE interactions on pausing during elongation (16). In this work, we used a variant of mNET-seq, 5′ mNET-seq, to determine the effect of sequence-specific RNAP–GCRE interactions on TSS selection (Fig. 4A). To do this, we introduced into cells a plasmid encoding 3xFLAG-tagged βWT or 3xFLAG-tagged βD446A, isolated RNA products associated with RNAP-βWT or RNAP-βD446A by immunoprecipitation, converted RNA 5′ ends to cDNAs, and performed high-throughput sequencing (Fig. 4A).
Fig. 4. Effects of disrupting RNAP–GCRE interactions in vivo: 5′ mNET-seq analysis of 4^7 (∼16,000) consensus promoter derivatives. (A) Steps in 5′ mNET-seq analysis of TSS selection from plasmid-borne MASTER template library: (Top) RNAP derivatives in cells (the blue RNAP derivative with asterisk is RNAP-βD446A); (Middle) RNAPs on the same MASTER template in four cells (RNA products in blue are associated with RNAP-βD446A); and (Bottom) isolation of RNA products after immunoprecipitation with anti-FLAG affinity gel and sequencing analysis of RNA 5′ ends. In this example, TSS selection at the T in the middle of the randomized TSS region is decreased with the mutant RNAP derivative. (B) Effect of sequence at TSS+1NT on %TSS for RNAP-βWT vs. RNAP-βD446A. Table lists the difference in %TSS (%TSS for RNAP-βWT − %TSS for RNAP-βD446A) at positions 6, 7, 8, or 9 for TSS-regions carrying G, A, C, or T at TSS+1NT. (C) Sequence preferences for TSS+1NT. Sequence logo (33) for TSS+1NT of above-threshold TSS positions located 6–9 bp downstream of the −10 element (Left) and TSS positions located 6–9 bp downstream of the −10 element that exhibited a large, ≥20%, reduction in %TSS in 5′ mNET-seq analysis of RNAP-βD446A vs. 5′ mNET-seq analysis of RNAP-βWT (Right).
To enable direct comparison of in vivo and in vitro results, we performed 5′ mNET-seq using the same MASTER template library of 4^7 (∼16,000) consensus core promoter derivatives that we used for in vitro analysis. The results of MASTER in vivo (Fig. 4 B and C) matched the results of MASTER in vitro (Fig. 2). For all four tested TSS positions (positions 6, 7, 8, and 9), the βD446A substitution decreased the %TSS when TSS+1NT was G (0.6–7.3% decreases) (Fig. 4B, top row of table). In contrast, for three of the four tested TSS positions (positions 6, 7, and 8), the βD446A substitution did not decrease the %TSS when TSS+1NT was A, C, or T, and, for the fourth position (position 9), the βD446A substitution decreased the %TSS by smaller amounts when TSS+1NT was A, C, or T (Fig. 4B, bottom three rows of table). Furthermore, we identified 860 TSS positions (4.3% of the 20,217 above-threshold TSS positions located 6-, 7-, 8-, or 9-bp downstream of the −10 element) with large, ≥20%, CRE effects. For these 860 TSS positions with large, ≥20%, CRE effects, ∼80% contained G at TSS+1NT (Fig. 4C, Right), whereas, for the total sample of 20,217 TSS positions, there were no detectable sequence preferences at position TSS+1NT (Fig. 4C, Left).
The results establish that disrupting sequence-specific RNAP–GCRE interactions affects TSS selection in vivo in a manner that correlates with the presence and position of GCRE in the TSS region. We conclude that sequence-specific RNAP–CRE interactions are a determinant of TSS selection in vivo.
Analysis of E. coli transcriptome.
Having shown by MASTER that sequence-specific RNAP–CRE interactions are a determinant of TSS selection in the context of a consensus core promoter in vivo, we next assessed the contribution of sequence-specific RNAP–CRE interactions to TSS selection in the context of natural promoters in vivo in E. coli. (The primers used in the in vivo MASTER analysis by 5′ mNET-seq shown in Fig. 4 provided information only about transcripts from the synthetic consensus promoter derivatives. This is because the primers used for synthesis of the first cDNA strand annealed only to transcripts produced from the synthetic consensus promoter derivatives. A separate experiment, with primers that enable generation of cDNAs from transcripts produced from natural E. coli promoters, was necessary to provide information about transcripts from natural E. coli promoters. Therefore, to analyze transcripts from natural E. coli promoters, the primers used for synthesis of the first cDNA strand carried nine randomized nucleotides at the 3′ end.)
Using data from experiments performed with RNAP-βWT, we identified 1,500 above-threshold TSS positions associated with natural promoters in E. coli. Of these 1,500 TSS positions, we identified 44 TSS positions that exhibited large, ≥20%, CRE effects (Table S1); 39 of these 44 (∼90%) contained G at TSS+1NT (Fig. 5B, Right, and Table S1), whereas for the total sample of 1,500 above-threshold TSS, there were no detectable sequence preferences at TSS+1NT (Fig. 5B, Left).
Fig. 5. Effects of disrupting RNAP–GCRE interactions in vivo: 5′ mNET-seq analysis of E. coli transcriptome. (A) Steps in 5′ mNET-seq analysis of natural promoters: (Top) RNAP derivatives in cells (the blue RNAP derivative with asterisk is RNAP-βD446A); (Middle) RNAPs on the same transcription unit in four cells (RNA products in blue are associated with RNAP-βD446A); and (Bottom) isolation of RNA products after immunoprecipitation with anti-FLAG affinity gel and sequencing analysis of RNA 5′ ends. In this example, TSS selection at genome coordinate labeled “a” is decreased with the mutant RNAP derivative. (B) Sequence preferences for TSS+1NT. Sequence logo (33) for TSS+1NT of above-threshold TSS associated with natural promoters (Left) and TSS associated with natural promoters that exhibited a large, ≥20%, reduction in %TSS in 5′ mNET-seq analysis of RNAP-βD446A vs. RNAP-βWT (Table S1). (C) Primer-extension analysis of TSS selection in vitro from natural promoters. RNA products were generated in reactions performed with RNAP-βWT or RNAP-βD446A and templates carrying PsecE (Left) or PhemC (Right). The sequence of each promoter, including the −10 element and 12 downstream bp, is provided. In the case of PsecE, bands corresponding to a TSS at A7 or G8 are indicated. In the case of PhemC, bands corresponding to a TSS at A6 or G8 are indicated. Base in red is GCRE associated with the TSS at A7 of PsecE or with the TSS at G8 of PhemC.
Table S1. TSS positions in natural promoters in E. coli that exhibited large, ≥20%, CRE effects
To validate the 5′ mNET-seq results, we performed primer-extension experiments with two E. coli promoters that contained a TSS that exhibited a large, ≥20%, CRE effect and contained G at TSS+1NT: PsecE and PhemC (Table S1). We generated linear templates carrying PsecE or PhemC, performed in vitro transcription assays using RNAP-βWT or RNAP-βD446A, and analyzed TSS selection by primer extension (Fig. 5C). For each promoter, two prominent start sites were observed in reactions with RNAP-βWT. In the case of PsecE, ∼60% of the transcripts started at an A located 7-bp downstream of the predicted −10 element (A7) and ∼40% of the transcripts started at a G located 8-bp downstream (G8) (Fig. 5C, Left). In the case of PhemC, ∼30% of the transcripts started at an A located 6-bp downstream of the predicted −10 element (A6) and ∼70% of the transcripts started at a G located 8-bp downstream (G8) (Fig. 5C, Right). For each promoter, the percentage of transcripts starting at the position that contained G at TSS+1NT (A7 for PsecE and G8 for PhemC) was reduced by ∼30% when reactions were performed with RNAP-βD446A (Fig. 5C), consistent with results of 5′ mNET-seq (Table S1). We conclude that sequence-specific RNAP–CRE interactions are a determinant of TSS selection in natural promoters in the E. coli genome.
Discussion
Sequence-Specific RNAP–CRE Interactions in TSS Selection.
Here we show that sequence-specific interactions between RNAP and the downstream segment of the nontemplate strand of the transcription bubble (CRE) are a determinant of TSS selection. In particular, using high-throughput sequencing–based approaches, we define a role of sequence-specific recognition of a G at the most downstream position of the CRE (GCRE) during TSS selection in the context of a library of 4^7 (∼16,000) TSS region sequences of a consensus core promoter in vitro and in vivo (Figs. 2–4) and in the context of natural promoters in E. coli in vivo (Fig. 5 and Table S1).
As discussed above, variability in TSS selection is believed to involve transcription bubble expansion or contraction (scrunching or antiscrunching) in RPo (Fig. S1A) (11–14). We propose that the observed effects of sequence-specific RNAP–CRE interactions on TSS selection occur by influencing transcription bubble expansion or contraction (scrunching or antiscrunching) in RPo (Fig. S1B). Specifically, we propose that sequence-specific RNAP–CRE interactions favor TSS selection at sequences that contain G at TSS+1NT. According to this proposal, the role of sequence-specific RNAP–CRE interactions in defining the downstream edge of the transcription bubble concurrently defines the extent of transcription bubble expansion or contraction (scrunching or antiscrunching) in RPo and therefore modulates TSS selection (Fig. S1B).
The results of this work, together with results of previous work, establish that TSS selection involves at least four promoter sequence determinants: (i) position relative to the −10 element (preference for the position 7-bp downstream of the −10 element) (5–11); (ii) sequence of TSST and TSS-1T (strong preference for pyrimidine at TSST and preference for purine at TSS-1T, which enable initiation with a purine NTP and maximize stacking between DNA bases and the initiating purine NTP) (11, 17–20); (iii) sequence of the discriminator element (preference for TSS selection at upstream positions for discriminator sequences that disfavor scrunching and preference for TSS selection at downstream positions for discriminator sequences that favor scrunching) (13, 14); and (iv) sequence of the CRE (preference for G at TSS+1NT). In addition to these sequence determinants, DNA topology and NTP concentrations also influence TSS selection (6, 8, 9, 11, 21–26). Thus, TSS selection is a multifactorial process, in which the ultimate outcome for a given promoter reflects the contributions of multiple promoter sequence determinants and multiple reaction conditions. Because sequence-specific RNAP–CRE interactions are only one of several determinants of TSS selection, their quantitative significance at different promoters differs. At some promoters, such as PsecE and PhemC, sequence-specific RNAP–CRE interactions have quantitatively large, ≥20%, effects on TSS selection (Fig. 5C and Table S1), whereas at other promoters, the quantitative effects of RNAP–CRE interactions are smaller.
Prospect.
In prior work, we showed that sequence-specific RNAP–CRE interactions affect RPo formation during transcription initiation, RPo stability during transcription initiation, translocational bias during transcription elongation, and sequence-specific pausing during transcription elongation (15, 16). Accordingly, our findings that sequence-specific RNAP–CRE interactions are a determinant of TSS selection add to an emerging view that sequence-specific RNAP–CRE interactions play functionally important roles during all stages of transcription that involve an unwound transcription bubble. A priority for future work will be to assess the roles of sequence-specific RNAP–CRE interactions in other steps of transcription that involve an unwound transcription bubble (e.g., transcriptional slippage, initial transcription, promoter escape, factor-dependent pausing, and termination). Another priority for future work will be to assess possible roles of sequence-specific RNAP–CRE interactions in eukaryotic transcription, noting that RNAP residues involved in sequence-specific RNAP–CRE interactions are conserved in bacteria and eukaryotes.
Materials and Methods
Details for all procedures are in the SI Materials and Methods.
Plasmids and Oligonucleotides.
Plasmids are listed in Table S2. Oligonucleotides are listed in Table S3.
Table S2. Plasmids used in this study
Table S3. Oligonucleotides used in this study
Proteins.
RNAP-βWT holoenzyme and RNAP-βD446A holoenzyme were prepared from E. coli strain XE54 (27) transformed with plasmids pRL706 or pRL706-βD446A, respectively, using procedures described in ref. 28.
In Vitro Transcription Assays.
For MASTER experiments shown in Fig. 2, single round in vitro transcription assays were performed essentially as described in ref. 11 using a linear DNA template containing the placCONS-N7 library (Fig. 1B, Upper). RNA products were purified and TSS selection was analyzed by 5′ RNA-seq as described in ref. 11 (see Table S4 for list of samples). In vitro transcription assays shown in Figs. 3 and 5C were performed essentially as described in ref. 29. RNA products generated in these reactions were analyzed by primer extension as described in ref. 29.
Table S4. Samples for high-throughput sequencing
5′ mNET-seq.
For the in vivo MASTER experiments shown in Fig. 4, E. coli DH10B-T1R cells (Life Technologies) containing plasmids pRL706-βWT;3xFLAG or pRL706-βD446A;3xFLAG were transformed with ∼50 ng pMASTER-lacCONS-N7 library to obtain a 25-mL overnight culture representing cells derived from at least 20 million unique transformants; 0.5 mL of the overnight cell culture was used to inoculate 50 mL LB media containing 100 μg/mL carbenicillin and 25 μg/mL chloramphenicol. When the cell density reached an OD600 ∼0.3, 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) was added, and cells were grown for an additional 2 h. RNA associated with RNAP was isolated using procedures described in ref. 16.
For the experiments shown in Fig. 5, MG1655 cells containing plasmids pRL706-βWT;3xFLAG or pRL706-βD446A;3xFLAG were shaken at 220 rpm at 37 °C in 100 mL 4× LB (40 g Bacto tryptone, 20 g Bacto yeast extract, and 10 g NaCl per liter) containing 200 μg/mL carbenicillin in 500-mL DeLong flasks (Bellco). When cell density reached an OD600 ∼0.6, 1 mM IPTG was added, and cells were grown for an additional 4 h. RNA associated with RNAP was isolated using procedures described in ref. 16.
RNA products associated with RNAP were analyzed by 5′ RNA-seq using procedures described in ref. 30 (see Table S4 for list of samples).
In Vitro and in Vivo MASTER Data Analysis.
Analysis of 5′ RNA-seq data obtained from MASTER experiments was performed essentially as described in ref. 11. Sequencing of template DNA was used to associate the 7-bp randomized TSS region sequence with a corresponding second 15-bp randomized sequence that serves as its barcode. Reads that contained a perfect match to the DNA template from which they were derived were used for the analysis of TSS selection. The percentage of reads starting at a given TSS position Y (%TSS_Y) was calculated using the following formula: %TSS_Y = 100 × (no. reads starting at position Y / total no. reads starting at positions 4–10). Above-threshold TSS positions were those for which the %TSS value was ≥20%.
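The %TSS calculation described above can be illustrated with a minimal sketch. It assumes that the start position of each perfectly matching read has already been determined for a single TSS-region sequence; the function names and the example read counts are hypothetical and are not taken from ref. 11.

```python
from collections import Counter

def percent_tss(start_positions, first=4, last=10):
    """Return {position: %TSS} using only reads that start at positions first..last."""
    counts = Counter(p for p in start_positions if first <= p <= last)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Positions with no reads get 0.0 because Counter returns 0 for missing keys.
    return {pos: 100.0 * counts[pos] / total for pos in range(first, last + 1)}

def above_threshold(pct, threshold=20.0):
    """Return the positions whose %TSS is at or above the threshold (>=20% in the text)."""
    return sorted(pos for pos, value in pct.items() if value >= threshold)

# Hypothetical read start positions for one TSS-region sequence.
# Reads starting at position 12 fall outside positions 4-10 and are ignored.
starts = [7] * 6 + [8] * 3 + [6] * 1 + [12] * 5
pct = percent_tss(starts)
print(pct[7], pct[8], pct[6])   # 60.0 30.0 10.0
print(above_threshold(pct))     # [7, 8]
```

In this toy example the denominator is the 10 reads that start within positions 4–10, so positions 7 and 8 are above threshold and position 6 is not.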
5′ mNET-seq Analysis of Natural Promoters in Vivo in E. coli.
Identification of TSS positions and TSS regions for natural promoters in E. coli was done essentially as described in ref. 31. The first six bases of each read were trimmed (to remove sequences introduced during the cDNA library construction procedure), and the next 30 bases were aligned to the E. coli reference genome (NC_000913.3) using Bowtie (32). Among these reads, we used those that aligned to a unique position in the genome with zero mismatches for the analysis of TSS selection.
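The read preprocessing and filtering described above can be sketched as follows. This is an illustration only: the `Alignment` record is a hypothetical in-memory summary of an aligner result, not Bowtie's output format, and discarding reads shorter than 36 bases is an assumption rather than a documented step.

```python
from dataclasses import dataclass

@dataclass
class Alignment:
    """Hypothetical summary of one aligned read (not Bowtie's output format)."""
    read_id: str
    n_locations: int   # number of genomic positions the read aligned to
    mismatches: int    # mismatches in the reported alignment

def trim_read(seq, skip=6, keep=30):
    """Drop the first `skip` bases and keep the next `keep` bases.

    Returns None when the read is too short to yield `keep` bases
    (an assumption; the text does not say how short reads were handled).
    """
    trimmed = seq[skip:skip + keep]
    return trimmed if len(trimmed) == keep else None

def usable_for_tss_analysis(aln):
    """Keep reads that align to a unique genomic position with zero mismatches."""
    return aln.n_locations == 1 and aln.mismatches == 0

# Toy example: 6 library-derived bases, 30 bases to align, 10 extra bases.
read = "A" * 6 + "C" * 30 + "G" * 10
print(trim_read(read) == "C" * 30)                       # True
print(usable_for_tss_analysis(Alignment("r1", 1, 0)))    # True
print(usable_for_tss_analysis(Alignment("r2", 2, 0)))    # False
```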
Using data derived from the analysis of RNA products associated with RNAP-βWT, we defined a list of primary TSS positions that met the following two criteria: (i) the read count at the coordinate was above a threshold value (≥50 reads) and (ii) the read count at the coordinate represented a local maximum in an 11-bp window centered on the coordinate. For each primary TSS position, we designated the positions spanning 5-bp upstream to 5-bp downstream as a TSS region. Next, for each TSS region, we calculated the percentage of reads starting at each of the 11 positions: %TSS_Y = 100 × (no. reads starting at position Y / total no. reads starting within the TSS region). We identified 1,500 TSS positions within TSS regions with an above-threshold value of %TSS (≥20%). For each of these 1,500 TSS positions, we calculated the difference between the average %TSS observed in experiments performed with RNAP-βWT and that observed in experiments performed with RNAP-βD446A. TSS positions for which this difference was ≥20% are listed in Table S1.
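The two criteria for a primary TSS position and the per-region %TSS calculation can be sketched as below. The sketch assumes a dictionary of 5′-end read counts per genomic coordinate for a single strand; the function names, the example counts, and the handling of ties (tied maxima within a window all qualify) are assumptions, not details taken from ref. 31.

```python
def primary_tss_positions(counts, min_reads=50, half_window=5):
    """Return coordinates with >= min_reads that are a local maximum
    in the 11-bp window centered on the coordinate."""
    primary = []
    for coord, n in counts.items():
        if n < min_reads:
            continue
        window = [counts.get(c, 0)
                  for c in range(coord - half_window, coord + half_window + 1)]
        if n == max(window):   # ties all qualify (assumption)
            primary.append(coord)
    return sorted(primary)

def region_percent_tss(counts, tss, half_window=5):
    """Return {coordinate: %TSS} over the 11 positions of the TSS region."""
    region = range(tss - half_window, tss + half_window + 1)
    total = sum(counts.get(c, 0) for c in region)
    return {c: 100.0 * counts.get(c, 0) / total for c in region}

# Hypothetical 5'-end read counts per coordinate on one strand.
counts = {100: 80, 101: 20, 103: 60, 200: 40, 300: 55}
print(primary_tss_positions(counts))       # [100, 300]
region = region_percent_tss(counts, 100)
print(region[100], region[101], region[103])   # 50.0 12.5 37.5
```

Coordinate 103 passes the read-count threshold but is not a local maximum because coordinate 100 lies within its window; coordinate 200 fails the read-count threshold.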
SI Materials and Methods
Analysis of TSS Selection in Vitro by MASTER.
Preparation of template DNA.
pMASTER-lacCONS-N7 plasmid DNA was diluted to ∼10⁹ molecules/μL. One microliter of diluted DNA was amplified by emulsion PCR using a Micellula DNA Emulsion and Purification Kit (Chimerx) in detergent-free Phusion HF reaction buffer containing 5 μg/mL BSA, 0.4 mM dNTPs, 0.5 μM Illumina RP1 primer, 0.5 μM Illumina RPI1 primer, and 0.04 U/μL Phusion HF polymerase (Thermo Scientific). Emulsion PCR reactions were performed with an initial denaturation step of 10 s at 95 °C, amplification for 30 cycles (denaturation for 5 s at 95 °C, annealing for 5 s at 60 °C, and extension for 15 s at 72 °C), and a final extension for 5 min at 72 °C. The emulsion was broken, and DNA was purified according to the manufacturer’s recommendations. DNA was recovered by ethanol precipitation and resuspended in 30 μL nuclease-free water.
Transcription reactions.
In vitro transcription assays were performed by mixing 10 nM template DNA with 50 nM RNAP-βWT holoenzyme or 50 nM RNAP-βD446A holoenzyme in transcription buffer [50 mM Tris⋅HCl (pH 8.0), 10 mM MgCl2, 0.01 mg/mL BSA, 100 mM KCl, 5% (vol/vol) glycerol, 10 mM DTT, and 0.4 U/μL RNase OUT]. RNAP-promoter open complexes were allowed to form by incubation at 37 °C for 10 min. A single round of transcription was initiated by addition of a mixture of NTPs to a final concentration of 1 mM and heparin to a final concentration of 0.1 mg/mL. After 15 min, reactions were stopped by addition of EDTA (pH 8) to a final concentration of 10 mM. Nucleic acids were recovered by ethanol precipitation and resuspended in 30 μL nuclease-free water.
Purification of RNA products.
Nucleic acids recovered from the ethanol precipitation were treated with 2 U TURBO DNase (Life Technologies) at 37 °C for 1 h, mixed with an equal volume of 2× RNA loading dye [95% (vol/vol) deionized formamide, 18 mM EDTA, 0.25% (wt/vol) SDS, xylene cyanol, bromophenol blue, and amaranth], and separated by electrophoresis on 10% (wt/vol) acrylamide, 7 M urea slab gels (equilibrated and run in 1× TBE). The gel was stained with SYBR Gold nucleic acid gel stain (Life Technologies), bands were visualized on a UV transilluminator, and RNA transcripts ∼100 nt in size were excised from the gel. The excised gel slice was crushed and incubated in 300 μL 0.3 M NaCl in 1× TE buffer at 70 °C for 10 min. Eluted RNAs were separated from crushed gel fragments using a Spin-X column (Corning). After the first elution, the crushed gel fragments were collected; the elution procedure was repeated; and nucleic acids were collected, pooled with the first elution, isolated by ethanol precipitation, and resuspended in 25 μL RNase-free water. Purified RNA products were analyzed by 5′ RNA-seq using the procedure described in the next section.
5′ RNA-seq.
Before cDNA library construction, 5′ triphosphate RNA products were converted to 5′ monophosphate RNA products. To do this, ∼100 ng purified RNA was treated with 20 U 5′-RNA polyphosphatase (New England Biolabs). Samples were extracted with acid phenol:chloroform (pH 4.5). RNA products were recovered by ethanol precipitation and resuspended in 10 μL RNase-free water.
Ligation of adaptor to 5′ end of RNA products.
RNA products were combined with PEG 8000 [10% (wt/vol) final concentration], oligo s1206 (1 pmol/μL final concentration), ATP (1 mM final concentration), 40 U RNase OUT, 1× T4 RNA ligase 1 reaction buffer (New England Biolabs), and 10 U of T4 RNA ligase 1 (New England Biolabs) in a total volume of 30 μL. The mixture was incubated at 16 °C for 16 h.
Size selection of adaptor-ligated RNA products.
Adaptor-ligated RNA products were mixed with an equal volume of 2× RNA loading dye and separated by electrophoresis on 10% (wt/vol) acrylamide, 7 M urea slab gels (equilibrated and run in 1× TBE). The gel was stained with SYBR Gold nucleic acid gel stain, bands were visualized with UV transillumination, and species ranging from ∼80 to ∼300 nt were excised from the gel. RNA products were eluted from the gel using the procedure described above, isolated by ethanol precipitation, and resuspended in 10 μL nuclease-free water.
cDNA synthesis.
Ten microliters of gel-eluted RNA products was mixed with 0.3 μL s128 oligonucleotide (100 pmol/μL), incubated at 65 °C for 5 min, and cooled to 4 °C; 9.7 μL of a mixture containing 4 μL 5× First-Strand buffer (Life Technologies), 1 μL 10 mM dNTP mix, 1 μL 100 mM DTT, 1 μL (40 U) RNase OUT, 1 μL (200 U) SuperScript III Reverse Transcriptase (Life Technologies), and 1.7 μL nuclease-free water was added to the RNA/oligonucleotide mixture. The reactions were incubated in a thermal cycler with a heated lid at 25 °C for 5 min, followed by 55 °C for 60 min and 70 °C for 15 min. Reactions were cooled to room temperature, 10 U RNase H (Life Technologies) was added, and the reactions were incubated at 37 °C for 20 min.
Size selection of cDNA products.
An equal volume of 2× RNA loading dye was added, and nucleic acids were separated by electrophoresis on 10% (wt/vol) acrylamide, 7 M urea slab gels (equilibrated and run in 1× TBE). The gel was stained with SYBR Gold nucleic acid gel stain, and ∼80- to ∼150-nt species were excised from the gel. cDNA products were recovered from the gel using the procedure described above and resuspended in 10 μL nuclease-free water.
Amplification of cDNA products.
Five microliters of gel-isolated cDNA products were added to a mixture containing 1× Phusion HF reaction buffer, 0.2 mM dNTPs, 0.25 μM Illumina RP1 primer, 0.25 μM Illumina index primer, and 0.02 U/μL Phusion HF polymerase. PCR was performed with an initial denaturation step of 30 s at 98 °C, amplification for 11 cycles (denaturation for 10 s at 98 °C, annealing for 20 s at 62 °C, and extension for 10 s at 72 °C), and a final extension for 5 min at 72 °C.
Purification of cDNA products.
Amplified cDNA products were separated by gel electrophoresis using a nondenaturing 10% (wt/vol) acrylamide slab gel (equilibrated and run in 1× TBE). The gel was stained with SYBR Gold nucleic acid gel stain, and species at ∼170 bp were excised from the gel. cDNA products were eluted from the gel with 600 μL 0.3 M NaCl in 1× TE buffer at 37 °C for 2 h, precipitated, and resuspended in 13 μL nuclease-free water.
High-throughput sequencing.
Libraries were sequenced on an Illumina HiSeq 2500 platform in rapid mode using custom primer s1115.
Data analysis.
Sequencing of template DNA (sample VV891) (Table S4) was used to associate the 7-bp randomized sequence in the region of interest with a corresponding second 15-bp randomized sequence that serves as its barcode. The identity of the 15-bp barcode in each RNA product was used to determine the identity of bases at positions 4–10 of the lacCONS template from which the RNA product was generated. Sequences derived from the RNA 5′ end of reads that were perfect matches to the sequence of the template were used for analysis of TSS selection. Experiments were performed in duplicate (samples VV854 and VV855 for RNAP-βWT and samples VV860 and VV861 for RNAP-βD446A) (Table S4).
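The barcode-based assignment described above can be sketched as follows. This is a simplified illustration, not the pipeline of ref. 11: the flanking sequences are placeholders rather than the lacCONS sequence, barcodes linked to more than one TSS-region sequence are discarded (an assumption), and a read is assigned a start position only when its 5′ end matches the template perfectly at exactly one of positions 4–10.

```python
def build_barcode_map(template_pairs):
    """Map each 15-bp barcode to its 7-bp TSS-region sequence.

    Barcodes seen with more than one region are dropped (assumption).
    """
    seen, ambiguous = {}, set()
    for region, barcode in template_pairs:
        if barcode in seen and seen[barcode] != region:
            ambiguous.add(barcode)
        seen.setdefault(barcode, region)
    return {b: r for b, r in seen.items() if b not in ambiguous}

def assign_tss(read_5p, barcode, barcode_map, upstream, downstream,
               positions=range(4, 11)):
    """Return the start position (4-10) whose template sequence perfectly
    matches the 5' end of the read, or None if there is no unique match."""
    region = barcode_map.get(barcode)
    if region is None:
        return None
    # `upstream` stands in for positions 1-3, so position p is template[p - 1].
    template = upstream + region + downstream
    matches = [p for p in positions if template[p - 1:].startswith(read_5p)]
    return matches[0] if len(matches) == 1 else None

# Placeholder flanks and hypothetical barcodes (not the real library).
UP, DOWN = "GGG", "CCCCCCCCCC"
pairs = [("ACGTACG", "A" * 15), ("TTTTTTT", "C" * 15), ("GGGGGGG", "C" * 15)]
bmap = build_barcode_map(pairs)
print(assign_tss("TACGCCCC", "A" * 15, bmap, UP, DOWN))   # 7
print(assign_tss("TACGCACC", "A" * 15, bmap, UP, DOWN))   # None (mismatch)
print(assign_tss("TTTTCCCC", "C" * 15, bmap, UP, DOWN))   # None (ambiguous barcode)
```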
Analysis of TSS Selection in Vitro by Primer Extension.
Preparation of template DNA.
Linear DNA templates were generated by PCR using plasmids pHV-S01, pHV-S02, pHV-S03, pHV-S04, pHV-S05, pHV-S06, pHV-S07, pHV-S08, pHV-S17, or pHV-S18 as template and oligonucleotide primer HV121, which contains a 5′ biotin moiety, and oligonucleotide primer HV122.
The biotinylated linear DNA templates generated by PCR were bound to streptavidin-coated paramagnetic beads [Streptavidin MagneSphere Paramagnetic Particles (SA-PMPs); Promega]. To do this, 100 µL SA-PMP slurry per DNA template was washed three times with 100 µL binding buffer [10 mM Tris (pH 8), 150 mM NaCl, and 100 µg/mL BSA]. The SA-PMPs were resuspended in 100 µL binding buffer, 2.5 µL 400 nM DNA template stock was added to each slurry, and the mixture was gently mixed for 30 min at 25 °C. The binding buffer was removed, and the SA-PMPs were washed three times with 1× TB [40 mM Tris (pH 8), 10 mM MgCl2, 50 mM KCl, 10 mM β-mercaptoethanol, 10 µg/mL BSA, and 5% (wt/vol) PEG-8000] and resuspended in 10 µL reaction buffer to obtain 100 nM SA-PMP–conjugated DNA template stock solutions that were used for the transcription assays.
Transcription reactions.
In vitro transcription assays were performed by mixing 50 nM RNAP with 10 nM template (attached to beads) for 10 min at 37 °C in 1× TB. Transcription was initiated by adding NTPs to a final concentration of 100 µM. The total reaction volume was 20 μL. Reactions were stopped after 10 min by adding 100 μL stop solution [0.5 mg/mL glycogen and 10 mM EDTA (pH 8.0)]. Magnetic beads were pelleted using a MagneSphere Technology Magnetic Separation Stand (Promega), and the supernatant was transferred to a fresh tube and extracted with acid phenol:chloroform. RNA transcripts were recovered by ethanol precipitation and resuspended in 12 μL water.
Primer-extension reactions.
Oligonucleotide primer HV123 was 32P-5′ end-labeled with T4 polynucleotide kinase in a 50-μL reaction containing 120 pmol of primer, 40 U of enzyme, and 100 μCi of [γ-32P]ATP (Perkin Elmer). The labeling reaction was incubated at 37 °C for 1 h followed by an incubation at 95 °C for 10 min. Unincorporated nucleotides and salts were removed by passage over an Illustra G-25 microspin column (GE Healthcare). One microliter of labeled primer was mixed with 5 μL of the RNA recovered from the transcription reactions. This mixture was heated at 90 °C for 2 min and immediately transferred to ice. Reverse transcription was performed by adding 4 µL of a mixture containing 10 U AMV reverse transcriptase (New England Biolabs), AMV buffer, dNTPs (10 mM of each dNTP), and 10 U murine RNase inhibitor (New England Biolabs) to the annealed primer–template mixture, incubating at 55 °C for 1 h and then at 95 °C for 5 min, and cooling to 4 °C. Reactions were stopped by addition of 10 μL 98% (vol/vol) formamide containing 10 mM EDTA, 0.02% (wt/vol) bromophenol blue, and 0.02% (wt/vol) xylene cyanol. Samples were electrophoresed on an 8% (wt/vol) acrylamide, 7 M urea slab gel (equilibrated and run using a gradient buffer of 1× TBE in the upper reservoir and 1× TBE, 0.3 M NaOAc in the lower reservoir). Radiolabeled species were detected by storage-phosphor imaging. TSS assignments were made by comparison with a sequencing ladder prepared using the same radiolabeled primer used for the extension reactions and a Sequenase Version 2.0 DNA sequencing kit (USB Corporation). Experiments were performed three independent times (one of the independent replicates for each template is shown in Fig. 3 and Fig. 5C). The values for %TSS (RNAP-βWT) − %TSS (RNAP-βD446A) reported in Fig. 3 were derived by averaging the results of the three experiments.
Analysis of TSS Selection in Vivo from 4⁷ (∼16,000) Consensus Promoter Derivatives.
Cell growth.
Escherichia coli DH10B-T1R cells (Life Technologies) containing plasmids pRL706-βWT;3xFLAG or pRL706-βD446A;3xFLAG were transformed with ∼50 ng pMASTER-lacCONS-N7 library to obtain a 25-mL overnight culture representing cells derived from at least 20 million unique transformants; 0.5 mL of the overnight cell culture was used to inoculate 50 mL LB media containing 100 μg/mL carbenicillin and 25 μg/mL chloramphenicol. When the cell density reached an OD600 ∼0.3, 1 mM IPTG was added, and cells were grown for an additional 2 h. Cell suspensions were divided equally among 12 × 2-mL tubes (BioExcell) and centrifuged (1 min, 21,000 × g at room temperature) to collect cells, and supernatants were removed. Cell pellets were then rapidly frozen on dry ice and stored at −80 °C.
pMASTER-lacCONS-N7 plasmid DNA was isolated from these cells using a Plasmid Miniprep kit (Qiagen). Plasmid DNA was used as template in emulsion PCR reactions to generate a product that was sequenced to assign barcodes (see below).
RNA isolation.
Cell pellets derived from 12 mL culture were resuspended in 1 mL lysis buffer (B-Per, Bacterial Protein Extraction Reagent; Thermo Scientific) supplemented with one quarter of a protease inhibitor mixture tablet (complete Mini EDTA-free; Roche), 1 mM EDTA, 80 U Murine RNase Inhibitor (NEB), 100 μg lysozyme (Thermo Scientific), and 150 U DNase I (Thermo Scientific) and incubated for 10 min. The lysate was then clarified by centrifugation (10 min, 21,000 × g), and NaCl was added to a final concentration of 150 mM. The lysate was added to 1 mL anti-FLAG M2 affinity gel (Sigma Aldrich) that had been washed three times with 3 mL 1× TBS and equilibrated in 3 mL wash buffer (B-Per solution containing 150 mM NaCl, 1 mM EDTA, 50 U/mL Murine RNase Inhibitor, and protease inhibitor mixture [complete EDTA-free (Roche); 1 tablet per 50 mL]). The lysate and affinity gel mixture was nutated at 4 °C for 2.5 h in a 1.7-mL centrifuge tube. The mixture was transferred to a 10-mL Econo-Pack disposable chromatography column (Bio-Rad), the flow through was collected, and the affinity gel was washed eight times with 5 mL wash buffer and three times with 250 μL elution buffer (B-Per solution containing 150 mM NaCl, 1 mM EDTA, 50 U/mL Murine RNase Inhibitor, and 2 mg/mL 3× FLAG peptide; GenScript). For the washes with elution buffer, the affinity gel was incubated for 30 min before collection of the fractions. The presence of epitope-tagged βWT or βD446A was analyzed in each fraction by immunoblotting.
To isolate the RNA products associated with RNAP, pooled eluates from above were mixed with three volumes of TRI Reagent solution (Molecular Research Center), incubated at 70 °C for 10 min, and centrifuged (10 min, 21,000 × g) to remove insoluble material. The supernatant was transferred to a fresh tube, ethanol was added to a final concentration of 60.5% (vol/vol), and the mixture was applied to a Direct-zol spin column (Zymo Research). DNase I treatment was performed on-column according to the manufacturer’s recommendations. RNA products were eluted from the column with three sequential portions of 30 μL nuclease-free water that had been heated to 70 °C. Before cDNA library construction, RNA products were treated with 4 U TURBO DNase (Ambion) at 37 °C for 1 h. Following DNase treatment, samples were extracted with acid phenol:chloroform, and RNA products were recovered by ethanol precipitation and resuspended in RNase-free water.
5′ RNA-seq.
Before cDNA library construction, 5′ monophosphate RNA products were first removed by treatment of 0.75–1.3 μg of RNA with 1 U Terminator 5′-Phosphate-Dependent Exonuclease (Epicentre). Samples were extracted with acid phenol:chloroform, and RNA products were recovered by ethanol precipitation and resuspended in RNase-free water. Next, 5′ triphosphate RNA products were converted to 5′ monophosphate RNA products by treating samples with 20 U 5′-RNA polyphosphatase as described in ref. 29. Samples were extracted with acid phenol:chloroform, and RNA products were recovered by ethanol precipitation and resuspended in 10 μL RNase-free water.
5′ RNA-seq analysis was performed as described above.
Data analysis.
In vivo MASTER experiments were performed in triplicate (samples VV871, VV872, and VV873 for RNAP-βWT and samples VV874, VV875, and VV876 for RNAP-βD446A) (Table S4). pMASTER-lacCONS-N7 plasmid DNA isolated from each individual cell culture was used as template in emulsion PCR reactions to generate products that were sequenced to assign barcodes as described in ref. 11. For each RNAP-βWT sample, three emulsion PCR products were generated and sequenced (Table S4). For each RNAP-βD446A sample, one emulsion PCR product was generated and sequenced (Table S4). The identity of the 15-bp barcode in each RNA product was used to determine the identity of bases at positions 4–10 of the lacCONS template from which the RNA product was generated. Sequences derived from the RNA 5′ end of reads that were perfect matches to the sequence of the template were used for analysis of TSS selection.
Analysis of TSS Selection in Natural Promoters in Vivo in E. coli.
Cell growth.
MG1655 cells containing plasmids pRL706-βWT;3xFLAG or pRL706-βD446A;3xFLAG were shaken at 220 rpm at 37 °C in 100 mL 4× LB (40 g Bacto tryptone, 20 g Bacto yeast extract, and 10 g NaCl per liter) containing 200 µg/mL carbenicillin in 500-mL DeLong flasks (Bellco). When cell density reached an OD600 ∼0.6, 1 mM IPTG was added, and cells were grown for an additional 4 h. Cells were harvested and stored as described above.
RNA isolation.
RNA products associated with RNAP were isolated as described above.
5′ RNA-seq.
Before cDNA library construction, enzymatic treatments were performed to first remove 5′ monophosphate RNA products and second convert 5′ triphosphate RNA products to 5′ monophosphate RNA products as described above.
5′ RNA-seq analysis was performed as described above, with the following exceptions. In the step “Ligation of adaptor to 5′ end of RNA products,” primer s1086 was used instead of primer s1206. In the step “Size selection of adaptor-ligated RNA products,” all species larger than the 5′ adaptor were excised from the gel instead of ∼80- to ∼300-nt species. In the step “cDNA synthesis,” primer s1082 was used instead of s128. In the step “Size selection of cDNA products,” ∼90- to ∼450-nt cDNA products were isolated instead of ∼80- to ∼150-nt cDNA products. In the step “Purification of cDNA products,” ∼160- to ∼350-bp species were isolated instead of ∼170-bp species.
Data analysis.
Identification of TSS positions and TSS regions for natural promoters in E. coli was done essentially as described in ref. 31. The first six bases of each read were trimmed (to remove sequences introduced during the cDNA library construction procedure), and the next 30 bases were aligned to the E. coli reference genome (NC_000913.3) using Bowtie (32). Among these reads, we used those that aligned to a unique position in the genome with zero mismatches for the analysis of TSS selection. Using data derived from the analysis of RNA products associated with RNAP-βWT (samples VV631, VV632, VV655, and VV656; Table S4), we defined a list of primary TSS positions that met the following two criteria: (i) the read count at the coordinate was above a threshold value (≥50 reads) and (ii) the read count at the coordinate represented a local maximum in an 11-bp window centered on the coordinate. For each primary TSS position, we designated the positions spanning 5-bp upstream to 5-bp downstream as a TSS region. Next, for each TSS region, we calculated the percentage of reads starting at each of the 11 positions: %TSS_Y = 100 × (no. reads starting at position Y / total no. reads starting within the TSS region).
To enable a comparison between data derived from analysis of nascent RNA associated with RNAP-βWT with that derived from analysis of nascent RNA associated with RNAP-βD446A, we identified TSS regions for which we obtained ≥50 total reads starting within the TSS region in each of the eight samples used for the analysis (VV631–VV634 and VV655–VV658; Table S4). Next, we averaged the %TSS values observed for RNAP-βWT (samples VV631, VV632, VV655, and VV656; Table S4) for each position within these TSS regions. We identified 1,500 TSS positions with an above-threshold value of %TSS (≥20%). For each of these 1,500 TSS positions, we calculated the difference between the average %TSS observed in experiments performed with RNAP-βWT (average derived from samples VV631, VV632, VV655, and VV656; Table S4) and that observed in experiments performed with RNAP-βD446A (average derived from samples VV633, VV634, VV657, and VV658; Table S4). Table S1 lists TSS positions for which this difference was ≥20%.
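The comparison between RNAP-βWT and RNAP-βD446A described above can be sketched as follows. The sketch assumes that per-replicate %TSS values for one TSS region are already available and that the region has passed the ≥50-read filter in every sample. It treats the difference as the signed value %TSS (RNAP-βWT) − %TSS (RNAP-βD446A); an absolute difference is another possible reading of the text. The function name and the example values are hypothetical.

```python
from statistics import mean

def differential_positions(wt_reps, mut_reps,
                           pct_threshold=20.0, diff_threshold=20.0):
    """Return {position: WT-minus-mutant difference in average %TSS}.

    wt_reps and mut_reps are lists of {position: %TSS} dictionaries,
    one per replicate. Only positions whose average WT %TSS is at or above
    pct_threshold are considered, and only differences at or above
    diff_threshold are reported.
    """
    selected = {}
    for pos in wt_reps[0]:
        wt_avg = mean(rep[pos] for rep in wt_reps)
        if wt_avg < pct_threshold:
            continue
        mut_avg = mean(rep[pos] for rep in mut_reps)
        diff = wt_avg - mut_avg   # signed difference (assumption)
        if diff >= diff_threshold:
            selected[pos] = diff
    return selected

# Hypothetical %TSS values for two positions in two replicates per enzyme.
wt = [{10: 60.0, 11: 40.0}, {10: 70.0, 11: 30.0}]
mutant = [{10: 30.0, 11: 70.0}, {10: 40.0, 11: 60.0}]
print(differential_positions(wt, mutant))   # {10: 30.0}
```

Position 11 is excluded here because its WT-minus-mutant difference is negative; under an absolute-difference reading it would also be reported.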
Acknowledgments
We thank Jared Knoblauch for assistance with data analysis. This work was supported by National Institutes of Health Grants GM041376 (to R.H.E.), GM088343 (to B.E.N.), GM096454 (to B.E.N.), and GM115910 (to B.E.N.).
Footnotes
1I.O.V. and H.V.-M. contributed equally to this work.
2To whom correspondence may be addressed. Email: ebright{at}waksman.rutgers.edu or bnickels{at}waksman.rutgers.edu.
Author contributions: I.O.V., H.V.-M., R.H.E., and B.E.N. designed research; I.O.V. and H.V.-M. performed research; I.O.V., H.V.-M., Y.Z., D.M.T., R.H.E., and B.E.N. analyzed data; and R.H.E. and B.E.N. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The sequence reported in this paper has been deposited in the NIH/NCBI Sequence Read Archive (accession no. SRP071742).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1603271113/-/DCSupplemental.
References
5. Aoyama T, Takanami M
6. Sørensen KI, Baker KE, Kelln RA, Neuhard J
7. Jeong W, Kang C
8. Liu J, Turnbough CL Jr
9. Walker KA, Osuna R
13. Winkelman JT, et al.
14. Winkelman JT, Chandrangsu P, Ross W, Gourse RL
15. Zhang Y, et al.
16. Vvedenskaya IO, et al.
17. Maitra U, Hurwitz H
18. Jorgensen SE, Buch LB, Nierlich DP
19. Hawley DK, McClure WR
20. Shultzaberger RK, Chen Z, Lewis KA, Schneider TD
21. Wilson HR, Archer CD, Liu JK, Turnbough CL Jr
23. Tu AH, Turnbough CL Jr
24. Walker KA, Mallik P, Pratt TS, Osuna R
26. Turnbough CL Jr, Switzer RL
27. Tang H, et al.
33. Crooks GE, Hon G, Chandonia JM, Brenner SE
34. Severinov K, Mooney R, Darst SA, Landick R
35. Vvedenskaya IO, et al.
Article Classifications
- Biological Sciences
- Biochemistry