Caulobacter crescentus Hfq structure reveals a conserved mechanism of RNA annealing regulation

Significance In many bacteria, the RNA chaperone protein Hfq binds to hundreds of small noncoding RNAs and improves their efficacy by aiding base pairing to target mRNAs. Hfq proteins contain a variable C-terminal domain (CTD), usually structurally disordered, which was recently demonstrated to inhibit Hfq from mediating nonspecific RNA annealing. We obtained a new structure that shows how this inhibition is achieved in Caulobacter crescentus Hfq. The structural data and chaperone assays provide an initial view of the little-known mechanism of small RNA regulation in Caulobacter. In addition, this work demonstrates how the Hfq CTD has evolved to meet the needs for species-specific selectivity in RNA binding and pairing of regulatory RNAs with cognate targets.

passages through an Emulsiflex (Avestin) at 500 bar, and cell lysate was clarified by centrifugation at 30,000 x g for 30 minutes. For pGEX-6p1-Cc Hfq, the clarified lysate was loaded onto a 5 ml glutathione sepharose column in GST-buffer A (20 mM Tris, pH 7.5, 200 mM NaCl) and, after washing with GST-buffer A, was eluted with 15 ml of GST-buffer B (20 mM Tris, pH 7.5, 200 mM NaCl, 50 mM reduced glutathione). The eluate was then incubated overnight at 4°C with PreScission protease (GE life science) at a 1:50 molar ratio to cleave the GST tag. For petDUET1-Cc Hfq and EcCc Hfq, the lysate was loaded onto a 5 ml Hi-trap Q column (GE) in Q-buffer A (20 mM Tris, pH 7.5) and Hfq was eluted in a linear gradient to 100 % Q-buffer B (20 mM Tris, pH 7.5, 1 M NaCl). For intein-CC Hfq78, the lysate was loaded onto a 5 ml chitin resin column in GST-buffer A. After washing with the same buffer, the column was flushed with 3 column volumes of GST-buffer A supplemented with 40 mM DTT, then capped and left for 48 hours at 4°C to induce cleavage of the intein tag. The liberated Cc78 Hfq was subsequently eluted with 3 column volumes of GST-buffer A supplemented with 40 mM DTT. For all protein samples, fractions containing Hfq were diluted four-fold in Q-buffer A and applied to a 5 ml Heparin column, and Hfq was again eluted in a linear gradient to 100 % Q-buffer B. Finally, fractions containing Hfq were concentrated and loaded onto a Superdex 200 gel filtration column. Details of all DNA oligonucleotides, bacterial strains and plasmids used in this study are summarised in Supplementary Tables 3-5.

Computational modelling of the disordered termini
Rosetta FloppyTail was used to model the disordered termini of C. crescentus Hfq. Our approach was based on the original FloppyTail method (3) and modified as previously described (4).
Before FloppyTail modelling, the ordered core (residues 9-67) was extracted from chain B of the deposited crystal structure (PDB ID 6GWK), the N- and C-termini were initialized in an extended β-strand conformation, and these models were energy-minimized using Rosetta FastRelax (5) with harmonic restraints to the initial backbone positions, limiting motion to less than 1 Å backbone RMSD. FloppyTail simulations were then run on this input to generate hypothetical, low-energy conformations of the disordered regions. Briefly, FloppyTail is a Monte Carlo method that is primarily composed of random sampling of backbone and side-chain dihedral angles, with occasional gradient-based energy minimization. In a single simulation, which generates one possible model, ~500 of these moves are attempted per residue. Typically, 30,000 models are generated, but only the lowest 1% by energy are retained for analysis.
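The final filtering step (keeping only the lowest 1% of models by energy) can be illustrated with a minimal Python sketch. This is not the FloppyTail protocol itself; the function name, the decoy names and the randomly generated energies are hypothetical stand-ins for scored Rosetta output.

```python
import random

def retain_lowest_energy(models, fraction=0.01):
    """Return the lowest-energy `fraction` of (name, energy) pairs, sorted by energy."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_keep = max(1, round(len(models) * fraction))
    return sorted(models, key=lambda m: m[1])[:n_keep]

# Hypothetical stand-in for 30,000 scored decoys; energies are random numbers,
# not Rosetta output.
rng = random.Random(0)
decoys = [(f"decoy_{i:05d}", rng.gauss(-250.0, 15.0)) for i in range(30000)]

retained = retain_lowest_energy(decoys)
print(len(retained))
```

With 30,000 models and a 1% cutoff, 300 models would pass on to the interaction analysis.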
We evaluated pairwise residue-residue interactions using PyRosetta (6). Energies were determined using the REF2015 energy function (7), which captures van der Waals, solvation, hydrogen bonding, and electrostatic interactions. Only stable pairwise interactions (E < -2.0 REU) were considered in our analysis. We determined the probability for a residue to interact with any acidic CTD residue and the average energy of such an interaction. To compute averages and standard deviations we used bootstrapping, resampling the low-energy models 100 times. Our analysis was previously described (4).
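The bootstrapping step can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the published analysis code: each input value is assumed to be the most favourable pairwise energy (in REU) between one residue and any acidic CTD residue in one retained model, and the function names and example energies are hypothetical.

```python
import random
import statistics

STABLE_CUTOFF_REU = -2.0   # stability threshold quoted in the text
N_RESAMPLES = 100          # number of bootstrap resamples quoted in the text

def interaction_stats(energies, cutoff=STABLE_CUTOFF_REU):
    """Fraction of models with a stable interaction, and mean energy of those interactions."""
    stable = [e for e in energies if e < cutoff]
    probability = len(stable) / len(energies)
    mean_energy = statistics.fmean(stable) if stable else float("nan")
    return probability, mean_energy

def bootstrap_probability(energies, n_resamples=N_RESAMPLES, seed=0):
    """Mean and standard deviation of the interaction probability over bootstrap resamples."""
    rng = random.Random(seed)
    probabilities = []
    for _ in range(n_resamples):
        # Resample the models with replacement, keeping the sample size fixed.
        sample = rng.choices(energies, k=len(energies))
        probabilities.append(interaction_stats(sample)[0])
    return statistics.fmean(probabilities), statistics.stdev(probabilities)

# Hypothetical energies for one residue across 300 retained models.
energies = [-3.0] * 90 + [-0.5] * 210
prob, mean_e = interaction_stats(energies)
boot_mean, boot_sd = bootstrap_probability(energies)
print(prob, mean_e, boot_mean, boot_sd)
```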

Comparison of the disordered termini models to crystal structures
We recognize that crystal structures may not accurately capture the conformations of disordered protein regions. Nonetheless, we were interested in comparing the CTD conformations in the FloppyTail models to those observed in the crystal structures. From the crystal structures, we chose the subunit with the most resolved residues for further comparison (chain E from PDB ID 6GWK).
Each subunit from each model was aligned to the reference subunit from the crystal structure using the McLachlan algorithm, as implemented in the program ProFit (version 3.1) (8). Alignments were done over the Sm-like domain (residues 6-67), and Cα RMSDs were computed over the Sm-like domain and CTD (residues 6-87 for C. crescentus, 6-76 for the C2 chimera, and 6-75 for the P2 chimera).
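The distinction between the fitting range (Sm-like domain) and the RMSD range (Sm-like domain plus CTD) can be shown with a short Python sketch. It assumes the superposition has already been done elsewhere (for example by ProFit) and only evaluates the Cα RMSD over a chosen residue range without refitting; the function name and the toy coordinates are hypothetical.

```python
import math

def ca_rmsd(model, reference, first, last):
    """Cα RMSD over residues first..last found in both structures (no refitting).

    `model` and `reference` map residue number -> (x, y, z) of the Cα atom and
    are assumed to be already superposed on the Sm-like domain.
    """
    shared = [r for r in range(first, last + 1) if r in model and r in reference]
    if not shared:
        raise ValueError("no shared residues in the requested range")
    sum_sq = sum(math.dist(model[r], reference[r]) ** 2 for r in shared)
    return math.sqrt(sum_sq / len(shared))

# Toy coordinates: identical cores (6-67), CTD (68-87) displaced by 2 Å in the model.
reference = {r: (3.8 * r, 0.0, 0.0) for r in range(6, 88)}
model = {r: (3.8 * r, 0.0 if r <= 67 else 2.0, 0.0) for r in range(6, 88)}

core_rmsd = ca_rmsd(model, reference, 6, 67)   # range used for the alignment
full_rmsd = ca_rmsd(model, reference, 6, 87)   # core plus CTD, C. crescentus numbering
print(core_rmsd, full_rmsd)
```

Because the fit is restricted to the core, any deviation in the full-range value reflects differences in CTD placement relative to the Sm-like domain.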

Contact analysis
In addition to computing Cα RMSDs, we analyzed residue-residue interactions between the CTD and the core in both crystal structures and models. We used PyRosetta, evaluating interactions with the REF2015 energy function, and identified residue pairs with the NeighborhoodResidueSelector using a 10 Å cutoff.
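The neighbour selection can be approximated outside Rosetta with a simple distance filter. The sketch below is a plain-Python stand-in, not a call to PyRosetta: it assumes one representative atom per residue and a 10 Å cutoff on the distance between those atoms, and the function name and coordinates are hypothetical.

```python
import math

def ctd_core_neighbors(coords, ctd_residues, core_residues, cutoff=10.0):
    """Return (ctd, core) residue pairs whose representative atoms lie within `cutoff` Å."""
    pairs = []
    for i in ctd_residues:
        for j in core_residues:
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs

# Toy coordinates (Å): two core residues and two CTD residues.
coords = {
    9: (0.0, 0.0, 0.0),
    10: (3.8, 0.0, 0.0),
    80: (0.0, 8.0, 0.0),    # close to the core
    81: (0.0, 30.0, 0.0),   # far from the core
}
pairs = ctd_core_neighbors(coords, ctd_residues=[80, 81], core_residues=[9, 10])
print(pairs)
```

Only the pairs that pass this geometric filter would then need to be scored with the energy function.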

Experimental details relating to in vitro RNA annealing assays
In the annealing experiments presented in this manuscript, the Hfq variants were first incubated with target RNAs before being rapidly mixed with molecular beacon to begin the annealing measurements. In past experiments, however, Hfq was instead pre-incubated with molecular beacon (2,4). Intriguingly, this reversed order of addition gives rise to three differences with respect to previously collected data. First, annealing data for Target-A18 are simpler and can be fit by single- or double-exponential equations, rather than triple-exponential equations, since the fastest phase has been eliminated. Second, Ec65 is unable to anneal Target-A18 to molecular beacon at any ratio of Hfq:beacon. Third, in previous experiments Ec Hfq increased the yield of annealed molecular beacon more than Ec65 (2); however, the reverse is true in most cases when Hfq is first incubated with the target RNA prior to mixing with molecular beacon. Previous reports suggest that Hfq binds sRNAs and mRNAs in a random order. However, it is important to note that in at least some cases, such as the sRNA-mRNA pair MicC-ompC, mRNA‧Hfq binary complexes may be dead-ends that are unable to proceed to a functional ternary complex (9). A similar effect may be seen in vitro with these simple RNA oligomers, in which annealing of Target-A18 to beacon by Ec Hfq and Ec65 is much slower when Target-A18, rather than beacon, is allowed to bind to Hfq first (Figure 4). Additionally, a 2-fold excess of target RNA (100 nM) over beacon (50 nM) is used in these experiments. Therefore, pre-incubation of Hfq with target RNAs may result in the saturation of Hfq‧target binary complexes, and may enable the formation of erroneous target‧Hfq‧target ternary complexes. These non-complementary ternary complexes may be more prone to forming with Hfqs lacking inhibitory CTDs (Ec65 and Cc78) and at low Hfq6:beacon ratios.
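For orientation, a multi-exponential progress curve of the kind referred to above can be written as a short Python function. The exact fitting equation is not given in this section, so the functional form used here, F(t)/F0 = 1 + Σ A_i (1 - exp(-k_i t)), and all amplitudes and rate constants are assumptions chosen for illustration.

```python
import math

def progress_curve(t, amplitudes, rates):
    """Normalized fluorescence F(t)/F0 for a sum of exponential phases.

    Assumed form: 1 + sum_i A_i * (1 - exp(-k_i * t)); one (A, k) pair gives a
    single-exponential curve, two pairs a double-exponential curve, and so on.
    """
    if len(amplitudes) != len(rates):
        raise ValueError("need one rate constant per amplitude")
    return 1.0 + sum(a * (1.0 - math.exp(-k * t)) for a, k in zip(amplitudes, rates))

# Hypothetical parameters: a double-exponential curve with a fast and a slow phase.
double_exp = [progress_curve(t, amplitudes=[0.6, 0.4], rates=[0.5, 0.02])
              for t in (0.0, 10.0, 100.0, 500.0)]
print(double_exp)
```

Eliminating the fastest phase, as described for Target-A18, corresponds to dropping one (A, k) pair from the model.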

Identification of Hfq variants by liquid chromatography tandem mass spectrometry
E. coli strains were grown in LB to an OD600 of 1.0 and harvested by centrifugation. Cells were washed twice with ice-cold PBS and snap-frozen in liquid nitrogen. Bacteria were lysed by thawing in 50 mM HEPES with added protease inhibitor (Liquid chromatography tandem mass spectrometry ™ Protease Inhibitor Cocktail), followed by sonication (Branson Ultrasonics). Cell debris was removed by centrifugation (15 min; 4 °C; 16,000 x g), and protein concentration was determined by bicinchoninic acid assay (BCA, ThermoFisher Scientific). Proteome aliquots of 200 μg were reduced with 10 mM DTT (30 min, 37 °C) and alkylated with 50 mM iodoacetamide (30 min, RT, in the dark). Samples were diluted four-fold with 50 mM ammonium bicarbonate prior to trypsin digestion (proteome/enzyme ratio 100:1 w/w) at 37 °C overnight. The resulting peptides were desalted with home-made C18 stage tips (10), vacuum dried to near dryness, and stored at -80 °C.
Liquid chromatography tandem mass spectrometry (LC-MS/MS) was performed on a nano-LC system (Ultimate 3000 RSLC, Thermo) coupled to a high-resolution quadrupole time-of-flight mass spectrometer (Impact II, Bruker). The nano-LC system was equipped with an Acclaim Pepmap nano-trap column (C18, 100 Å, 100 μm × 2 cm, Thermo) and an Acclaim Pepmap RSLC analytical column (C18, 100 Å, 75 μm × 50 cm, Thermo). The peptide mixture was eluted over a 75 minute

(Table S6); these were adjusted to more accurately measure the affinity of the given RNA-Hfq interaction. RybB, cfa and hns EMSAs were run under lower current during electrophoresis, and therefore these Hfq-RNA complexes do not show as much dissociation during electrophoresis.

Figure 5), or beacon (right; 2, 17) before being rapidly mixed with the complementary RNA (beacon or Target-A18, respectively). The reaction kinetics for this pair of RNAs are simpler when Hfq is pre-incubated with Target-A18 rather than beacon. Although there is a greater difference in the annealing rates between the two Hfqs when they are pre-incubated with Target-A18, the general trend is conserved between both orders of RNA substrate addition. Progress curves were normalized to the initial fluorescence reading (F0).

Figure S12. Yield of annealed target•beacon product by different Hfq variants. Relative amount of annealed molecular beacon product generated after a maximum of 500 seconds, for increasing ratios of (Hfq)6:Beacon. Data are shown for E. coli (solid bars) or C. crescentus (hatched bars) Hfqs bearing no acidic CTD tip (light blue), a CTD with distantly tethered acidic tip residues (orange), or a CTD with closely tethered acidic tip residues (red). Annealing reactions that were not complete within 500 s are marked with a black dot overhead. Poor yield can arise from sequestration of beacon and target RNA on separate Hfq hexamers, especially at high protein:RNA ratios, which limits the formation of productive ternary complexes. This is partially overcome by U6 and A18 binding sites for Hfq.
Kd and Hill (n) coefficients are the midpoints and gradients, respectively, of two independent experiments (at 30 °C in 34 mM Tris-HCl pH 7.5, 50 mM NaCl, 50 mM KCl, 50 mM NH4Cl, 11.4 mM EDTA, 12 % glycerol, 0.005 % bromophenol blue, 0.005 % xylene cyanol FF). Hfq binding was measured by EMSA (see Figure S8 and Methods). The fraction of tightly bound RNA was fit to two-state cooperative binding isotherms. More elaborate binding models did not produce statistically more reliable results for those reactions that resulted in more than one tightly bound Hfq-RNA complex. Three-state binding models were used to fit binding reactions of Cc78 Hfq with sRNAs R157, R133, and ChvR, or Cc, Cc78, and EcCc Hfq with RydC sRNA. These reactions populate weakly bound complexes at low Hfq concentrations that were treated as a separate bound state in the models (see Methods). The parameters in the table are for the formation of tightly bound complexes.
b Not determined.
c Binding was not cooperative (n = 1).

hns: gAACAAACCACCCCAAUAUAAGUUUGAGAUUACUACAAUGAGCGAAGCACUUAAAAUUCUGAACAACAUCCGUACUCUUCGUGCGCAGGCAAGAGAAUGUACACUUGAAACGCUGGAAGAA
a Lower-case letters represent non-natural guanosines added to the 5′ end of the sRNA or mRNA to aid in vitro transcription. mRNA start codons are shown in bold. The first nucleotide of cfa mRNA (adenosine) was replaced with a single guanosine. The RybB sequence was previously used in binding assays (18).
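The two-state cooperative binding isotherm mentioned in the table notes is commonly written as the Hill equation, in which Kd is the midpoint and n sets the steepness of the transition. The Python sketch below assumes that standard form, fraction bound = [Hfq]^n / (Kd^n + [Hfq]^n); the function name and the numerical values are hypothetical and are not taken from the table.

```python
def fraction_bound(hfq_conc, kd, n=1.0):
    """Hill-type two-state isotherm: fraction of RNA in the tightly bound complex.

    `hfq_conc` and `kd` must share the same concentration units; n = 1 gives
    non-cooperative (hyperbolic) binding.
    """
    if hfq_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return hfq_conc ** n / (kd ** n + hfq_conc ** n)

# Hypothetical titration (nM) with Kd = 10 nM, comparing n = 1 and n = 2.
titration = [1.0, 5.0, 10.0, 20.0, 100.0]
non_cooperative = [fraction_bound(c, kd=10.0, n=1.0) for c in titration]
cooperative = [fraction_bound(c, kd=10.0, n=2.0) for c in titration]
print(non_cooperative)
print(cooperative)
```

At [Hfq] = Kd the bound fraction is 0.5 regardless of n, which is why Kd can be read as the midpoint of the titration and n as its gradient.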