HflXr, a homolog of a ribosome-splitting factor, mediates antibiotic resistance

Significance
Antibiotics have been widely used to treat bacterial infections and are also found in the environment. Bacteria have evolved various resistance mechanisms that allow them to overcome antibiotic exposure, raising important health issues. Here, we report a bacterial antibiotic resistance mechanism, based on ribosome splitting and recycling, that ensures efficient translation even in the presence of lincomycin and erythromycin, two antibiotics that block protein synthesis. This mechanism is mediated by an HflX-like protein, encoded by lmo0762 in Listeria monocytogenes, whose expression is tightly regulated by a transcriptional attenuation mechanism. This gene increases bacterial fitness in the environment. Our results raise the possibility that other antibiotic-induced resistance mechanisms remain to be discovered.


Bacterial strains, plasmids, primers and growth conditions
For standard experiments, Listeria monocytogenes was grown overnight in Brain Heart Infusion (BHI) medium (Difco) at 37°C while shaking at 200 rpm. Overnight cultures were diluted 1/100 in fresh BHI and grown at 37°C until exponential phase (OD600 = 0.6-0.8). When required, 0.03 µg/ml erythromycin or 0.25 µg/ml lincomycin was added to the culture medium; these are the sub-inhibitory concentrations previously determined in [1]. When indicated, the concentrations were increased to 0.5 µg/ml (lincomycin) and 0.06 µg/ml (erythromycin).
For RNA-seq and qRT-PCR, samples were collected before addition of antibiotic or after 15 min exposure. For western blot, samples were collected after 1h exposure.
Listeria mutants were generated using the pMAD shuttle plasmid [2] as described previously [3,4]. For the construction of pMAD- and pAD-based plasmids, fragments obtained by PCR from EGD-e genomic DNA or from synthetic DNA (gBlocks from Integrated DNA Technologies) were cloned into the EcoRI/BamHI or NheI/BamHI sites of pMAD, or into the SmaI/SalI sites of the pAD vector [5], previously derived from the pPL2 vector [6]. Plasmid constructs were confirmed by sequencing and transformed into L. monocytogenes by electroporation. For pAD, integration into the chromosome was verified by PCR using primers NC16 and PL95 [6].
Strains, plasmids and primers used in this study are listed in Table S4.

Preparation of total RNA
RNAs were extracted according to the FastRNA Pro protocol (Qbiogene), with slight modifications: three phenol/chloroform extractions in the presence of 300 μL of 100 mM Tris HCl pH 7.5 were performed instead of using the Fast Pro Solution and chloroform extraction; the FastPrep was used twice for 45 seconds at speed 6.0; and 0.3 M sodium acetate was added for RNA precipitation.

RNA-Seq library preparation and analysis
RNA-seq was performed as previously described [1]. Briefly, RNA was extracted as described above, DNase treated and chemically fragmented. Strand-specific RNA-seq libraries were prepared using the NEBNext® Ultra™ Directional RNA Library Prep Kit (NEB, E7420). Sequencing was performed on an Illumina NextSeq 500 and the data were deposited in the European Nucleotide Archive (ENA) under accession no.

qRT-PCR
cDNAs were prepared as previously described [4]. Briefly, each sample was treated with DNase I (Turbo DNA-free kit, Ambion) and reverse-transcribed with Quantiscript Reverse Transcriptase (QuantiTect Reverse Transcription kit, Qiagen). qRT-PCR reactions were carried out and quantified with SYBR Green master mix on a C1000 Touch CFX384 machine (Bio-Rad). Gene expression levels were normalized to the L. monocytogenes gyrA gene, and the fold change was calculated using the delta-delta Ct method. All samples were evaluated in triplicate and in at least three independent experiments. For statistics, we used an ordinary one-way or two-way ANOVA test (see legends) on ΔCt values, using biological replicates as pairing factors (* p<0.05; ** p<0.01; ns = not significant).
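As an illustration of the delta-delta Ct calculation mentioned above, the following is a minimal sketch and not the authors' analysis script. It assumes a PCR efficiency close to 100% (a factor of 2 per cycle); the function name and the Ct values in the example are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta Ct method.

    Assumes about 100% amplification efficiency (doubling per cycle).
    The reference gene would be gyrA in the setup described above.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated condition with the control condition.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 4 cycles earlier
# after treatment, while the reference gene is unchanged.
print(fold_change(20.0, 18.0, 24.0, 18.0))
```

With these made-up values, the ΔCt drops from 6 to 2, so the ΔΔCt is -4 and the computed fold change is 2 to the power 4.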

Western Blot
Bacteria were lysed in lysis buffer (150 mM KCl, 1 mM DTT, 25 mM Tris pH 7.4, Complete protease inhibitor cocktail (Roche)) as described for the preparation of total RNA. Protein concentration was estimated using the BCA protein assay kit (Thermo) or Bradford reagent (Bio-Rad). Proteins were loaded on 10% TGX stain-free gels (Bio-Rad). The subsequent Western blot was performed as described in Mellin et al., 2014 [7]. Briefly, after transfer onto a nitrocellulose membrane, a mouse anti-FLAG antibody (Sigma, F7425; 1:1000 dilution) was used, followed by an HRP anti-mouse antibody (AbCys, 1:4000 dilution). The signal was revealed using ECL Prime according to the manufacturer's instructions and visualized on the ChemiDoc Touch imaging system. After membrane stripping using Restore PLUS Western Blot Stripping Buffer (Thermo 46430) according to the manufacturer's instructions, a rabbit EF-Tu antibody was used as described in [8] (1:10000 dilution), followed by an HRP anti-rabbit antibody (AbCys, 1:4000 dilution). The experiments were reproduced twice independently.

Minimum inhibitory concentration
Minimum inhibitory concentration (MIC) assays were performed as previously described [1].
Briefly, 6-8 colonies were resuspended in BHI at OD600 = 0.001 in 96-well plates and incubated in the presence of increasing concentrations of antibiotics for 48 h at 37°C without shaking. The MIC was determined as the lowest concentration that fully inhibited growth.
The experiment was reproduced at least three times independently.
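The MIC definition above (lowest tested concentration that fully inhibits growth) can be sketched as follows. This is an illustrative helper only: the function name, the input layout (concentration in µg/ml mapped to whether growth was observed) and the example values are assumptions, not data from this study.

```python
def minimum_inhibitory_concentration(growth_by_concentration):
    """Return the lowest tested concentration with no observed growth.

    growth_by_concentration maps a concentration (µg/ml) to True if
    growth was observed in that well and False if growth was fully
    inhibited. Returns None if every tested concentration allowed growth.
    """
    inhibitory = [conc for conc, grew in growth_by_concentration.items()
                  if not grew]
    return min(inhibitory) if inhibitory else None


# Hypothetical two-fold dilution series (values are made up).
wells = {0.125: True, 0.25: True, 0.5: False, 1.0: False}
print(minimum_inhibitory_concentration(wells))
```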

Polysome profiles analysis
Listeria monocytogenes was grown overnight in Brain Heart Infusion (BHI) medium (Difco) at 37°C while shaking at 200 rpm. Overnight cultures were diluted 1/100 in fresh BHI and grown at 37°C. For ribosome splitting, erythromycin (0.18 µg/ml) was added or not (untreated condition) at OD600 = 0.6 and growth was continued for 1 h.
Chloramphenicol was then quickly added to each culture (2 min at 5 mM final concentration) to stabilize polysomes. The bacteria were pelleted by centrifugation (2 min at 28000 g), washed in 1 mM chloramphenicol, and flash-frozen in liquid nitrogen. Pellets were resuspended in lysis buffer (20 mM Tris HCl pH 7.5, 100 mM NH4Cl, 10 mM MgCl2, 0.1% Nonidet P-40, 0.4% Triton X-100, 1 mM chloramphenicol) and cells were disrupted in the FastPrep apparatus, using 2 steps of 45 seconds of shaking at 6.0 m/s, with a pause of 1 min at 4°C. Equal amounts (15-35K OD260) of cell lysate were loaded on sucrose gradients (5-50%) and separated by ultracentrifugation (37K for 2 h 46 min). Samples were collected from the top (0 mm) to the bottom (80 mm) of the tube at a speed of 0.12 mm/s using the Biocomp instrument, and absorbance was read at 260 nm. Raw data were exported, the zero was set according to the inflexion point, and values were normalized to the area under the curve. The percentage of 70S over the total ribosomal fraction was determined as the area under the 70S peak divided by the total area under the curve.
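The normalization and 70S quantification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the baseline value (the "zero" set from the inflexion point) and the start and end positions of the 70S peak are taken as inputs chosen by the analyst, areas are computed with the trapezoidal rule, and all function names and numbers are hypothetical.

```python
def trapezoid_area(x, y):
    """Area under y(x) computed with the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def normalize_profile(positions, absorbance, baseline):
    """Subtract the baseline, then scale so the total area equals 1."""
    shifted = [a - baseline for a in absorbance]
    total = trapezoid_area(positions, shifted)
    return [s / total for s in shifted]


def percent_in_peak(positions, profile, start, end):
    """Percentage of the total area lying between start and end (mm)."""
    window = [(p, v) for p, v in zip(positions, profile)
              if start <= p <= end]
    xs = [p for p, _ in window]
    ys = [v for _, v in window]
    return 100.0 * trapezoid_area(xs, ys) / trapezoid_area(positions, profile)


# Toy profile (made-up values): one peak centered at 2 mm.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
a260 = [1.0, 1.0, 3.0, 1.0, 1.0]
profile = normalize_profile(positions, a260, baseline=1.0)
print(percent_in_peak(positions, profile, 1.0, 3.0))
```

In practice the peak boundaries would be set from the local minima flanking the 70S peak on each gradient; that choice is not specified in the text above and is left to the user here.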

Data
We analyzed 9078 complete genomes retrieved from NCBI RefSeq.

Identification of homologs
We analyzed the protein domain composition of the two genes (lmo0762 and lmo1296) using PFAM (last accessed 20 September 2018, https://pfam.xfam.org/). Both proteins had three well-conserved domains (GTP-bdg_M, GTP-bdg_N, MMR_HSR). We used these domains to search the genome database using hmmsearch (from the HMMER 3.1b2 package) [9]. We collected all proteins with hits better than the threshold suggested by PFAM (--cut_ga option). This resulted in 189954 hits, of which 172890 were for MMR_HSR, 8527 for GTP-bdg_M and 8537 for GTP-bdg_N.
A total of 8527 proteins had all three domains, showing that GTP-bdg_M and GTP-bdg_N almost always co-occur and that, when they do, an MMR_HSR domain is always present as well. This suggests that the domain architecture of the protein is highly conserved. To confirm these results, we also searched for the protein using blastp v2.2.19+ (threshold e < 10⁻⁵) and retrieved 8537 homologs in the genome database, of which 8527 had the three abovementioned PFAM domains. We thus used these 8527 proteins in all the following analyses.
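Selecting the proteins that carry all three domains amounts to a set intersection over the per-domain hit lists. The sketch below assumes the hmmsearch results have already been reduced to (protein identifier, domain name) pairs that passed the gathering threshold; the function name and the protein identifiers in the example are hypothetical.

```python
REQUIRED_DOMAINS = {"GTP-bdg_N", "GTP-bdg_M", "MMR_HSR"}


def proteins_with_all_domains(hits, required=REQUIRED_DOMAINS):
    """Return the set of proteins that have a hit for every required domain.

    hits is an iterable of (protein_id, domain_name) pairs.
    """
    domains_by_protein = {}
    for protein_id, domain in hits:
        domains_by_protein.setdefault(protein_id, set()).add(domain)
    # Keep proteins whose domain set contains all required domains.
    return {protein for protein, domains in domains_by_protein.items()
            if required <= domains}


# Made-up example: only protA carries the three domains.
example_hits = [
    ("protA", "GTP-bdg_N"), ("protA", "GTP-bdg_M"), ("protA", "MMR_HSR"),
    ("protB", "MMR_HSR"),
    ("protC", "GTP-bdg_N"), ("protC", "MMR_HSR"),
]
print(proteins_with_all_domains(example_hits))
```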

Comparative genomics
We grouped all genomes from the database by species. Within each species, we first filtered out genomes that were either too divergent to belong to the species or very similar to genomes already retained (redundant). For this, we computed genetic distances between all genomes within a species using Mash v2.0 (default parameters, [10]). The first sequenced genome of a species was defined as the reference genome of the species. If a genome was more than 6% divergent from the reference genome, we removed it from further analyses, since at this level of divergence the strain is probably not from the same species.
We then proceeded through the chronological list of complete genomes and added to the set of genomes under study every genome whose distance to all genomes already in the set was higher than 0.0001. This removes redundant genomes resulting from multiple sequencing efforts of a similar strain. After this process, we obtained a list of 163 species for which we had at least five non-redundant complete genomes.
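The two-step filter described above can be sketched as a greedy selection. This is an illustration under assumptions, not the authors' pipeline: it treats "6% divergent" as a Mash distance of 0.06, places the reference genome in the retained set from the start, and takes the pairwise distances as a user-supplied function rather than calling Mash. All names are hypothetical.

```python
def select_nonredundant(genomes, distance,
                        max_divergence=0.06, min_distance=0.0001):
    """Greedy selection of non-redundant genomes within one species.

    genomes: genome identifiers in chronological order; the first one
             is taken as the reference genome of the species.
    distance: function (genome_a, genome_b) -> pairwise distance,
              for example precomputed Mash distances.
    """
    reference = genomes[0]
    kept = [reference]
    for genome in genomes[1:]:
        # Step 1: drop genomes too divergent from the reference.
        if distance(genome, reference) > max_divergence:
            continue
        # Step 2: keep the genome only if it is not redundant with
        # any genome already retained.
        if all(distance(genome, other) > min_distance for other in kept):
            kept.append(genome)
    return kept


# Made-up distances: B duplicates A, D is too divergent.
toy = {frozenset("AB"): 0.00005, frozenset("AC"): 0.02,
       frozenset("AD"): 0.08, frozenset("BC"): 0.02,
       frozenset("BD"): 0.08, frozenset("CD"): 0.07}
print(select_nonredundant(list("ABCD"), lambda a, b: toy[frozenset((a, b))]))
```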
We computed the pan-genome of each species using mmseqs2 (version 7319ccbb3ec80cbe6fa9d1b4c9527abea0e11e5c) [11]. For this, we computed pairwise similarities between all proteins and clustered them (--min-seq-id 0.80). We then computed the persistence of a given gene family in the pan-genome as the fraction of all genomes in the species containing at least one gene from that gene family (Fig. S10 and Dataset S3).
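The persistence measure defined above can be sketched as follows, assuming the clustering output has already been parsed into a mapping from gene family to its member genes, each tagged with its genome of origin. The function name, the input layout and the identifiers in the example are hypothetical.

```python
def family_persistence(family_members, n_genomes):
    """Fraction of genomes that contain at least one gene of each family.

    family_members: dict mapping a family identifier to an iterable of
                    (genome_id, gene_id) pairs.
    n_genomes: number of non-redundant genomes in the species.
    """
    persistence = {}
    for family, members in family_members.items():
        # Count each genome once, even if it has several paralogs.
        genomes_with_family = {genome_id for genome_id, _ in members}
        persistence[family] = len(genomes_with_family) / n_genomes
    return persistence


# Made-up example with 4 genomes: fam1 is present in 2 of them
# (g1 carries two copies), fam2 in all 4.
families = {
    "fam1": [("g1", "a"), ("g1", "b"), ("g2", "c")],
    "fam2": [("g1", "d"), ("g2", "e"), ("g3", "f"), ("g4", "g")],
}
print(family_persistence(families, n_genomes=4))
```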

Phylogenetic trees
The dataset of 8527 proteins is too large for efficient phylogenetic analysis.
Furthermore, it includes many proteins that are very similar because they correspond to homologs of species for which many genomes are available in the database (e.g., 99 for Listeria monocytogenes). We therefore started by reducing the redundancy of the dataset.
We used mmseqs2 to cluster the proteins at 80% identity (dataset_80, 1216 clusters, options -s 7.5 --min-seq-id 0.80) and at 60% identity (dataset_60, 347 clusters, options -s 7.5 --min-seq-id 0.60), using a sensitivity setting considered to lead to very accurate searches. We picked the representative of each cluster in each case.
We identified the genomes encoding at least two copies of the protein. For each species, we kept the two copies for one of the strains (the reference strain if it contained the copy, otherwise a random strain). We then added all these pairs of proteins to the two datasets above (dataset_80 and dataset_60) to ensure that the phylogenetic reconstruction contained the two copies of each pair of duplicates. This resulted in datasets with 1381 (dataset_80) and 594 (dataset_60) protein sequences.
We made a phylogenetic reconstruction for each of the two datasets in a similar way.
First, we made multiple alignments with MAFFT v7.407 (with the sensitive option --maxiterate 1000) [12]. We then collected the informative sites in the multiple alignments using trimAl v1.2rev59 (option -gappyout) [13]. These were used to make a phylogenetic reconstruction with IQ-TREE v1.6.7 (options -m TEST -bb 1000). The best-fit models were LG+I+G4 (dataset_60) and LG+F+I+G4 (dataset_80). The dataset_60 reconstruction was used to analyze the data because it had stronger ultrafast bootstrap support and fewer taxa. Nevertheless, the key results were common to the two reconstructions.

Analysis of conserved neighborhoods
We analyzed the neighborhood of the genes in the clades of lmo0762 and lmo1296 in the phylogenetic tree computed with dataset_60. We recovered the genes of the clades, corresponding respectively to 100 (lmo0762) and 64 (lmo1296) genes. For each gene, we recovered the 10 genes on each side of the gene in the replicon. We then clustered all these genes by sequence similarity (minimum 40% identity) for each clade separately using mmseqs2 (-s 7.5 --min-seq-id 0.4). We analyzed all families with more than 10 genes.

[Figure legend fragment] ... chloramphenicol. MIC values after 48 h incubation are also indicated within each box (µg/ml). The crossed boxes correspond to strains that carry the CamR gene due to pAD insertion and were not considered in this experiment. Strain labels: WT, anti-anti, anti, ATG.