Three-component systems represent a common pathway for extracytoplasmic addition of pentofuranose sugars into bacterial glycans

Significance

Pathogenic bacteria produce diverse polysaccharides that are important for virulence and can be exploited in the development of vaccines and immunotherapies. Resolving the mechanisms of polysaccharide biosynthesis is vital for understanding the molecular basis of antigenic diversity and for exploiting the pathways in glycoengineering applications. Here, we elucidate the enzymatic origin of α-ribofuranose residues in a prototypical system from Gram-negative lipopolysaccharide O antigens and reveal unanticipated relationships to processes used in the assembly of mycobacterial cell walls. The periplasmic glycosylation pathway can introduce different pentofuranoses through the action of different epimerases. The identification of this pathway also expands the toolbox of enzymes that may be deployed in glycoengineering and recombinant production of polysaccharides with desired structures and properties.


Detailed experimental procedures
Construction of recombinant plasmids and mutant strains. KOD Hotstart polymerase (Novagen) was used to generate linear fragments by PCR. Oligonucleotide primers (obtained from Integrated DNA Technologies) are listed in Table S4. PCR products and plasmids were purified with the GeneJET purification kit and GeneJET plasmid miniprep kit, respectively (Fisher Scientific). Site-directed mutants were constructed by inverse PCR of plasmids using primers that included the desired mutation, designed in NEBaseChanger (New England Biolabs). The linear fragment was then treated with DpnI, T4 polynucleotide kinase, and T4 DNA ligase, according to the manufacturer's recommendations, and transformed into E. coli DH5α. The sequences of plasmid constructs were confirmed at the Advanced Analysis Centre, Genomics Facility, University of Guelph.
Antiserum production. Rabbit antiserum recognizing C. youngae O1 was produced using formalin-killed C. youngae O1 PCM1492 cells as the antigen. Cells were suspended in 0.85% (w/v) NaCl at ~10^8 cfu/ml and mixed 1:1 with Freund's incomplete adjuvant (Sigma). Injections were performed intramuscularly in a New Zealand white rabbit every two weeks for 6 weeks total. After 6 weeks, blood was collected and the antiserum was stored at -80°C. Animal handling and immunization were performed at the University of Guelph Central Animal Facility. To prepare antiserum specific for the Ribf modification, the O1 antiserum was adsorbed using C. youngae O1 PCM1492 Δorf12 cells. Cells from 200 mL of overnight culture were collected, resuspended in 16 mL sterile PBS, and divided into 4 equal aliquots. Cells were collected from one aliquot, and the pellet was resuspended in 5.2 mL of antiserum and incubated at 37°C for 1 hour before the cells were removed. The process was repeated with the remaining 3 aliquots. The adsorbed antiserum was then frozen in aliquots at -20°C.

SDS-PAGE and immunoblotting.
Proteinase K-digested whole-cell lysates were used for examination of LPS by SDS-PAGE as described previously (1). An amount of cells equivalent to 1 A600nm unit was collected and resuspended in loading buffer, heated at 100°C for 10 minutes, and subsequently treated with proteinase K at 55°C for 1 hour. Samples were separated on 12% gels with Tris-glycine buffer. LPS was visualized by silver staining or by western immunoblotting after transfer to nitrocellulose membranes (Protran, GE Healthcare). Transfer was performed using a constant current of 200 mA for 45 minutes in buffer containing 25 mM Tris, 150 mM glycine, and 20% (v/v) methanol. Membranes were blocked in 5% skim milk (BD Difco) prepared in TBST (10 mM Tris-Cl, pH 7.5, 150 mM NaCl, 0.005% (v/v) Tween 20). Rabbit anti-O1 and Ribf-specific antisera were used at 1:1000 dilutions in TBST containing 5% (w/v) skim milk powder. Goat anti-rabbit-conjugated alkaline phosphatase (1:3000) (Cedar Lane) was used as a secondary antibody and detection was performed using nitroblue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate (Roche Applied Science). Proteins were analyzed by mixing samples with an equal volume of loading buffer and separating by SDS-PAGE as above. Detection was achieved by Coomassie blue staining with the Pierce Power Stainer. The SeeBlue Plus2 protein ladder provided molecular weight markers (Invitrogen).
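The normalization to "1 A600nm unit" of cells can be illustrated with a short calculation. This is a minimal sketch that assumes the common convention that one unit corresponds to 1 mL of culture at A600nm = 1.0; the text does not define the unit explicitly, and the function name is hypothetical.

```python
def harvest_volume_ml(a600: float, units: float = 1.0) -> float:
    """Culture volume (mL) holding `units` A600 units of cells.

    Assumption (not stated in the methods): 1 unit = 1 mL of culture
    at A600 = 1.0, so volume * A600 = units.
    """
    if a600 <= 0:
        raise ValueError("A600 must be positive")
    return units / a600


if __name__ == "__main__":
    # Example readings are illustrative only.
    for reading in (0.5, 1.0, 2.5):
        volume = harvest_volume_ml(reading)
        print(f"A600 = {reading:.2f}: harvest {volume:.2f} mL")
```

Under this convention, a denser culture needs proportionally less volume, which keeps the amount of LPS loaded per lane comparable between strains.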
Purification of OPS. LPS was purified using the hot water-phenol method from cells grown in 10 L cultures (2). After phase separation, the LPS-containing aqueous fraction was dialyzed against water to remove residual phenol. Proteins and nucleic acids were removed by precipitation using cold, aqueous trichloroacetic acid followed by centrifugation at 12,000 × g for 20 min. The supernatant was then dialyzed against water until neutral pH was achieved, the solution was concentrated in a rotary evaporator, and the LPS was lyophilized. OPS was isolated by hydrolysis of purified LPS at 100 °C in 2% (v/v) acetic acid, followed by centrifugation to remove the lipid precipitate. The carbohydrate-containing supernatant was then separated on a Sephadex G-50 superfine column (2.5 cm × 75 cm) in 50 mM pyridinium acetate buffer (pH 4.5) at a flow rate of 0.6 ml/min. Elution was monitored with a Smartline 2300 refractive index detector (Knauer) and fractions were collected at 10 min intervals. Fractions corresponding to the OPS peak were combined and concentrated.
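The column dimensions, flow rate, and collection interval given above fix the bed volume and the volume of each fraction. The following sketch works through that arithmetic; it assumes the stated 2.5 cm × 75 cm refers to the internal diameter and packed bed length, and the function names are illustrative.

```python
import math


def bed_volume_ml(diameter_cm: float, length_cm: float) -> float:
    """Geometric volume of a cylindrical column bed (1 cm^3 = 1 mL)."""
    radius_cm = diameter_cm / 2
    return math.pi * radius_cm ** 2 * length_cm


def fraction_volume_ml(flow_ml_per_min: float, interval_min: float) -> float:
    """Volume collected in each fraction at a constant flow rate."""
    return flow_ml_per_min * interval_min


if __name__ == "__main__":
    bed = bed_volume_ml(2.5, 75)           # Sephadex G-50 column in the text
    fraction = fraction_volume_ml(0.6, 10)  # 0.6 ml/min, 10 min intervals
    print(f"Bed volume: {bed:.1f} mL")
    print(f"Fraction volume: {fraction:.1f} mL")
    print(f"Fractions per bed volume: {bed / fraction:.1f}")
```

With these values each fraction is 6 mL and one geometric bed volume is roughly 368 mL, so a full bed volume spans about 61 fractions.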
Membrane preparation and protein purification for in vitro investigations. To prepare membranes for in vitro reactions, cultures were grown to an A600nm of 0.6 and expression of plasmid-encoded proteins was induced overnight at 18°C following addition of 0.2% ʟ-arabinose. Cells were collected by centrifugation and resuspended in buffer containing 50 mM HEPES, 500 mM NaCl at pH 7.4, prior to lysis by sonication. Lysates were cleared by centrifugation at 12,000 × g for 20 min, and membranes were collected by centrifugation of the supernatant at 100,000 × g for 1 h. Membranes were resuspended in buffer A with 150 mM NaCl and total protein concentration was measured using the DC protein assay (Bio-Rad). ORF12 503-652-His6 was purified by passing cleared cell lysate through a column with Ni-NTA resin. The column was washed using buffer A with 10 mM imidazole, followed by a second wash with buffer A containing 30 mM imidazole. Protein was eluted using buffer A containing 250 mM imidazole. Purified ORF12 503-652-His6 was buffer exchanged into buffer B (50 mM HEPES, 150 mM NaCl at pH 7.4) and concentrated with a Vivaspin 20 with a 10 kDa molecular weight cut-off. Protein concentration was estimated using a NanoDrop with a theoretical extinction coefficient obtained from ProtParam (https://web.expasy.org/protparam/).
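Estimating protein concentration from absorbance and a theoretical extinction coefficient is an application of the Beer-Lambert law, A = ε·c·l. The sketch below shows the conversion to mg/mL. The extinction coefficient and molecular weight in the example are placeholder values chosen for illustration (the values for ORF12 503-652-His6 are not given in the text), and a 1 cm path-length-normalized A280 reading is assumed.

```python
def protein_conc_mg_per_ml(
    a280: float,
    ext_coeff_per_m_per_cm: float,
    mol_weight_da: float,
    path_cm: float = 1.0,
) -> float:
    """Protein concentration from A280 via Beer-Lambert (A = eps * c * l).

    Molar concentration (mol/L) times molecular weight (g/mol) gives g/L,
    which is numerically equal to mg/mL.
    """
    if ext_coeff_per_m_per_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (ext_coeff_per_m_per_cm * path_cm)
    return molar * mol_weight_da


if __name__ == "__main__":
    # Placeholder inputs, NOT values from the paper.
    conc = protein_conc_mg_per_ml(a280=0.5,
                                  ext_coeff_per_m_per_cm=20000,
                                  mol_weight_da=18000)
    print(f"Estimated concentration: {conc:.2f} mg/mL")
```

Because the extinction coefficient is computed from sequence alone, the result is an estimate; it depends on the construct sequence entered into ProtParam, including the His6 tag.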
For ORF12 reactions, 2 ((2Z,6Z)-farnesylP-β-D-Ribf) or 4 ((2Z,6Z)-farnesylP-β-D-Xylf) was used as the donor, with 3 as the acceptor (Figures S13, S14, S15). Reactions contained aliquots of membrane containing 20 μg of total protein. Reaction mixtures were incubated at 30 °C for 18 h. Reactions were stopped by the addition of an equal volume of cold acetonitrile and precipitated protein was removed by centrifugation at 12,000 × g for 5 min. For purification of farnesylP-Ribf for NMR, a large-scale reaction was performed in a 6.6 mL volume. The reaction mixture was dried, resuspended in methanol, and separated using silica gel (pore size 60 Å, 230-400 mesh, 40-63 µm particle size), eluting in 1:1 methanol-ethyl acetate.
The reaction mixtures were separated by normal phase HPLC as described previously (4,5), using an Agilent 1260 Infinity II LC system. A 10 μL sample was injected and separated with a GLYCOSEP N column (4.6 × 250 mm, Prozyme). Solvent A contained 10 mM ammonium formate pH 4.4 in 80% acetonitrile, solvent B contained 30 mM ammonium formate pH 4.4 in 40% acetonitrile, and solvent C contained 0.5% formic acid. The separation sequence was as follows: a linear gradient of 100% A to 100% B over 160 min (0.4 ml/min), followed by a 2 min gradient of 100% B to 100% C; returning to 100% A over 2 min and holding for 15 min (1 ml/min); followed by 0.4 ml/min for 5 min in A. The column temperature was 30°C and elution was monitored at 260 nm. All HPLC analysis was performed using OpenLAB revision A.02.16 (Agilent).
TLC was performed by spotting 3 μL of reaction mixture onto an aluminum foil silica gel 60 F254 TLC plate (EMD Millipore). TLC plates were developed using ethyl acetate:butanol:water:acetic acid (25:40:20:12.5). The plates were dried, dipped in a solution of 3.7% (v/v) sulfuric acid and 1.1% (v/v) glacial acetic acid made in anhydrous ethanol, and then heated using a hairdryer until dark blue spots appeared. TLC plates were then imaged with a ChemiDoc (BioRad).
Mass spectrometry. LC-MS was performed with an Agilent 1260 HPLC liquid chromatograph interfaced with an Agilent UHD 6530 Q-TOF mass spectrometer in the University of Guelph Advanced Analysis Centre. In vitro reaction mixtures were filtered using a Microcon-10 kDa centrifugal filter, or extracted with an equal amount of chloroform followed by centrifugation. Samples were separated on a C18 column (Agilent Poroshell 120, EC-C18 50 mm × 3.0 mm, 2.7 μm) with the following solvents: water with 0.1% formic acid (A) and acetonitrile with 0.1% formic acid (B). The mobile phase gradient was as follows: initial conditions of 10% B for 1 min; increasing to 100% B in 29 min; column wash at 100% B for 5 min; followed by 20 min re-equilibration. The flow rate was maintained at 0.4 mL/min. The mass spectrometer electrospray capillary voltage was maintained at 4.0 kV, and the drying gas temperature at 250 °C, with a flow rate of 8 L/min. Nebulizer pressure was 30 psi and the fragmentor was set to 160. Nozzle, skimmer, and octopole RF voltages were set at 1000 V, 65 V and 750 V, respectively. Nitrogen (purity >98%) was used as nebulizing, drying, and collision gas. The mass-to-charge ratio was scanned across the m/z range of 50-1500 in 4 GHz (extended dynamic range) positive ion mode. The acquisition rate was set at 2 spectra/s. The mass axis was calibrated using the Agilent tuning mix HP0321 (Agilent Technologies) prepared in acetonitrile. Sample injection volume was 10 μL.
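The LC-MS mobile phase gradient described above can be written as a piecewise function of run time, which makes the timing of each step explicit. This is an illustrative sketch, not instrument method code. It assumes the 29 min ramp starts at 1 min, that the switch back after the wash is instantaneous, and that re-equilibration is held at the initial 10% B; none of these details is stated in the text.

```python
FLOW_ML_PER_MIN = 0.4   # constant flow rate given in the methods
RUN_TIME_MIN = 55.0     # 1 (hold) + 29 (ramp) + 5 (wash) + 20 (re-equilibration)


def percent_b(t_min: float) -> float:
    """Percent solvent B (acetonitrile, 0.1% formic acid) at run time t_min."""
    if not 0 <= t_min <= RUN_TIME_MIN:
        raise ValueError("time outside the programmed run")
    if t_min <= 1:                       # initial hold at 10% B
        return 10.0
    if t_min <= 30:                      # linear ramp 10% -> 100% B over 29 min
        return 10.0 + (t_min - 1) * (100.0 - 10.0) / 29
    if t_min <= 35:                      # column wash at 100% B
        return 100.0
    return 10.0                          # re-equilibration (assumed 10% B)


def total_solvent_ml() -> float:
    """Total mobile phase delivered over one run."""
    return FLOW_ML_PER_MIN * RUN_TIME_MIN


if __name__ == "__main__":
    for t in (0, 1, 15.5, 30, 33, 40, 55):
        print(f"t = {t:>4} min: {percent_b(t):5.1f}% B")
    print(f"Solvent per run: {total_solvent_ml():.1f} mL")
```

Under these assumptions a single injection occupies the instrument for 55 min and consumes about 22 mL of mobile phase.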
Nuclear magnetic resonance spectroscopy. NMR analysis of OPS and in vitro reaction products was performed at the Advanced Analysis Centre, University of Guelph. All OPS samples subjected to NMR analysis were deuterium exchanged by lyophilizing twice with 99.0% D2O. Spectra were recorded in 99.9% D2O with chemical shifts referenced to a 3-trimethylsilylpropanoate-2,2,3,3-d4 (δH 0 ppm, δC -1.6 ppm) internal standard. NMR spectra were obtained at 50°C. NMR analysis of farnesylP-Ribf was performed in MeOD at 22°C, with trace MeOH as an internal standard.
CBM-mediated binding assay. LPS pulldown assays were performed as described previously (6). Briefly, 1 ml reaction volumes containing 200 μg of LPS and 200 μg of ORF12 503-652 in buffer B (25 mM Bis-Tris, pH 7.0, 250 mM NaCl) were incubated on a rotary shaker for 30 minutes at room temperature. This mixture was added to a 50 μL aliquot of PureProteome Nickel Magnetic Beads (Millipore), which were pre-equilibrated with buffer B, and incubated for 30 min at room temperature on a rotary shaker. The beads were collected with a magnet and washed three times with 500 μL of buffer B. Protein was eluted stepwise from the beads using three washes with 100 μL buffer B containing 500 mM imidazole. The loading, wash and elution fractions were examined by silver-stained SDS-PAGE and western immunoblotting.
Bioinformatics analyses. Multiple sequence alignments were performed in ClustalW (7) and TCoffee (8) (where indicated), and visualized using ESPript (9). Transmembrane helix prediction was performed using DeepTMHMM (10) and structural models were made using AlphaFold through the ColabFold notebook (11,12). All AlphaFold models were individually assessed and had pLDDT values greater than 80 across the majority of each protein, classifying them as confident models. Structural conservation was shown using ConSurf (13) and 3D structure alignments were performed using Dali (14). Protein visualization was performed in PyMOL v2.5.4.

Figure S1: NMR spectroscopic analysis of OPS from C. youngae O1 and O2 mutants and the corresponding mutants complemented with plasmids carrying the relevant genes. a, overlay of 1H,13C HSQC spectra from C. youngae O1 Δorf12 (blue) and complement (red). b, overlay of 1H,13C HSQC spectra from C. youngae O1 Δorf15 (blue) and the plasmid-complemented mutant (red). Minor peaks observed in the complemented orf12 and orf15 mutants reflect the non-stoichiometric Ribf addition reported previously (15,16), and peak integration comparing the backbone repeat unit and the Ribf confirmed that the side-chain Ribf is attached to 50% of the main-chain repeat units. The proposed methylphosphate modification is labelled on each 1H spectrum; this signal appears as a doublet because the methyl protons couple to the neighbouring NMR-active 31P nucleus of the phosphate, which splits the peak into n+1 (two) peaks, where n is the number of neighbouring NMR-active atoms. c, overlay of 1H,13C HSQC spectra from C. youngae O2 Δorf13 (red) and complement (blue). Minor peaks were observed, consistent with the non-stoichiometric modification (creating unmodified repeating units) as well as incomplete epimerization of Xylf to Ribf, resulting in a small amount of Ribf-modified glycan chains in the transformant.

(lacking the CBM) to restore the Ribf side chain epitope in the orf12 mutant. Whole-cell lysates were separated by PAGE and analyzed with silver staining or western immunoblotting with antiserum specific for the Ribf side chain. c, the activity of ORF12 1-502 was also tested in vitro. Reactions were conducted using synthetic farnesyl-phosphoribose (2) and BODIPY-tagged O1 repeat unit mimic (3), and products were separated using HPLC with a GlycoSep N column and normal phase separation.

The polysaccharide repeat-unit structures are shown and the genes encoding enzyme homologs are coordinated by color (PPPRS in salmon, GT-C in purple, transporter in pink, and Xylf/Araf epimerases in orange and green, respectively). A genomic sequence was available for Agrobacterium tumefaciens (since renamed Rhizobium radiobacter) DSM 30205=NCTC 13543 (GCA_900455815), but orthologs of the three-component system proteins were not identifiable. OPS and EPS designate O-polysaccharide and exopolysaccharide, respectively. The glycan from Devosia submarinarum was a component of extracted "cell wall polysaccharides" (18) and is structurally related to mycobacterial arabinan and P. aeruginosa O-glycans, so may share similar extracytoplasmic biosynthesis. Sugars are represented using the symbol nomenclature for glycans (SNFG) (19). Abbreviated sugars are as follows: Ribf (ribofuranose), Xylf (xylofuranose), Araf (arabinofuranose), Xylp (xylopyranose), Arap (arabinopyranose), Man (mannose), Galp (galactopyranose), Glc (glucose), Galf (galactofuranose), ManNAc (N-acetylmannosamine), GlcNAc (N-acetylglucosamine), GlcN (glucosamine), GalNAc (N-acetylgalactosamine), ᴅ-Rha (ᴅ-rhamnose), ʟ-6d-Tal (6-deoxy-talose), R-Lac

General Methods. All reagents were purchased from commercial sources and used without further purification unless otherwise noted. Solvents used in reactions were purified by passage through alumina and copper columns under argon. Reactions were monitored by TLC on silica gel G-25 F254 (0.25 mm) and TLC spots were detected under UV light and by charring after staining with p-anisaldehyde in ethanol, acetic acid, and H2SO4. Column chromatography was performed on silica gel 60 (40-60 μm) unless otherwise noted. 1H NMR spectra were recorded at 500 MHz, 600 MHz or 700 MHz and were referenced to the solvent residual proton signal of CDCl3 (7.26 ppm), D2O (4.79 ppm) or CD3OD (3.30 ppm). 13C{1H} NMR spectra were recorded at 126 MHz, 151 MHz or 176 MHz and were referenced to CDCl3 (77.06 ppm), external acetone (31.07 ppm, D2O), or CD3OD (49.00 ppm).
31P NMR spectra were 1H decoupled and were recorded at 202 MHz. Protons and carbons on oligosaccharides were labelled starting from the reducing end with non-primed atoms and proceeding towards the non-reducing end with a prime (') added for each successive carbohydrate residue. Peak assignments were made based on 2D NMR analysis (1H-1H COSY, HSQC and HMBC). Optical rotations were measured at 22 ± 2 °C at the sodium D line (589 nm) in a microcell (10 cm, 1 mL) and are in units of deg·mL·(dm·g)^-1. Electrospray ionization (ESI) mass spectra were recorded using samples in mixtures of THF with CH3OH and added NaCl. For high resolution mass determination, spectra were obtained by voltage scan over a narrow range at a resolution of approximately 10,000.

A solution of S1 (1 g, 3.14 mmol, 1.0 equiv) and p-thiocresol (585.0 mg, 4.71 mmol, 1.5 equiv) in dry CH2Cl2 (30 mL) was cooled to 0 °C and then BF3·OEt2 (0.5 mL, 4.71 mmol) was added dropwise. After stirring for 1 h at rt, the reaction was cooled to 0 °C before satd aq sodium bicarbonate was added to quench the excess BF3·OEt2. The mixture was concentrated, redissolved in EtOAc, and washed twice with satd aq sodium bicarbonate and then with brine.
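The reagent quantities in the procedure above can be cross-checked by converting masses to millimoles and dividing by the amount of the limiting reagent. The sketch below does this for p-thiocresol relative to S1. The molecular weight of p-thiocresol (C7H8S, 124.20 g/mol) is a standard value supplied here rather than taken from the text, and the function names are illustrative.

```python
MW_P_THIOCRESOL = 124.20  # g/mol for C7H8S (standard value, not from the text)


def millimoles(mass_mg: float, mol_weight: float) -> float:
    """Amount in mmol from a mass in mg and a molecular weight in g/mol."""
    return mass_mg / mol_weight


def equivalents(amount_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents relative to the limiting reagent."""
    return amount_mmol / limiting_mmol


if __name__ == "__main__":
    s1_mmol = 3.14                                   # stated for 1 g of S1
    thiol_mmol = millimoles(585.0, MW_P_THIOCRESOL)  # 585.0 mg p-thiocresol
    print(f"p-thiocresol: {thiol_mmol:.2f} mmol, "
          f"{equivalents(thiol_mmol, s1_mmol):.2f} equiv")
    # Molecular weight of S1 implied by the stated mass and amount.
    print(f"Implied MW of S1: {1000.0 / s1_mmol:.1f} g/mol")
```

The calculated amount for p-thiocresol matches the stated 4.71 mmol and 1.5 equiv; the same functions can be applied to the other reagents if their molecular weights and densities are supplied.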

Figure S2: Overlay of the active site of the solved structure of the DPPRS from Mycobacterium tuberculosis (Rv3806c) with the AlphaFold model of C. youngae O1 PPPRS. Overlay of the AlphaFold model was performed using Dali with the PRPP-bound state of Rv3806c (PDB: 8j8k). The lipid donor state (PDB: 8j8j) was aligned to the PRPP-bound state in PyMOL to position the decaprenol-phosphate (DP) donor in the active site in a single model with PRPP. Active site residues are shown as green sticks for Rv3806c, and the corresponding residues from the ORF15 C model are shown as pink sticks and labelled in brackets. Mg2+ is shown as a sphere.

Figure S4: Models of MATE and SMR-family transporters. The models are shown as homo/heterodimers with each chain coloured differently. The structure of EmrE has been solved (PDB: 7MH6). EmrE adopts a head-to-tail dimerization of subunits, a hallmark of the MATE transporter family. ORF14 (AWU66545.1) is predicted to be a MATE transporter with similarity to ArnEF (Q47377, P76474). AlphaFold modelling of ORF14 and ArnEF predicts an anti-parallel organization of subunits similar to the EmrE prototype. AlphaFold models of the mycobacterial decP-Araf transporter (Rv3789; P9WMS9) and the undP-Glc transporter (GtrA; P77682) adopt an SMR fold, lacking the characteristic antiparallel orientation seen in the MATE family.

Figure S10: Silver-stained PAGE of LPS in whole-cell lysates of C. youngae O2 Δorf13 and the mutant complemented using a plasmid carrying orf13. The observed LPS phenotype of identical length is consistent with maintenance of the side-chain modification and with the promiscuity of ORF12.

Figure S12: Genetic loci encoding candidate three-component α-pentofuranose side chain addition systems. These are present in clusters directing the synthesis of glycans with unknown structure. C. youngae O1 ORF15 was used as a seed in a tBLASTn search. Sequences flanking and nearby orf15 homologs were scanned for genes predicted to encode a transporter and a GT-C enzyme. Notably, most GT-C genes were labelled as hypothetical proteins, but the AlphaFold models of close homologs found in UniProt allowed confident assignments of GT-C gene products. In one example (in Caudoviricetes) the PRP domain is separated from the PPPRS. The transporters in the systems varied, containing either a MATE family transporter (like C. youngae O1) or an SMR family transporter. Finally, several more elaborate systems were identified which contained multiple different epimerases and GT-C enzymes, suggesting these systems may be capable of more elaborate side-chain additions. Accessions for the identified systems are as follows: Caudoviricetes sp. ct9Mr3 (BK026208.1), Burkholderia multivorans 2010Ycf41 (CP090754.1), Escherichia coli RHB45-C06 (CP099034.1), Rhizobium sp. CCGE531 (CP032687.1), Xanthomonas citri CFBP 6533 (CP110301.1), Bacillus sp. strain NP157 (CP076546.1).

a Restriction endonuclease cut sites are underlined.
b DNA sequences encoding protein tags are bolded.
c Mutations for alanine variants are in lower case.

Supplementary Methods - Synthesis of 2-4.

Table S1: Predicted function of proteins encoded by C. youngae O1 OPS cluster based on similarity to characterized enzymes.

Table S2: 1H and 13C NMR chemical shifts for C. youngae mutants and complements