Genomic analysis of siderophore β-hydroxylases reveals divergent stereocontrol and expands the condensation domain family
Edited by J. Martin Bollinger Jr., The Pennsylvania State University, University Park, PA, and accepted by Editorial Board Member Stephen J. Benkovic August 23, 2019 (received for review February 22, 2019)
This article has a Correction.
Significance
Bacteria produce siderophores to sequester iron(III). Genome mining of nonribosomal peptide synthetases predicts partial structures of peptidic siderophores; however, current tools cannot reliably predict which aspartate and histidine residues will be hydroxylated to form bidentate chelating groups, nor the resulting stereochemistry. We identified 2 functional subtypes of nonheme Fe(II)/α-ketoglutarate–dependent aspartyl β-hydroxylases in siderophore biosynthetic gene clusters and one type of histidyl β-hydroxylase. Stand-alone genes encode one class of aspartyl β-hydroxylases and the histidyl β-hydroxylases, while the second class of aspartyl β-hydroxylases is encoded within a domain of a nonribosomal peptide synthetase (NRPS) gene. Each aspartyl β-hydroxylase subtype effects distinct diastereoselectivity. Mapping the β-OHAsp diastereomers in siderophores to the phylogenetic tree of β-hydroxylases enables prediction of β-OHAsp stereochemistry in silico.
Abstract
Genome mining of biosynthetic pathways streamlines discovery of secondary metabolites but can leave ambiguities in the predicted structures, which must be rectified experimentally. Through coupling the reactivity predicted by biosynthetic gene clusters with verified structures, the origin of the β-hydroxyaspartic acid diastereomers in siderophores is reported herein. Two functional subtypes of nonheme Fe(II)/α-ketoglutarate–dependent aspartyl β-hydroxylases are identified in siderophore biosynthetic gene clusters, which differ in genomic organization—existing either as fused domains (IβHAsp) at the carboxyl terminus of a nonribosomal peptide synthetase (NRPS) or as stand-alone enzymes (TβHAsp)—and each directs opposite stereoselectivity of Asp β-hydroxylation. The predictive power of this subtype delineation is confirmed by the stereochemical characterization of β-OHAsp residues in pyoverdine GB-1, delftibactin, histicorrugatin, and cupriachelin. The l-threo (2S, 3S) β-OHAsp residues of alterobactin arise from hydroxylation by the β-hydroxylase domain integrated into NRPS AltH, while l-erythro (2S, 3R) β-OHAsp in delftibactin arises from the stand-alone β-hydroxylase DelD. Cupriachelin contains both l-threo and l-erythro β-OHAsp, consistent with the presence of both types of β-hydroxylases in the biosynthetic gene cluster. A third subtype of nonheme Fe(II)/α-ketoglutarate–dependent enzymes (IβHHis) hydroxylates histidyl residues with l-threo stereospecificity. A previously undescribed, noncanonical member of the NRPS condensation domain superfamily is identified, named the interface domain, which is proposed to position the β-hydroxylase and the NRPS-bound amino acid prior to hydroxylation. Through mapping characterized β-OHAsp diastereomers to the phylogenetic tree of siderophore β-hydroxylases, methods to predict β-OHAsp stereochemistry in silico are realized.
The vast majority of life requires iron as an enzyme cofactor. Because Fe(III) is nearly insoluble under aerobic physiological conditions, bacteria have evolved a variety of mechanisms to meet their metabolic iron needs, including the biosynthesis of small-molecule, high-affinity iron chelators called siderophores (1). Some peptidic siderophores contain β-hydroxyaspartate (β-OHAsp), which provides bidentate OO′ coordination to Fe(III) (2). The first structural determination of a β-OHAsp–containing siderophore came with the crystallization of ferric pyoverdine from Pseudomonas B10 (SI Appendix, Fig. S1) in 1981 (3). Since then, a variety of peptidic siderophores with β-OHAsp have been characterized from both marine and terrestrial bacteria. Like other α-hydroxycarboxylate ligands, β-OHAsp bound to Fe(III) can undergo photoinduced reduction of Fe(III) to Fe(II) accompanied by oxidative decarboxylation of the ligand (2, 4).
Far fewer siderophores contain the chelating group β-hydroxyhistidine (β-OHHis). The first reported example is pyoverdine pf244 of Pseudomonas fluorescens 244 (SI Appendix, Fig. S1) (5). β-OHHis has since been identified in the peptide of pyoverdines from a variety of pseudomonads (6–8). Some Pseudomonas strains produce the fatty-acyl peptidic siderophores corrugatin, ornicorrugatin, or histicorrugatin (SI Appendix, Fig. S1), which contain both β-OHHis and β-OHAsp (9–11). One β-hydroxyasparagine–containing siderophore has been reported, pyoverdine VLB120 from Pseudomonas taiwanensis VLB120 (SI Appendix, Fig. S1) (12, 13).
Peptidic siderophores are synthesized by large, multidomain nonribosomal peptide synthetase (NRPS) enzymes (14). The core domains of a NRPS are adenylation (A), thiolation (T), and condensation (C). The catalytic cycle begins with the activation of a select carboxylic acid by the A domain; this substrate, most often an l-amino acid, can be predicted by sequence analysis (15). After activation, the acid adenylate is transferred to the phosphopantetheine (Ppant) tether of the T domain. Each peptide bond is formed by a C domain, which condenses 2 T-bound substrates. The growing peptide chain travels down the NRPS “assembly line,” and the final peptide is released at the C terminus of the NRPS, commonly by a thioesterase (Te) domain. Peptide bond-forming C domains are canonical members of the condensation domain superfamily, which also contains other NRPS domains that catalyze diverse reactions (16, 17). The most well studied is the epimerization (E) domain, which catalyzes the conversion of an l-amino acid to a d-amino acid. Other tailoring domains found in NRPS enzymes are responsible for formylation, halogenation, methylation, oxidation, and reduction (14). These modifications all add to the structural diversity of nonribosomal peptides.
Several families of enzymes are responsible for the β-hydroxy amino acids found in nonribosomal peptides, namely Fe-heme monooxygenases (18), diiron monooxygenases (19), and nonheme Fe(II)/α-ketoglutarate–dependent (FeNH/αKG) dioxygenases (20). Only this last class, the FeNH/αKG dioxygenases, has been found to hydroxylate aspartic acid (20–22). OrbG, encoded in the biosynthetic gene cluster of the siderophore ornibactin (SI Appendix, Fig. S1), was the first enzyme predicted to be an aspartyl β-hydroxylase based on homology to FeNH/αKG dioxygenases (23). All β-OHAsp–containing siderophores feature at least one homolog of orbG in their biosynthetic gene cluster encoding either a discrete enzyme, or a tailoring domain fused to the C terminus of a NRPS (e.g., serobactin; Fig. 1) (2). No siderophore β-hydroxylases of aspartate or histidine have been characterized, although they have high sequence similarity (47 to 76%) to SyrP, the only NRPS-associated aspartyl β-hydroxylase characterized in vitro (20). SyrP is involved in the biosynthesis of syringomycin (SI Appendix, Fig. S2), a phytotoxin produced by Pseudomonas syringae (20). Originally annotated as a regulatory protein, SyrP was found to hydroxylate not free Asp, but Asp tethered to a thiolation domain of the syringomycin NRPS SyrE (20).
Fig. 1. Proposed nonribosomal biosynthesis of serobactin A. β-Hydroxylases (βH), which may be stand-alone enzymes (e.g., SbtH) or fused NRPS tailoring domains (e.g., orange domain on SbtI1), hydroxylate ʟ-Asp residues while they are tethered to the thiolation domain of the NRPS. Some β-hydroxylases are associated with an interface (I) domain (e.g., green domain on SbtI2), newly identified and described in the text. Adenylation (A) domains are labeled by the substrate they activate and incorporate. C, condensation domain; E, epimerization domain; T, thiolation domain; Te, thioesterase domain. The structures of the serobactins were reported in ref. 30.
Based on homology to SyrP and OrbG, the existence of a number of β-OHAsp residues in siderophores has been rationalized (7, 11, 24–28) or predicted (29–33). Recently, Kurth et al. (33) used cucF, a β-hydroxylase domain from the cupriachelin (Fig. 2) gene cluster, as a handle to scan genomes for photoactive Fe(III)-siderophores, leading to the discovery of variochelin (SI Appendix, Fig. S1). Genome mining for β-OHAsp–containing siderophores has been quite successful, but current techniques can leave ambiguity in the predicted structure. Although the taiwachelin (Fig. 2) NRPS was correctly predicted to load 2 Asp residues, predicting that only one would be hydroxylated by TaiD, the putative Asp β-hydroxylase, was not possible at the time (29). Some pyoverdines also retain an unmodified Asp (e.g., pyoverdine G4R; Fig. 2) (7, 27). In contrast, both Asp residues are hydroxylated in cupriachelin, serobactin, pacifibactin, and alterobactin (Figs. 1 and 2 and SI Appendix, Fig. S1) (25, 30, 34).
Fig. 2. Representative siderophores with β-OHAsp residues. Some Asp residues are not hydroxylated in β-OHAsp–containing siderophores. Variable β-OHAsp stereochemistry has led to further ambiguity in genomic predictions. The l-erythro β-OHAsp diastereomer in cupriachelin is reported herein.
Further complicating structural predictions, β-OHAsp has 2 stereocenters (i.e., at the α- and β-carbons); thus, β-OHAsp potentially exists as any of 4 diastereomers. All stereochemically characterized siderophores were reported to contain either d-threo (2R, 3R) or l-threo (2S, 3S) β-OHAsp until 2018, when the l-erythro (2S, 3R) isomer was reported in the amphiphilic siderophore imaqobactin (Fig. 2) (35). The l- or d- configuration of the α-carbon can easily be predicted by the absence or presence of an E domain; however, no methods exist for predicting the stereochemistry at the β-carbon.
To develop refined genomic tools to predict the reactivity and stereoselectivity of β-hydroxylation in siderophores, we analyzed gene clusters responsible for the biosynthesis of structurally characterized siderophores containing β-OHAsp or β-OHHis. Functional subtypes of β-hydroxylases emerged, which we corroborate with phylogenetic analysis. These subtypes show clear patterns in genomic organization (stand-alone enzymes or integrated NRPS tailoring domains), amino acid substrate (Asp or His), and reactive NRPS partner. Significantly, the subtypes also exhibit divergent diastereoselectivity, enabling the prediction of the configuration at the C3 stereocenter. Stereochemical characterization of β-OHAsp residues in pyoverdine GB-1, delftibactin, histicorrugatin, and cupriachelin, reported herein, confirms the predictive power of subtype delineation. We also identify a previously undescribed member of the NRPS condensation domain superfamily, which is associated with certain β-hydroxylase subtypes. We name this class the interface (I) domain for a proposed role in positioning the β-hydroxylase and the NRPS-bound amino acid substrate prior to hydroxylation.
Results and Interpretation
Compilation and Organization of β-Hydroxylase Genes from Known Siderophores.
A comprehensive literature search revealed more than 35 β-OHAsp– and β-OHHis–containing siderophores with reported structures (SI Appendix, Tables S1 and S2), of which 26 were isolated from strains with published genomes (SI Appendix, Table S1). The associated biosynthetic gene clusters were extracted and annotated, resulting in a final dataset of 30 β-hydroxylases (Table 1). Two families of aspartyl β-hydroxylases emerged with distinct genetic organizations: 1) integrated tailoring domains fused to the core NRPS machinery, and 2) stand-alone enzymes encoded by discrete genes within the biosynthetic cluster (Table 1). The gene clusters for cupriachelin, serobactin, and pacifibactin each encode 2 putative aspartyl β-hydroxylases (e.g., SbtH and SbtI1 in serobactin; Fig. 1). All 3 hydroxylases putatively responsible for β-OHHis synthesis are found in the biosynthetic gene clusters as stand-alone genes. By coordinating the genomic organization of the β-hydroxylase with both the NRPS architecture and the final peptidic siderophore structure, contrasting reactivity was found, as summarized in Table 1 and elaborated below.
Table 1. Putative amino acid β-hydroxylases found in siderophore biosynthesis clusters
Reactivity by the Aspartic Acid β-Hydroxylase NRPS Domain.
The aspartyl β-hydroxylase domains are found at the C terminus of a NRPS, directly following a C domain (Fig. 3). These tailoring domains hydroxylate Asp residues loaded by NRPS modules with the domain architecture C*-A-T. While this C* domain is homologous to other C domains, it lacks the typical condensation domain catalytic motif, HHxxxDG (17). The β-hydroxylase–associated C* domains form a distinct phylogenetic clade within the condensation domain superfamily (see below); we name this previously unreported subtype the “interface” (I) domain for a putative role in positioning the β-hydroxylase and NRPS-bound substrate for hydroxylation, as discussed below. Accordingly, we name the integrated aspartyl β-hydroxylase domain the “interface-associated” Asp β-hydroxylase (IβHAsp). Taiwachelin has a single IβHAsp domain in TaiD and 2 modules that load Asp; hydroxylation is only found to occur in the TaiE module containing an I domain (Fig. 3) (29).
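The motif check described above can be sketched in a few lines. This is a minimal illustration, not the authors' annotation pipeline: the function name and the two toy sequence fragments are invented for the example, and a real analysis would work from aligned domain sequences rather than a bare pattern search.

```python
import re

# HHxxxDG: two His, any three residues, then Asp-Gly (canonical C domain motif).
CANONICAL_MOTIF = re.compile(r"HH.{3}DG")


def has_condensation_motif(domain_seq: str) -> bool:
    """Return True if the sequence contains the canonical HHxxxDG motif."""
    return CANONICAL_MOTIF.search(domain_seq.upper()) is not None


# Toy fragments invented for illustration only (not real NRPS sequences).
canonical_like = "AVLFHHIISDGWS"   # contains HHIISDG
interface_like = "AVLFSQALADAWS"   # His pair lost, only the Asp remains

for label, seq in [("canonical-like", canonical_like),
                   ("interface-like", interface_like)]:
    print(label, has_condensation_motif(seq))
```

A domain that is homologous to C domains but fails this check would be a candidate I domain under the definition in the text; phylogenetic placement would still be needed to confirm the assignment.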
Fig. 3. Representative reactivity of amino acid β-hydroxylases in siderophore biosynthesis. Aspartyl β-hydroxylases may be interface-associated (IβHAsp) and interact only with Asp-loading modules with the architecture I-A-T, or TE-associated (TβHAsp) and interact only with modules containing a GGDSI motif in the thiolation domain. Siderophore histidyl β-hydroxylases are also interface-associated (IβHHis). The proposed reactivity partners explain the unmodified Asp residues in taiwachelin and pyoverdine GB-1. Genomic data were retrieved from RefSeq (SI Appendix, Table S1). For NRPS domain abbreviations, see the legend of Fig. 1.
Reactivity by the Stand-alone Amino Acid β-Hydroxylase Enzyme.
The second class of aspartyl β-hydroxylases, encoded by stand-alone genes within the biosynthetic cluster, generally acts on Asp loaded by C-A-T-E modules, which leads to d-β-OHAsp in the peptidic siderophore. A sequence-level analysis of these targeted C-A-T-E modules shows a strictly conserved GGDSI (TE) motif in the thiolation domain in place of the more common GGHSL (TC) motif. The TE motif is indicative of a T domain followed by an E domain (36). We therefore name this class of stand-alone enzymes the “TE-associated” aspartyl β-hydroxylase (TβHAsp) family. Several of the stand-alone TβHAsp enzymes hydroxylate Asp attached to modules that lack the E domain, thereby forming l-β-OHAsp; however, the TE domain is still present (Table 1). The pyoverdine GB-1 cluster encodes 1 TβHAsp enzyme (PputGB1_4087) and 2 Asp residues, loaded by C-A-TC and C-A-TE-E modules, of which only the latter is hydroxylated (Fig. 3) (27). On the other hand, the pyoverdine 1448a gene cluster encodes a single TβHAsp enzyme (Pspph_RS09710), which hydroxylates Asp bound to 2 different C-A-TE-E modules (Fig. 3) (24).
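The sequence-level distinction between the two thiolation domain motifs can be illustrated with a simple substring test. This is a sketch under stated assumptions: the function name and the example fragments are hypothetical, and exact matching ignores any motif variants that real T domains may carry.

```python
def classify_t_domain(t_domain_seq: str) -> str:
    """Label a thiolation domain by its core motif.

    "TE" -> GGDSI motif (T domains that the text pairs with TβHAsp enzymes)
    "TC" -> GGHSL motif (the more common T domain motif)
    """
    seq = t_domain_seq.upper()
    if "GGDSI" in seq:
        return "TE"
    if "GGHSL" in seq:
        return "TC"
    return "unclassified"


# Toy fragments invented for illustration only (not real NRPS sequences).
print(classify_t_domain("DDFFELGGDSIKAMQL"))
print(classify_t_domain("DNFFELGGHSLLATRL"))
```

Exact matching is the simplest choice here; a profile-based search would tolerate sequence variation but is beyond what this sketch attempts.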
Histidyl β-hydroxylases form a third functional subtype of amino acid β-hydroxylases. Only 3 sequenced examples were found in reported siderophores, all from Pseudomonas species (Table 1). These His β-hydroxylases are stand-alone enzymes, and they appear alongside an interface domain; therefore, they are herein named IβHHis (interface-associated histidyl β-hydroxylase) enzymes. Histicorrugatin has 2 β-OHHis residues, both loaded by NRPS I-A-T modules, and the gene cluster only encodes a single IβHHis enzyme (HcsE; Fig. 3) (11).
All 3 siderophore amino acid β-hydroxylase subtypes—IβHAsp, TβHAsp, and IβHHis—functionally contrast a fourth subtype found in nonsiderophore peptides, which includes the aspartyl β-hydroxylase SyrP of syringomycin biosynthesis (SI Appendix, Fig. S2 and Table S3). Two other SyrP-like β-hydroxylase (SβHAsp) enzymes are encoded by orthologous phytotoxin biosynthetic gene clusters, i.e., ThaF and NupP of thanamycin and nunamycin biosyntheses, respectively (SI Appendix, Fig. S2 and Table S3) (37, 38). SyrP, ThaF, and NupP are stand-alone enzymes that hydroxylate Asp loaded by C-A-TC modules. They require neither an I domain nor the TE motif, and a driver of residue selectivity could not be determined.
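The residue-selectivity rules proposed for the two aspartyl subtypes can be written as a small predicate over NRPS module architectures. The sketch below is illustrative only: the class and function names are invented, and the architecture of the unhydroxylated taiwachelin module is an assumption (the text specifies only that the hydroxylated module contains an I domain). The SβHAsp subtype is deliberately not covered, because no driver of selectivity could be determined for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspModule:
    """An Asp-loading NRPS module, described by its ordered domain labels."""
    name: str
    architecture: tuple[str, ...]  # e.g. ("C", "A", "TE", "E")


def predicted_hydroxylation(module: AspModule, subtypes: set[str]) -> bool:
    """Apply the two proposed drivers of residue selectivity.

    IβHAsp acts on modules containing an interface (I) domain;
    TβHAsp acts on modules whose T domain carries the TE (GGDSI) motif.
    """
    via_interface = "IβHAsp" in subtypes and "I" in module.architecture
    via_te_motif = "TβHAsp" in subtypes and "TE" in module.architecture
    return via_interface or via_te_motif


# Pyoverdine GB-1: one TβHAsp enzyme, Asp loaded by C-A-TC and C-A-TE-E modules.
gb1 = [AspModule("Asp-1", ("C", "A", "TC")),
       AspModule("Asp-2", ("C", "A", "TE", "E"))]
print([predicted_hydroxylation(m, {"TβHAsp"}) for m in gb1])

# Taiwachelin: one IβHAsp domain, two Asp modules. The C-A-TC architecture
# of the unmodified module is an assumption made for this example.
tai = [AspModule("Asp-I", ("I", "A", "T")),
       AspModule("Asp-C", ("C", "A", "TC"))]
print([predicted_hydroxylation(m, {"IβHAsp"}) for m in tai])
```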
Phylogeny and Sequence Analyses of Asp and His β-Hydroxylases.
To corroborate the existence of distinct IβHAsp, TβHAsp, and IβHHis amino acid β-hydroxylase functional subtypes, we reconstructed an unrooted phylogenetic tree of 30 known β-hydroxylase protein sequences from the 26 siderophore clusters (Fig. 4). Three members of the SyrP-like family (SβHAsp) were also included (SI Appendix, Table S3). All 4 subtypes formed distinct clades with the exception of IβHAsp, which is paraphyletic with respect to SβHAsp (Fig. 4).
Fig. 4. Maximum-likelihood phylogenetic tree of 33 β-hydroxylases inferred from aligned amino acid sequences. The NRPS module that loads the residue to be hydroxylated is indicated next to each subtype. The 2 highlighted clades within the TβHAsp subtype do not act on E domain-containing modules. The phylogenetic tree was reconstructed in IQ-TREE (74) using the LG+F+R5 model, as chosen by ModelFinder (Akaike information criterion) (75). Branch support was assessed by bootstrapping (100 bootstrap replicates); support values are only shown for intersubtype branches. The scale bar indicates the average number of substitutions per site.
All of the siderophore amino acid β-hydroxylases belong to the TauD/TfdA family of nonheme Fe(II)/α-ketoglutarate–dependent dioxygenases (PFAM family: PF02668) (39). Crystallographic studies of taurine dioxygenase TauD, 2,4-dichlorophenoxyacetate monooxygenase TfdA, and related enzymes (40–44) have elucidated a 2-His-1-carboxylate facial triad Fe(II) binding motif (45, 46). The α-ketoglutarate cofactor, which coordinates Fe(II), is bound by strictly conserved Thr and Arg residues (47). A multiple sequence alignment of the 33 collected β-hydroxylases across the 4 subtypes (IβHAsp, TβHAsp, IβHHis, and SβHAsp) reveals 31 residues (∼11% of the total domain length) that are absolutely conserved, including the expected combined Fe(II)/α-ketoglutarate binding motif of H-N-E-X24-T-X164–170-H-X8-R (SI Appendix, Fig. S3).
Phylogeny and Sequence Analyses of Interface Domains.
Interface domain amino acid sequences were collected from siderophore biosynthetic clusters, including 2 domains each from histicorrugatin (Fig. 3) and alterobactin (SI Appendix, Fig. S1), resulting in a dataset of 10 IβHAsp-associated and 4 IβHHis-associated interface domains (Fig. 5 and SI Appendix, Table S4). All 14 of the I domains belong to the PFAM condensation domain family PF00668 (39). Rausch et al. (16) previously delineated NRPS condensation domain subtypes and reconstructed their phylogenetic relationship. We aligned our extracted I domain sequences to their dataset and reconstructed a maximum-likelihood phylogenetic tree (Fig. 5 and Dataset S1). The I domains form a well-supported clade, justifying the assignment of a separate subtype. The multiple sequence alignment shows strikingly few conserved residues; in fact, only 9 residues are strictly conserved in the I domain family (SI Appendix, Fig. S4), less than 3% of the total domain length, and none are specific to the I domain. The canonical HHxxxDG active site is absent from I domains; only the Asp remains (Fig. 5). Crystallographic studies of the condensation domains of VibH and EntF (NRPSs of vibriobactin and enterobactin, respectively) have revealed that this Asp residue is not catalytic, but rather structural, forming a salt bridge with an Arg residue (SI Appendix, Fig. S4: residue 228), and the pair is strictly conserved throughout the condensation domain superfamily (48, 49).
Fig. 5. Maximum-likelihood phylogenetic tree of the condensation domain superfamily and sequence logos of canonical condensation active sites. With the exception of the interface domains, amino acid sequences were originally collected and aligned by Rausch et al. (16). The full Newick-formatted tree with sequence names is available in Dataset S1. The phylogenetic tree was reconstructed in PhyML v3.0 (76) through https://www.phylogeny.fr/ (77). The LG+I+F+G4 model was chosen by ModelFinder (Akaike information criterion) (75). Branch support was assessed by bootstrapping (100 bootstrap replicates). The scale bar indicates the average number of substitutions per site. Inset box: the traditional HHxxxDG condensation domain active site is poorly conserved in interface (I) domains. The epimerization (E) domain retains the motif, while the X domain has a conserved but modified HRxxxDD motif. Sequence logos were made with WebLogo (73), trimming any position with >50% gaps.
Stereochemistry of β-OHAsp Residues in Siderophores.
Many of the β-OHAsp–containing siderophores in Table 1 have been stereochemically characterized. The 2S or 2R C2 stereochemistry (i.e., l- or d-) of each β-OHAsp residue is consistent with the absence or presence of an epimerization domain in the NRPS (50), while the C3 stereochemistry is set during hydroxylation (Fig. 6A). The products of 7 of 9 IβHAsp domains (with the exception of pyoverdine Pf0-1 and variobactin) have reported β-OHAsp stereochemistry, all 3S (29, 30, 33, 34, 51–53). The target NRPS module of each IβHAsp domain lacks an E domain, thereby producing the l-threo (2S, 3S) configuration of β-OHAsp (Table 1 and Fig. 6A). The stereochemistry of 10 of 18 siderophores with TβHAsp-mediated hydroxylation has been reported in the literature (3, 25, 26, 30, 32, 34, 54–57). In contrast to the IβHAsp clade, most of the TβHAsp enzymes were reported to hydroxylate Asp to form the 3R product. The TβHAsp target modules generally contain an E domain, resulting in the d-threo (2R, 3R) configuration of β-OHAsp (with exceptions below).
Fig. 6. Divergent stereochemical products of aspartyl β-hydroxylases. (A) NRPS-bound ʟ-Asp may be β-hydroxylated to give one of 2 diastereomers. ʟ-Threo (2S, 3S) β-OHAsp is produced by the IβHAsp and SβHAsp subtypes. ʟ-Erythro (2S, 3R) β-OHAsp is produced by the TβHAsp subtype and is converted to d-threo (2R, 3R) β-OHAsp if an epimerization domain is present. d-Erythro β-OHAsp has not been observed in any stereochemically characterized siderophore. Condensation to the upstream amino acid, which likely precedes epimerization (78), has been omitted for clarity. (B) Stereochemically characterized siderophores produced by members of the TβHAsp clade. The boxed subclades lack an epimerization domain, and the hydroxylation product remains as ʟ-erythro β-OHAsp. Stereochemistries in bold are reported herein. The phylogenetic tree was produced by midpoint rooting the phylogenetic tree from Fig. 4.
To clarify the reactivity of the TβHAsp family and test the subtype-derived model of β-OHAsp stereochemistry, we selected 3 representative siderophores from the β-hydroxylase phylogenetic tree in Fig. 4 and stereochemically characterized the β-OHAsp residues (Fig. 6B). The biosynthetic gene cluster of pyoverdine GB-1 encodes a TβHAsp enzyme, PputGB1_4087, that is paired with an E domain (Fig. 3). While β-OHAsp was previously confirmed in pyoverdine GB-1, its stereochemistry had not been reported (27). Stereochemical characterization of the pyoverdine from Pseudomonas putida GB-1 with Marfey’s reagent (1-fluoro-2,4-dinitrophenyl-5-l-alanine amide [FDAA]) (58) confirmed the sole presence of d-threo β-OHAsp (SI Appendix, Figs. S5–S7).
The hydroxylases from delftibactin, acidobactin, and vacidobactin biosyntheses (DelD, Aave_3734, and Vapar_3747, respectively) form a well-supported clade nested within the TβHAsp family (Figs. 4 and 6B and SI Appendix, Fig. S1). None of the hydroxylated products had been stereochemically characterized; however, each of these enzymes acts on modules with no E domain, suggesting that β-OHAsp must remain in the l-erythro configuration in these 3 siderophores. To test this prediction, the β-OHAsp residue in delftibactin from Delftia acidovorans DSM 39 was investigated and found to be l-erythro β-OHAsp (SI Appendix, Figs. S8–S10). Histicorrugatin (Fig. 3) and cupriachelin (Fig. 2) TβHAsp enzymes HcsC and CucE likewise form a clade with no E domain (Figs. 4 and 6B). Purified histicorrugatin from Pseudomonas thivervalensis DSM 13194 was also observed to contain only l-erythro β-OHAsp (SI Appendix, Figs. S11–S13). This result led us to reexamine the β-OHAsp stereochemistry in cupriachelin (25). The Marfey’s derivatized hydrolysate of purified cupriachelin from Cupriavidus necator H16 contained 2 β-OHAsp diastereomers, confirmed as l-erythro and l-threo (SI Appendix, Figs. S14–S16), in contrast to the previous report of solely ʟ-threo β-OHAsp (25). Hydrolysis by DCl (59) confirmed that neither diastereomer was the result of epimerization during hydrolysis (SI Appendix, Fig. S16). Thus, all TβHAsp enzymes selectively produce the 3R diastereomers l-erythro and d-threo β-OHAsp (Fig. 6B).
SyrP of syringomycin (SI Appendix, Fig. S2) biosynthesis produces only the l-threo isomer in vitro, in accordance with the biosynthetic gene cluster, which lacks an E domain, and the structure of syringomycin (20); thus the entire SβHAsp/IβHAsp clade likely produces l-threo β-OHAsp isomers (Fig. 6A). The β-OHHis residue of pyoverdine pf244 (SI Appendix, Fig. S1) was reported to be in the l-threo configuration (5), and we therefore predict that the IβHHis subtype is also l-threo selective. Corrugatin and ornicorrugatin (SI Appendix, Fig. S1), both reported in unsequenced pseudomonads, were likewise determined to contain l-threo β-OHHis, and no other β-OHHis isomer has been reported in siderophores to date (9, 10).
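The stereochemical model for β-OHAsp summarized in this section reduces to two inputs: the β-hydroxylase subtype, which sets C3, and the presence of an epimerization domain, which sets C2. The following sketch encodes that mapping; the function and dictionary names are hypothetical, and the threo/erythro naming rule used here applies to β-OHAsp as described in the text, not to other β-hydroxy amino acids. The IβHHis subtype is omitted because it acts on His.

```python
# C3 configuration set by each aspartyl β-hydroxylase subtype, per the text:
# IβHAsp and SβHAsp give 3S, TβHAsp gives 3R.
C3_BY_SUBTYPE = {"IβHAsp": "S", "SβHAsp": "S", "TβHAsp": "R"}


def predict_beta_ohasp(subtype: str, has_e_domain: bool) -> str:
    """Predict the β-OHAsp diastereomer from subtype and E domain presence."""
    if subtype not in C3_BY_SUBTYPE:
        raise ValueError(f"no aspartyl stereochemistry rule for {subtype!r}")
    c3 = C3_BY_SUBTYPE[subtype]
    # An E domain converts the l-amino acid (2S) to the d-amino acid (2R).
    c2 = "R" if has_e_domain else "S"
    series = "d" if has_e_domain else "l"
    # For β-OHAsp, matching C2/C3 descriptors are threo, mismatched are erythro.
    relative = "threo" if c2 == c3 else "erythro"
    return f"{series}-{relative} (2{c2}, 3{c3})"


print(predict_beta_ohasp("TβHAsp", has_e_domain=True))   # e.g. pyoverdine GB-1
print(predict_beta_ohasp("TβHAsp", has_e_domain=False))  # e.g. delftibactin
print(predict_beta_ohasp("IβHAsp", has_e_domain=False))  # e.g. alterobactin
```

The combination of an IβHAsp domain with an E domain returns d-erythro (2R, 3S), an isomer that has not been observed in any stereochemically characterized siderophore, so that output should be read as a formal consequence of the rule rather than a validated prediction.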
Discussion
In sum, the chemistry of siderophore aspartyl β-hydroxylation follows 2 divergent routes (Figs. 3 and 6 and Table 1). Asp β-hydroxylase domains fused to the C terminus of a NRPS enzyme selectively hydroxylate Asp bound to a NRPS module in the presence of an interface (I) domain, a member of the condensation domain superfamily newly described herein (Fig. 5). The interface-associated aspartyl β-hydroxylase (IβHAsp) domains exclusively produce the l-threo (2S, 3S) isomer of β-OHAsp (Table 1 and Fig. 6). In the second pathway, Asp β-hydroxylases encoded by discrete genes in the biosynthetic cluster selectively act on Asp bound to thiolation domains with the TE sequence motif (GGDSI), and are consequently named the TE-associated aspartyl β-hydroxylase (TβHAsp) enzymes. All of these stand-alone enzymes produce l-erythro (2S, 3R) or d-threo (2R, 3R) β-OHAsp, requiring opposite stereospecificity of β-hydroxylation relative to IβHAsp domains (Table 1 and Fig. 6). Remarkably, the 2 drivers of residue specificity we propose here (i.e., the interface domain and the TE-type thiolation domain; Fig. 3) parallel selectivity seen in the cytochrome P450 family of enzymes involved in glycopeptide antibiotic and skyllamycin biosyntheses (60). These similarities, as well as the stereochemistry of β-hydroxylation, bear further consideration.
Parallels to Glycopeptide Antibiotic Biosynthesis.
Glycopeptide antibiotic (GPA) biosynthesis provides an example of a specialized NRPS domain responsible for positioning a stand-alone hydroxylase. After NRPS-based assembly, the GPA peptide precursor, still attached to the NRPS, is extensively cross-linked by several cytochrome P450 (Oxy) enzymes (61–63). The final module of the NRPS Tcp12 contains a member of the C domain superfamily called the X domain, which is missing the HHxxxDG catalytic motif (Fig. 5) (16). Haslinger et al. (64) found through X-ray crystallography that the X domain is required to recruit the Oxy enzymes to the NRPS-bound peptide through protein–protein interaction. We propose the I domain function resembles that of the X domain, although this similarity is likely superficial, as the 2 subfamilies are distantly related phylogenetically (Fig. 5) and interact with entirely different protein families (i.e., nonheme Fe(II) dioxygenases and Fe-heme monooxygenases, respectively). Additionally, the X domain has a conserved, albeit modified, HRxxxD[DE] motif that blocks the traditional condensation active site (64), in contrast to the poorly conserved active site of the I domain (Fig. 5).
Parallels to Skyllamycin Biosynthesis.
The TE motif GGDSI is generally found when an epimerization (E) domain follows a T domain, and contrasts the GGHSL (TC) core motif usually found in T domains (36). Mutational studies show that the TE motif is required for proper interaction between the T and E domains (36), suggesting that TβHAsp enzymes may also require the GGDSI motif to properly interact with the T-bound amino acid. Cytochrome P450sky of skyllamycin (SI Appendix, Fig. S2) biosynthesis similarly selects residues for β-hydroxylation by interacting only with specific T domains, as determined by X-ray crystallography (65). In contrast to the TE domain, even noninteracting T domains within the skyllamycin pathway contain the key residues for P450sky interaction, and selectivity is believed to arise from minor changes in T domain tertiary structure that disrupt the NRPS/P450sky interface (60, 65). TβHAsp enzymes may similarly select modules for binding based on subtle structural changes, undetectable in the primary sequence of the NRPS. Regardless, the TE motif serves as a useful predictor of selectivity in the TβHAsp family.
Stereochemistry.
β-Hydroxylation in siderophore biosynthesis is expected to be under strict stereochemical control. Many siderophore β-OHAsp residues have been stereochemically characterized (Table 1); all were reported as the l-threo (2S, 3S) or d-threo (2R, 3R) isomers until the recent report of l-erythro (2S, 3R) β-OHAsp in imaqobactin (Fig. 2), produced by an unsequenced isolate (35). Both l-erythro and d-threo β-OHAsp share the 3R configuration; therefore, d-threo could arise from the epimerization of l-erythro (2S, 3R) β-OHAsp (Fig. 6). The stereochemical determination of β-OHAsp in histicorrugatin (Fig. 3) and delftibactin (SI Appendix, Fig. S1), as well as the stereochemical reassignment of one β-OHAsp residue in cupriachelin (Fig. 2), supports this mechanism. Each of these siderophore biosynthetic gene clusters encodes a TβHAsp enzyme that acts on a C-A-TE module lacking an E domain, and each siderophore contains the l-erythro isomer. These results do not necessarily preclude the scenario where epimerization precedes hydroxylation; however, they do show that epimerization is not required for TβHAsp-mediated β-hydroxylation, and that the α-carbon configuration does not control the stereochemistry of β-hydroxylation. Stereochemical control can instead be attributed to—and be predicted by—the functional subtypes identified herein (Fig. 6).
The divergent stereochemistry of the Asp β-hydroxylases is consistent with the reactivity of other Fe(II)/α-ketoglutarate–dependent β-hydroxylases. Kutzneride (SI Appendix, Fig. S2) biosynthesis involves 2 Fe(II)/αKG glutamyl β-hydroxylases, KtzO and KtzP, both of which have been characterized in vitro (66). KtzO stereospecifically reacts with NRPS-bound l-Glu to produce l-threo (2S, 3R) β-OHGlu, which is then epimerized to the d-erythro (2R, 3R) isomer (66); KtzP has the opposite selectivity, first producing the l-erythro (2S, 3S) isomer before epimerization to d-threo (2R, 3S) β-OHGlu (66). The FeNH/αKG-dependent amino acid β-hydroxylation mechanism involves β-hydrogen abstraction by a reactive Fe(IV)-oxo species, followed by rebound hydroxylation to form the alcohol (2, 67). To achieve such stereospecificity, the amino acid substrate must be oriented in the active site with only one β-hydrogen (prothreo or proerythro) oriented toward the active site. A comparison of the product-bound crystal structures of AsnO, a threo-selective l-Asn β-hydroxylase, and VioC, an erythro-selective l-Arg β-hydroxylase, revealed that the 2 homologs hold their substrates in different rotational conformations: In AsnO, the side chain of l-Asn is held trans, pointed toward the center of the enzyme, while VioC holds l-Arg in a more strained gauche(–) conformation, pointing the side chain toward the enzyme surface (43, 44). Similarly, we suspect that the contrasting β-OHAsp stereochemistry is caused by a difference in l-Asp positioning.
The reactivity of an undescribed siderophore aspartyl β-hydroxylase can now be predicted based solely on whether the β-hydroxylase is fused to the NRPS machinery (IβHAsp) or a free-standing enzyme (TβHAsp). The 2 drivers of residue specificity we propose here (i.e., the interface domain and the TE-type thiolation domain) then allow for quick prediction of hydroxylation sites. A more rigorous subtype determination can be made using profile hidden Markov models (pHMMs), probabilistic representations of the amino acid sequences. The pHMMs for each subtype (Dataset S2) may be incorporated into domain and structure prediction workflows with HMMER3 (68). A sequence that poorly matches each of the 4 pHMMs (bitscore < 400) may belong to a new functional subtype distinct from the subtypes described herein. For example, glutamyl β-hydroxylases KtzO and KtzP of kutzneride (SI Appendix, Fig. S2) biosynthesis (66) best match the IβHHis pHMM with bitscores of 340 and 224, respectively, well below the cutoff.
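The bitscore rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the workflow used in this study: the function name `assign_subtype` and the dictionary input are hypothetical, and the scores are assumed to have been obtained beforehand by searching a sequence against the subtype pHMMs with HMMER3. The only values taken from the text are the cutoff of 400 and the reported KtzO and KtzP bitscores.

```python
from typing import Dict, Tuple

# Cutoff quoted in the text: a best bitscore below 400 suggests a new subtype.
BITSCORE_CUTOFF = 400.0


def assign_subtype(
    bitscores: Dict[str, float], cutoff: float = BITSCORE_CUTOFF
) -> Tuple[str, float, bool]:
    """Return (best-matching pHMM, its bitscore, whether it meets the cutoff).

    `bitscores` maps a pHMM name to the bitscore of one query sequence
    against that pHMM (hypothetical input format).
    """
    if not bitscores:
        raise ValueError("no pHMM scores supplied")
    best = max(bitscores, key=bitscores.get)
    score = bitscores[best]
    return best, score, score >= cutoff


# Reported best matches for the kutzneride glutamyl beta-hydroxylases.
# Scores against the other pHMMs are not given in the text and are omitted.
ktzo = assign_subtype({"IβHHis": 340.0})
ktzp = assign_subtype({"IβHHis": 224.0})
```

Under this sketch, both KtzO and KtzP return `False` for the cutoff check, which corresponds to flagging them as candidates for a subtype outside the ones described here.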
Conclusion
Genome mining is streamlining the discovery of new specialized metabolites but can leave ambiguities in the predicted structures that must be rectified experimentally. Comparing siderophore biosynthetic gene clusters to verified structures reveals the origin of the β-OHAsp diastereomers in siderophores, providing both predictive tools and avenues for future research. We have identified 3 functional subtypes of β-hydroxylases involved in siderophore biosynthesis (i.e., IβHAsp, TβHAsp, and IβHHis), placing their reactivity in contrast to SyrP (20) and SyrP-like enzymes (SβHAsp). These newly delineated subtypes, validated by phylogenetic reconstruction (Fig. 4), show clear patterns in genomic organization (stand-alone enzymes or integrated NRPS tailoring domains), amino acid substrate (Asp or His), reactive NRPS partner (IAT, CATE, or CATC), and stereochemistry (l-threo, l-erythro, or d-threo) (Table 1).
A hallmark of NRPS-based biosynthesis is the presence of d-amino acids, which are coincident with epimerization domains. With 2 stereocenters, β-OHAsp can exist as any of 4 diastereomers. Through mapping stereochemically characterized β-OHAsp residues in siderophores to the phylogenetic tree of β-hydroxylases, we have developed a method to predict β-OHAsp stereochemistry in silico. We further tested and confirmed our predictive methods with the stereochemical determination of β-OHAsp residues in pyoverdine GB-1, delftibactin, histicorrugatin, and cupriachelin (Fig. 6B). While IβHAsp domains consistently produce l-threo (2S, 3S) β-OHAsp, the TβHAsp subtype produces the l-erythro (2S, 3R) isomer, which is often epimerized to d-threo (2R, 3R) β-OHAsp (Fig. 6A). The d-erythro (2R, 3S) stereoisomer has not been identified in any siderophore but would be consistent with an IβHAsp domain paired with epimerization by an I-A-T-E NRPS architecture. The contrasting pro-R versus pro-S β-hydroxylation that we identified may have arisen from a reconfiguration of the active site, holding the aspartyl substrate in a different orientation. Future work will elucidate the structural basis of Asp β-hydroxylase stereospecificity.
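The prediction rule summarized above amounts to a small lookup: the β-hydroxylase subtype sets the C3 configuration, and the presence of an epimerization (E) domain inverts C2. The Python sketch below encodes only what is stated in the text; the names `Prediction` and `predict_beta_ohasp` are hypothetical, and the subtype and E-domain assignment are assumed to have been made already from the gene cluster annotation.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    descriptor: str          # e.g. "l-threo"
    configuration: str       # (C2, C3) configuration
    observed: bool           # reported in a siderophore, per the text


# Subtype fixes C3 (IβHAsp -> 3S, TβHAsp -> 3R); an E domain inverts C2 (2S -> 2R).
_RULES = {
    ("IβHAsp", False): Prediction("l-threo", "2S, 3S", True),
    ("IβHAsp", True): Prediction("d-erythro", "2R, 3S", False),  # not yet observed
    ("TβHAsp", False): Prediction("l-erythro", "2S, 3R", True),
    ("TβHAsp", True): Prediction("d-threo", "2R, 3R", True),
}


def predict_beta_ohasp(subtype: str, has_e_domain: bool) -> Prediction:
    """Predict the β-OHAsp diastereomer from subtype and E-domain presence."""
    try:
        return _RULES[(subtype, has_e_domain)]
    except KeyError:
        raise ValueError(f"unknown aspartyl β-hydroxylase subtype: {subtype!r}")


# Stand-alone hydroxylase acting on a module without an E domain.
print(predict_beta_ohasp("TβHAsp", has_e_domain=False))
```

The `observed` flag keeps the d-erythro case distinct, since that isomer is a consequence of the rule rather than something reported in a siderophore.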
Materials and Methods
Collection of Known Siderophore Biosynthetic Gene Clusters.
Genomes of bacterial strains with reported β-OHAsp– and β-OHHis–containing siderophores were downloaded as assemblies from the National Center for Biotechnology Information RefSeq (SI Appendix, Table S1) (69). The collection covers siderophore structures published through June 2019. Siderophores produced by unsequenced strains (SI Appendix, Table S2) were excluded from this analysis, with the exception of alterobactins (SI Appendix, Fig. S1). Although the alterobactins were originally isolated from an unsequenced marine isolate of Pseudoalteromonas luteoviolacea (51, 70), a putative catechol/β-OHAsp gene cluster consistent with alterobactin is present in all sequenced P. luteoviolacea strains. NRPS domain organization was determined by comparing the amino acid sequences to a database of common NRPS domain HMMs using hmmscan [HMMER3 (68)] and confirmed by comparison to the linear structure of the siderophore.
Protein Sequence Manipulation.
In each genome of interest, putative β-hydroxylase domains were identified using hmmsearch [HMMER3 (68)] to find matches to the Pfam TauD family PF02668 (39). The amino acid sequences of these domains were excised from the proteins by trimming to the resulting hmmsearch envelope range. Sequences were then aligned using MUSCLE (71). Interface (I) domains were too poorly conserved to be trimmed by hmmsearch with the Pfam condensation domain family PF00668 (39). Instead, parent NRPS protein sequences were truncated to the 500 N-terminal amino acids and aligned using MUSCLE. Using the profile-profile alignment function from MUSCLE, these sequences were aligned to the condensation domain alignment created by Rausch et al. (16), and trimmed to length using SeaView (72). The resulting excised I domains were then realigned with MUSCLE. To visualize the multiple sequence alignments, sequence logos were created using WebLogo (73), trimming any position with >50% gaps. pHMMs were created for each of the β-hydroxylase subtypes. Representative protein sequences were collected and aligned with MUSCLE (71), and then HMMs were constructed from these multiple sequence alignments using the hmmbuild function from HMMER3 (68).
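Two of the steps above, excising a domain by its hmmsearch envelope and discarding alignment positions with more than 50% gaps, can be illustrated with a short Python sketch. It is not the code used in this work. It assumes that hmmsearch was run with its per-domain table output (`--domtblout`), in which the envelope start and end are the 20th and 21st whitespace-separated fields and are 1-based and inclusive; the function names and the example line are hypothetical.

```python
from typing import List, Tuple


def envelope_from_domtblout(line: str) -> Tuple[str, int, int]:
    """Extract (target name, env from, env to) from one domtblout row."""
    fields = line.split()
    # Fields 20 and 21 (1-based) hold the envelope coordinates.
    return fields[0], int(fields[19]), int(fields[20])


def excise(sequence: str, env_from: int, env_to: int) -> str:
    """Trim a protein to a 1-based, inclusive envelope range."""
    return sequence[env_from - 1:env_to]


def drop_gappy_columns(alignment: List[str], max_gap_fraction: float = 0.5) -> List[str]:
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction."""
    if not alignment:
        return []
    n_rows = len(alignment)
    keep = [
        col
        for col in range(len(alignment[0]))
        if sum(row[col] == "-" for row in alignment) / n_rows <= max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


# Hypothetical domtblout row for a TauD-family hit (all values illustrative).
row = ("protA - 350 TauD PF02668.16 268 1e-50 170.0 0.1 1 1 "
       "1e-53 1e-50 169.5 0.1 3 260 40 300 35 310 0.95 hypothetical protein")
name, start, end = envelope_from_domtblout(row)
```

A column with exactly 50% gaps is retained here, matching the ">50% gaps" wording; a stricter threshold can be set through `max_gap_fraction`.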
Phylogenetic Analyses.
Phylogenetic trees of β-hydroxylases and of the condensation domain superfamily were reconstructed. Multiple sequence alignments of protein sequences were prepared with MUSCLE (71) and trimmed with SeaView (72). The maximum-likelihood phylogenetic tree was reconstructed in IQ-TREE 1.6.7 (74), using the best-fit model of protein evolution for each alignment as chosen by ModelFinder (Akaike information criterion) (75). Branch support was assessed by bootstrapping (100 bootstrap replicates). Phylogenies were visualized with FigTree (http://tree.bio.ed.ac.uk/software/figtree/). To place the interface domain in the condensation domain superfamily, we recreated the unrooted phylogeny from Rausch et al., figure 4 (16). Using the profile-profile align function from MUSCLE, we aligned our interface domains to the 203 taxa used by Rausch et al. (16) and trimmed the interface domain sequences to the existing alignment before IQ-TREE reconstruction.
Strains and Growth Conditions.
Delftia acidovorans DSM 39 and Pseudomonas thivervalensis DSM 13194 were obtained from the German Collection of Microorganisms and Cell Cultures (Deutsche Sammlung von Mikroorganismen und Zellkulturen), Pseudomonas putida GB-1 was obtained from B. Tebo (Oregon Health and Science University, Portland, OR), and Cupriavidus necator H16 was obtained from S. Parsons (University of California, Santa Barbara, CA). Each was maintained on LB agar plates. For siderophore isolation, D. acidovorans was cultured in 1 L of acidovorax complex medium (consisting of 0.5 g⋅L−1 chelex-treated yeast extract, 1.0 g⋅L−1 chelex-treated casamino acids, 2.0 g⋅L−1 succinic acid, 2.0 g⋅L−1 l-glutamic acid, 0.3 g⋅L−1 KH2PO4, and 2.0 g⋅L−1 MOPS buffer, pH adjusted to 7.2) for 48 h at 30 °C, shaken at 160 rpm. P. thivervalensis was cultured in 1 L of casamino acids minimal medium (consisting of 5 g⋅L−1 chelex-treated casamino acids, 1.18 g⋅L−1 K2HPO4, and 0.25 g⋅L−1 MgSO4·7H2O) for 144 h at 30 °C, shaken at 160 rpm. P. putida and C. necator were each cultured in 1 L of casamino acids minimal medium for 67 h at 30 °C, shaken at 160 rpm.
Siderophore Isolation.
Each culture was pelleted by centrifugation (6,000 rpm, 30 min). The resultant supernatant was decanted into a clean 1-L flask, to which 100 g of water-washed Amberlite XAD-4 resin was added. The supernatant was shaken with the resin for 3 to 4 h at 4 °C, 150 rpm. The resin was then filtered from the supernatant and eluted with 75% methanol in ultrapure water (P. putida) or 90% methanol in ultrapure water (D. acidovorans, P. thivervalensis, and C. necator). The eluent was concentrated in vacuo and analyzed by ultraperformance liquid chromatography (UPLC)–electrospray ionization mass spectrometry for the presence of siderophore. To obtain pure siderophore, the concentrated eluent was separated by semipreparative reverse-phase high-performance liquid chromatography (RP-HPLC) (250 × 20-mm YMC C18-AQ column, 7 mL/min flow rate), employing a gradient of methanol in ultrapure water (+0.05% trifluoroacetic acid): 10 to 35% MeOH over 25 min (delftibactin), 10 to 40% MeOH over 30 min (histicorrugatin), 5 to 30% MeOH over 25 min (pyoverdine GB-1), or 40 to 80% MeOH over 40 min (cupriachelin).
Amino Acid Analysis.
For each siderophore, ∼1 mg was dissolved in 200 µL of ultrapure water. To the siderophore solution was added either 200 µL of 12 M HCl, 200 µL of 20% DCl in D2O, or 200 µL of 55% HI. Each acidified solution was then transferred to a glass ampoule, blanketed with Ar, and sealed. For HCl hydrolyses, ampoules were heated for 4 h at 100 °C. For DCl hydrolyses, ampoules were heated for 6 h at 100 °C. For HI hydrolyses, ampoules were heated for 22 h at 100 °C. After heating, ampoules were opened, and crude hydrolysates were transferred to microcentrifuge tubes. Hydrolysates were evaporated and redissolved in ∼700 µL of ultrapure water 3 times to remove any acid, and then brought to a final volume of 100 µL. Hydrolysates were reacted with FDAA (Marfey’s reagent) following standard conditions (58).
Derivatized hydrolysates were analyzed by RP-HPLC monitoring at 340 nm on a 250 × 4.6-mm YMC C18-AQ column, employing gradient elutions of either 15 to 50% acetonitrile (+0.05% trifluoroacetic acid) in ultrapure water (+0.05% trifluoroacetic acid) over 50 min, 1 mL/min flow rate (delftibactin, pyoverdine GB-1 hydrolysates); or 15 to 50% acetonitrile (no additives) in 50 mM triethylamine phosphate (pH 3.0) over 50 min, 1 mL/min flow rate (histicorrugatin hydrolysate). Amino acid identity was confirmed by positive ion mode electrospray ionization mass spectrometry analysis on a Waters Xevo G2-XS QTof coupled to an ACQUITY UPLC H-Class system. A Waters BEH C18 column was used with a linear gradient of 15 to 50% CH3CN (0.1% formic acid) in ddH2O (0.1% formic acid) over 10 min (delftibactin, histicorrugatin hydrolysates) or a linear gradient of 10 to 30% CH3CN (0.1% formic acid) in ddH2O (0.1% formic acid) over 10 min (pyoverdine GB-1, cupriachelin hydrolysates).
Acknowledgments
Support from NSF Grant CHE-171076 is gratefully acknowledged. This work made use of the Materials Research Laboratory Shared Experimental Facilities supported by the Materials Research Science and Engineering Centers Program of the NSF (Grant DMR-1121053). We thank B. Tebo and S. Parsons for providing bacterial strains, and R. Behrens for mass spectrometry support.
Footnotes
- 1To whom correspondence may be addressed. Email: butler@chem.ucsb.edu.
Author contributions: Z.L.R., C.D.H., and A.B. designed research; Z.L.R., C.D.H., J.S., and J.B. performed research; Z.L.R., C.D.H., J.S., J.B., and A.B. analyzed data; and Z.L.R. and A.B. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. J.M.B. is a guest editor invited by the Editorial Board.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1903161116/-/DCSupplemental.
Published under the PNAS license.
References
- (2) C. D. Hardy, A. Butler
- (5) D. K. Hancock et al.
- (6) H. Budzikiewicz, S. Kilz, K. Taraz, J. M. Meyer
- (8) W.-J. Chen et al.
- (9) D. Risse, H. Beiderbeck, K. Taraz, H. Budzikiewicz, D. Gustine
- (10) S. Matthijs, H. Budzikiewicz, M. Schäfer, B. Wathelet, P. Cornelis
- (11) S. Matthijs et al.
- (13) K. Scholz, T. Tiso, L. M. Blank, H. Hayen
- (17) K. Bloudoff, T. M. Schmeing
- (19) T. M. Makris, M. Chakrabarti, E. Münck, J. D. Lipscomb
- (22) L. An et al.
- (23) K. Agnoli, C. A. Lowe, K. L. Farmer, S. I. Husnain, M. S. Thomas
- (26) J. Franke, K. Ishida, M. Ishida-Ito, C. Hertweck
- (27) D. L. Parker et al.
- (28) M. P. Kem, H. Naka, A. Iinishi, M. G. Haygood, A. Butler
- (32) M. J. Vargas-Straube et al.
- (33) C. Kurth, S. Schieferdecker, K. Athanasopoulou, I. Seccareccia, M. Nett
- (34) C. D. Hardy, A. Butler
- (35) A. W. Robertson et al.
- (41) I. J. Clifton et al.
- (42) I. Müller, C. Stückl, J. Wakeley, M. Kertesz, I. Usón
- (46) D. A. Hogan, S. R. Smith, E. A. Saari, J. McCracken, R. P. Hausinger
- (52) P. Demange, A. Bateman, J. K. Macleod, A. Dell, M. A. Abdallah
- (55) J. S. Martinez et al.
- (57) R. A. Atkinson, A. L. Salah El Din, B. Kieffer, J. F. Lefèvre, M. A. Abdallah
- (59) R. Hermenau et al.
- (60) M. Peschke, M. Gonsior, R. D. Süssmuth, M. J. Cryle
- (63) M. Peschke, C. Brieke, M. Heimes, M. J. Cryle
- (67) A. J. Mitchell et al.
- (70) R. T. Reid, A. Butler
- (73) G. E. Crooks, G. Hon, J.-M. Chandonia, S. E. Brenner