Structural and mutational analysis of the nonribosomal peptide synthetase heterocyclization domain provides insight into catalysis
Edited by Peter B. Moore, Yale University, New Haven, CT, and approved November 30, 2016 (received for review August 31, 2016)

Significance
Nonribosomal peptide synthetases produce peptides with wide varieties of therapeutic and biological activities. Monomer substrates are typically linked by a condensation domain. However, in many modules, a heterocyclization (Cy) domain takes its place and performs both condensation and cyclodehydration of a cysteine, serine, or threonine to form a five-membered ring in the peptide backbone. Although studied for decades, the mechanisms of condensation and cyclodehydration by Cy domains were previously unknown. The crystal structure of a Cy domain, and accompanying mutagenic and bioinformatics analyses, uncover the importance of an aspartate and a threonine for the cyclodehydration reaction. This study provides insight into the catalysis of condensation by the Cy domain and enables the proposal of a reaction mechanism for cyclodehydration.
Abstract
Nonribosomal peptide synthetases (NRPSs) are a family of multidomain, multimodule enzymes that synthesize structurally and functionally diverse peptides, many of which are of great therapeutic or commercial value. The central chemical step of peptide synthesis is amide bond formation, which is typically catalyzed by the condensation (C) domain. In many NRPS modules, the C domain is replaced by the heterocyclization (Cy) domain, a homologous domain that performs two consecutive reactions by using hitherto unknown catalytic mechanisms. It first catalyzes amide bond formation, and then the intramolecular cyclodehydration between a Cys, Ser, or Thr side chain and the backbone carbonyl carbon to form a thiazoline, oxazoline, or methyloxazoline ring. The rings are important for the form and function of the peptide product. We present the crystal structure of an NRPS Cy domain, Cy2 of bacillamide synthetase, at a resolution of 2.3 Å. Despite sharing the same fold, the active sites of C and Cy domains have important differences. The structure allowed us to probe the roles of active-site residues by using mutational analyses in a peptide synthesis assay with intact bacillamide synthetase. The drastically different effects of these mutants, interpreted by using our structural and bioinformatic results, provide insight into the catalytic mechanisms of the Cy domain and implicate a previously unexamined Asp-Thr dyad in catalysis of the cyclodehydration reaction.
- nonribosomal peptide synthetase
- heterocyclization domain
- X-ray crystallography
- bacillamide
- mutational analysis
Nonribosomal peptide synthetases (NRPSs) are a family of large, multimodular enzymes that produce a range of important bioactive secondary metabolites (1). NRPS products have great diversity because they can use more than 500 different acyl monomer substrates, including L-amino acids, D-amino acids, aryl acids, fatty acids, hydroxy acids, and keto acids, and they can subsequently modify these moieties during peptide synthesis.
NRPSs function in a modular, assembly-line fashion. A typical elongation module consists of a condensation (C), an adenylation (A), and a peptidyl-carrier protein (PCP) domain. The A domain specifically recognizes and activates a monomer acyl substrate through adenylation, then transfers it onto the prosthetic phosphopantetheine arm of the PCP domain. This acyl-PCP then travels to the C domain acceptor site for condensation with the upstream module’s acyl-PCP at the C domain donor site (2–5). The PCP domain brings the elongated peptide chain to the downstream module, where it is passed off and further elongated in the next condensation reaction. This process is repeated in each module until the growing peptide reaches the termination module, where it is elongated and then released from the NRPS, often by a thioesterase domain. However, most NRPSs, along with their C, A, and PCP domains, also include tailoring and/or alternative domains, which cosynthetically modify the nonribosomal peptide.
One important modification that can occur during peptide synthesis is cyclodehydration of Cys, Ser, or Thr residues into thiazoline, oxazoline, or methyloxazoline rings, respectively, by the NRPS heterocyclization (Cy) domain (6–12). These heterocyclic rings are found in many peptides with important clinical and research utility, such as the antibiotics bacitracin A (6) and zelkovamycin (13), the antitumor agents bleomycin (14) and epothilone (8), the immunosuppressant argyrin (15), the siderophores mycobactin (16) and yersiniabactin (17), and the microbiome genotoxin colibactin (18). Introduction of the five-membered heterocyclic ring makes the peptide resistant to proteolytic cleavage and can induce conformations in the peptide that favor interaction with biological targets (19).
In NRPSs that synthesize these heterocycle-containing products, the module specific for the Cys, Ser, or Thr monomer substrate contains a Cy domain in place of the C domain. Cy domains are evolutionarily and structurally related to C domains (20). The Cy domain first catalyzes nucleophilic attack on the thioester of a PCP-linked donor substrate by the α-amino group of a Cys-, Ser-, or Thr-PCP substrate, presumably in a manner similar to C domains (6, 7, 10–12, 21–23) (Fig. 1). In the two-step cyclodehydration reaction that follows, the thiol of the Cys side chain or hydroxyl of the Ser or Thr side chain first attacks the carbonyl carbon of the newly formed amide to create the heterocycle (10, 11), and then the former carbonyl oxygen is removed in a dehydration reaction, which introduces the carbon-nitrogen double bond of the thiazoline or (methyl)oxazoline ring. The nascent heterocyclic peptidyl-PCP can be used as the donor substrate by the next module’s C domain, or is first oxidized or reduced by discrete oxidase or reductase domains (10, 24).
Fig. 1. Schematic representation of BmdB and the bacillamide E biosynthesis cycle. The A domains (orange) adenylate alanine and cysteine and transfer them onto the phosphopantetheine arm of the PCP domains (blue). The Cy2 domain (dark green) first catalyzes amide bond formation between the PCP-linked Ala and Cys residues, then catalyzes the intramolecular cyclodehydration reaction. The C3 domain (light green) catalyzes amide bond formation between the PCP-linked Ala-thiazoline moiety and free tryptamine, which releases bacillamide E from the NRPS.
All C domain superfamily domains share the same protein fold, so the overall form of the Cy domain is not in doubt (25, 26). However, the features that allow the Cy domain to catalyze two separate and different reactions are not known. Cy domains contain a conserved Asp motif, DxxxxD, which directly replaces the catalytic His motif, HHxxxD, of C domains (6). Mutation of the aspartate residues of the Asp motif diminishes or abolishes catalytic activity of the protein (12, 23, 27, 28). Furthermore, other mutations have differential effects on the condensation reaction and the cyclodehydration reaction, suggesting that the reactions are not catalyzed by a completely overlapping set of residues (10, 22). Because Cy domains are not larger than C domains, the added function in Cy domains must occur in a confined sequence space.
Here, we present the crystal structure of Cy2 of bacillamide synthetase at 2.3 Å resolution. Bacillamide synthetase (Fig. 1) is a trimodular NRPS that produces bacillamide E, whose derivatives bacillamide A–D exhibit algicidal activity against dinoflagellates, raphidophytes, and cyanobacteria (SI Appendix, Fig. S1) (29, 30). Structural determination, along with mutagenic analysis of this Cy domain active site in the context of the full, intact, and active bacillamide synthetase, and bioinformatic investigation allowed us to identify D1226 and T1196 as important residues for cyclodehydration, providing a better understanding of the structure and mechanism of the Cy domain.
Results and Discussion
The Crystal Structure of an NRPS Heterocyclization Domain.
We have solved the crystal structure of an NRPS heterocyclization domain by X-ray crystallography to a resolution of 2.3 Å. Like all known C domain superfamily proteins, BmdB-Cy2 adopts two chloramphenicol acetyltransferase folds (2), with the N- and C-terminal lobes forming a pseudodimer (Fig. 2). In C domains, these two lobes assume a range of relative conformations (more “open” or “closed”; ref. 5). Superimposition of BmdB-Cy2 with each C domain shows that it fits within the observed range and is quite similar to AB3403 (31) (SI Appendix, Fig. S2). The latch formed by a crossover between N- and C-terminal subdomains in C domains is present, meaning that the Cy domain active site is also situated in the center of a tunnel, between donor and acceptor PCP binding sites (see SI Appendix, SI Results and Discussion for more detailed description of the overall Cy domain structure).
Fig. 2. The crystal structure of BmdB-Cy2. (A) Ribbon representation of BmdB-Cy2. The cyclization domain adopts two chloramphenicol acetyltransferase folds, similar to C domains. (B) Close-up view of the BmdB-Cy2 DxxxxD motif (orange). D980 makes a bifurcated hydrogen bond with the hydroxyl of S873 and a hydrogen bond with the backbone nitrogen of L982. D985 makes hydrogen bonds with S988 and the amides of A987 and F1134, and a water-mediated interaction with R1120.
BmdB-Cy2 Active Site.
The main established signature that differentiates Cy domains from C domains is an active site motif: C domains have a catalytic His motif of HHxxxD, and Cy domains have an Asp motif of DxxxxD (6). The first aspartate of the Asp motif is essential for catalytic activity in only some Cy domains. When the first aspartate is mutated to alanine in HMWP2-Cy1, AngN-Cy1, and AngN-Cy2, condensation and cyclodehydration are completely eliminated (23, 27), but in VibF-Cy1, VibF-Cy2, and EpoB-Cy1, both condensation and cyclodehydration occur, although they are significantly diminished (12, 23, 28) (SI Appendix, Table S1 lists current and previous Cy domain mutations). The second aspartate is critical for activity (12, 23, 27, 28), but whether its role is catalytic or structural (like the aspartate at that position in C domains) (32) was not definitively determined. Furthermore, a triple mutant of the Asp motif of HMWP2-Cy1 to introduce a C domain His motif results in a catalytically inactive protein (23), demonstrating that a straight swap of the motifs is not sufficient to interconvert catalytic activities.
In the structure of BmdB-Cy2, both aspartate residues in DxxxxD (D980 and D985) are oriented away from the active site (Fig. 2B). The side chain of D980 makes a bifurcated hydrogen bond with the hydroxyl of S873 and a hydrogen bond with the backbone amide of L982. D985 hydrogen bonds with the side chain of S988 and the amides of A987 and F1134. The interaction this aspartate makes with a nearby arginine in C domains is present, but is water-mediated for D985-R1120 of BmdB-Cy2. Thus, these aspartate residues directly replace the first histidine and the aspartate in the C domain HHxxxD motif, occupying the same position in their respective domains and making the same or similar interactions with the rest of the protein (SI Appendix, Fig. S2B). Indeed, the whole DxxxxD motif is essentially in the same conformation as the HHxxxD motif. The position of the second histidine of the HHxxxD motif, which is the most important residue for condensation in C domains (2–5), is occupied by A981 in BmdB-Cy2 and is thus unable to contribute to catalysis. The HHxxxD motif in C domains does not reorient upon substrate binding (5), and there is no indication that the Cy domain DxxxxD motif would do so. Overall, the structure strongly suggests that both of these aspartate residues play structural rather than catalytic roles, as has been established for the corresponding residues in C domains (2–5).
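The motif distinction described above (HHxxxD in C domains, DxxxxD in Cy domains) can be illustrated with a short pattern search. The following is a minimal sketch, not the classification method used in this study: the function name and the two toy sequences are hypothetical, and a bare motif match is only a weak heuristic, because short patterns such as DxxxxD occur by chance in many proteins.

```python
import re

# "x" in the motif notation means any residue, written "." in a regex.
# Lookahead is used so that overlapping occurrences are all reported.
HIS_MOTIF = re.compile(r"(?=HH.{3}D)")  # C domain signature, HHxxxD
ASP_MOTIF = re.compile(r"(?=D.{4}D)")   # Cy domain signature, DxxxxD


def find_motifs(sequence: str) -> dict:
    """Return 1-based start positions of each motif in a protein sequence."""
    seq = sequence.upper()
    return {
        "HHxxxD": [m.start() + 1 for m in HIS_MOTIF.finditer(seq)],
        "DxxxxD": [m.start() + 1 for m in ASP_MOTIF.finditer(seq)],
    }


if __name__ == "__main__":
    # Hypothetical toy fragments, not real NRPS sequences.
    print(find_motifs("GSAHHLVADGW"))  # C-like fragment
    print(find_motifs("GSADALTGDAW"))  # Cy-like fragment
```

In practice such a scan would only be a first filter; the mutational data summarized above show that the motif alone does not determine which reactions the domain catalyzes.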
Bacillamide Synthesis Assay and Mutational Analysis.
To determine the importance of other active site residues in condensation and cyclization, we established a bacillamide synthesis assay. Bacillamide synthetase is a one-protein, six-domain, 265-kDa NRPS (Fig. 1) (33). It adenylates and thiolates its Ala and Cys substrates, after which the Cy2 domain performs condensation and cyclodehydration. The final domain in this NRPS is not a thioesterase domain, but a specialized C domain that condenses the thiazoline-containing intermediate with free tryptamine (Tpm) to release bacillamide E. Therefore, this terminal C domain plays all of the roles typically associated with a four-domain termination module—substrate selection, peptide bond formation, and release of the peptide product—which is rare, but not unprecedented in NRPSs (12, 27).
The Tpm in the product makes the reaction convenient to follow by HPLC at 280 nm (Fig. 3 and SI Appendix, Fig. S3). Wild-type bacillamide synthetase showed a large peak at 15 min that corresponded by high resolution mass spectrometry, fragmentation mass spectrometry, and NMR to bacillamide E (1) (SI Appendix, Fig. S3C). A minor peak at ∼15.5 min, with 10% of the intensity of the main peak, corresponded to the uncyclized tripeptide Ala-Cys-Tpm (2). This product shows that, at least in vitro, BmdB-Cy2 is not completely efficient at cyclizing all of the intermediates it condenses, and that the terminal C3 domain is not completely selective for the cyclized substrate. Furthermore, a small peak for the dipeptide Cys-Tpm is also evident (3; SI Appendix, Fig. S3), indicating that the terminal C3 domain also uses Cys-PCP2 as a donor substrate to some extent. Note that bacillamide E contains a thiazoline ring, and not a thiazole ring as in bacillamide A–D extracted from natural sources (29, 34, 35). The adjacent oxidase BmdC likely performs the oxidation postsynthetically.
Fig. 3. Effects of structure-guided mutations in BmdB-Cy2 on bacillamide E synthesis. (A) Representative HPLC trace of a BmdB activity assay. Compound 1 is bacillamide E, compound 2 is linear Ala-Cys-Tpm, and compound 3 is Cys-Tpm. (B) Representative mass spectra of 1 (blue) and 2 (red). A representative mass spectrum of 3 is shown in SI Appendix, Fig. S3A. (C) Quantification of the relative production of 1 (blue) and 2 (red) in reactions with mutant BmdB. All reactions were measured in triplicate, except F1118G, which was measured in duplicate. Error bars represent SD.
We next produced a series of bacillamide synthetase constructs with mutations in the Cy domain. We targeted residues in spatial proximity to the DxxxxD motif in the active site tunnel of BmdB-Cy2, with side chains that could play an important role in condensation or cyclization, and which are largely conserved in alignments of characterized Cy domains (SI Appendix, Fig. S4). Therefore, bacillamide synthetase variants with mutations Y859F, T1116A, F1118G, T1135A, T1196A, D1226G, and D1226N, as well as N1114A and S1197A (two residues shown previously to selectively affect cyclodehydration; ref. 10), were purified and assayed.
The bacillamide synthetase mutants had radically different effects on condensation and cyclodehydration (Fig. 3 and SI Appendix, Fig. S3 and SI Results and Discussion). Mutant S1197A had almost no effect on production of linear or heterocyclized tripeptide. With Y859F, bacillamide E production decreased to 79% of wild type, but linear Ala-Cys-Tpm production doubled. Although the hydroxyl of Y859 points directly into the heart of the active site, the relatively minor effect of its removal suggests it is unlikely to act chemically. Mutants N1114A and T1116A each drastically reduced synthesis of both the linear and heterocyclic product. T1116A maintained approximately the same ratio of heterocyclic to linear product as wild type, whereas the little product made by N1114A was not heterocyclized, consistent with previously reported decoupling of condensation and heterocyclization by this residue (10). Both N1114 and T1116 thus appear important for condensation, and N1114 is also important for cyclodehydration.
For insight into the cyclization reaction, the most interesting mutations are those that selectively affect cyclodehydration. Mutation of residues F1118, T1135, T1196, and D1226 resulted in substantially more linear Ala-Cys-Tpm and less bacillamide E than the wild-type enzyme. F1118 is positioned along the tunnel that the phosphopantetheine arm of donor PCP1 occupies when presenting the Ala substrate to the active site (SI Appendix, Fig. S5). In the structure of BmdB-Cy2, the F1118 side chain completely blocks that tunnel. This residue must move to allow alanyl-PCP1 to bind and participate in the condensation reaction. However, for cyclization, PCP1 likely departs, leaving the cyclization substrate Ala-Cys-PCP2 bound at the acceptor site. It is possible that F1118 aids cyclization by blocking the donor side of the tunnel to create a single-entry active site more optimal for cyclodehydration. This feature is reminiscent of the (permanent) blocking of the acceptor site by a tryptophan side chain to form a single-entry active site in related epimerization domains, observed in a recently published crystal structure (25). Even more striking are the effects of the T1196A mutation, which allowed only minimal cyclization, and mutations of D1226 to glycine or asparagine, which obliterated cyclization while retaining robust condensation. These two residues are adjacent to one another, oriented toward the active site cavity, and hydrogen bond to one another. Their position and importance for cyclization activity make them the most likely to play a direct role in catalysis (see below).
Bioinformatic Analysis of Cy Domains.
We undertook a bioinformatic analysis of Cy domains. We retrieved 36,853 C domain superfamily sequences that had a maximum pairwise sequence identity of 90% and sorted them by using the Natural Products Domain Seeker web server (36). Multiple sequence alignment of 1,790 Cy domains showed two motifs in the C-terminal region that stand out as highly conserved in Cy domains but absent from the other C domain superfamily proteins (SI Appendix, Fig. S6). These motifs bear the consensus sequences PVVFTS and SQTPQVxLD (Fig. 4A) and are part of what had been recognized by Konz et al. (6) as conserved signature sequences 6 and 7. They exist in BmdB-Cy2 as PIVFTS (1192–1197) and ARTPQVYLD (1218–1226). These motifs are as close as 9 Å in space to the DxxxxD, but they are separated from it in sequence by more than 200 amino acids. The motifs form a surface on the acceptor substrate side of the active site (Fig. 4B and SI Appendix, Fig. S6A). Interestingly, this region is the putative location of the Ala-Cys-phosphopantetheinyl substrate in the cyclization reaction and includes the residues that displayed the most drastic effect on cyclization when mutated, namely T1196 of the PVVFTS motif and D1226 of the SQTPQVxLD motif. D1226 is one of the most conserved residues in Cy domains, present in 96% of the 1,790 sequences. T1196 is somewhat more variable: It is a threonine or serine in 88% of the sequences.
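The residue numbering and conservation figures quoted above can be checked with a few lines of code. This is an illustrative sketch only: the motif strings and start positions are taken from the text, whereas the helper names and the eight-residue alignment column are hypothetical and do not reproduce the 1,790-sequence alignment.

```python
def motif_residue_numbers(motif: str, first_residue: int) -> dict:
    """Map residue numbers to one-letter codes for a contiguous motif."""
    return {first_residue + i: aa for i, aa in enumerate(motif)}


def fraction_with(column: str, allowed: str) -> float:
    """Fraction of sequences whose residue in one alignment column is allowed."""
    if not column:
        raise ValueError("empty alignment column")
    return sum(aa in allowed for aa in column) / len(column)


# Motifs as they occur in BmdB-Cy2, with the start positions given in the text.
pivfts = motif_residue_numbers("PIVFTS", 1192)
artpqvyld = motif_residue_numbers("ARTPQVYLD", 1218)

if __name__ == "__main__":
    print(pivfts[1196], artpqvyld[1226])  # residues at positions 1196 and 1226
    # Hypothetical toy column standing in for the position aligned to T1196.
    print(fraction_with("TTSATTST", "ST"))
```

Applied to a real multiple sequence alignment, `fraction_with` is the kind of per-column tally that underlies statements such as "a threonine or serine in 88% of the sequences".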
Fig. 4. Model of the cyclodehydration intermediate and critical residues for cyclodehydration reaction. (A) WebLogo3 (57) of Cy domain motifs PVVFTS (core Cy6) and SQTPQVxLD (part of core Cy7) compared with the sequence of BmdB-Cy2. Putative catalytic residues T1196 and D1226 are labeled with yellow asterisks. (B) BmdB-Cy2 (mostly green, with DxxxxD in orange, PVVFTS in purple, and SQTPQVxLD in brown) with a model of the cyclodehydration intermediate (blue). Putative catalytic residues T1196 and D1226 are shown in sticks, as is V1228, a position occupied by glutamine in most Cy domains. (C) Possible mechanism of the dehydration step. We suggest that deprotonation of the amino hydrogen occurs after dehydration and double bond formation. The full putative reaction pathway is diagramed in SI Appendix, Fig. S7.
Putative functions had not been assigned to these motifs. S1197 was the only residue in either motif previously targeted by mutation, and it had differential effects on Cy domain reactivity in EpoB, BacA2, and BmdB, consistent with its moderate conservation (Figs. 3 and 4A and SI Appendix, Table S1) (10, 12). Notably, a portion of Cy domain sequence that largely overlaps with the two new motifs is assigned PFam08415. It is annotated only as being found in NRPSs with C and PCP domains, and ends partway through the SQTPQVxLD motif, before the critical D1226. We propose that this motif be extended to residue 1241 (to incorporate the whole SQTPQVxLD and a conserved WD; SI Appendix, Fig. S6A) and annotated as a signature for Cy domains.
A Trend for Tandem Cy Domains in (Methyl)oxazoline-Forming Modules.
Cy domains can be subdivided by their use of thiol (Cys) or hydroxyl (Ser or Thr) groups as the nucleophile in the cyclization reaction. Each subset contains the highly conserved PVVFTS and SQTPQVxLD motifs, and there are no discernible differences between Cy domain sequences in each subset. However, one clear trend did emerge: In general, modules that incorporate Cys substrates contain only one Cy domain per module, whereas those using Thr or Ser have tandem Cy-Cy (or Cy-C) domains. Of a set of 505 full-length Cy domain-containing proteins, 454 are predicted to use Cys (440 with single Cy domains, 14 with Cy-Cy domains), whereas 51 are predicted to use Ser or Thr (2 with a single Cy domain, 49 with Cy-Cy domains). This trend is maintained after the proteins have been filtered for unique architecture and/or unique predicted product: Of 200 remaining proteins, 187 use Cys (177 with a single Cy domain, 10 with Cy-Cy domains) and 13 use Ser or Thr (1 with a single Cy domain, 12 with Cy-Cy domains).
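The strength of this trend is easiest to see as the fraction of tandem arrangements in each substrate class. The sketch below simply tabulates the counts quoted above; the variable and function names are illustrative, and no statistical test from the original analysis is implied.

```python
# Counts from the text: (single Cy, tandem Cy-Cy) per predicted substrate class.
FULL_SET = {"Cys": (440, 14), "Ser/Thr": (2, 49)}
FILTERED_SET = {"Cys": (177, 10), "Ser/Thr": (1, 12)}


def total(counts: dict) -> int:
    """Total number of proteins across all substrate classes."""
    return sum(single + tandem for single, tandem in counts.values())


def tandem_fraction(counts: dict) -> dict:
    """Fraction of proteins with tandem Cy-Cy domains in each substrate class."""
    return {
        substrate: tandem / (single + tandem)
        for substrate, (single, tandem) in counts.items()
    }


if __name__ == "__main__":
    for label, counts in (("full", FULL_SET), ("filtered", FILTERED_SET)):
        fractions = tandem_fraction(counts)
        print(label, total(counts), {k: round(v, 3) for k, v in fractions.items()})
```

In both sets, only a few percent of Cys-using proteins carry tandem Cy domains, whereas more than nine in ten Ser/Thr-using proteins do.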
The trend of tandem Cy domains for hydroxyl-containing substrates and single Cy domains for thiol-containing substrates holds for most characterized NRPS systems that feature Cy domains (6–8, 10, 12, 22, 23, 37–41). The tandem Cy domain arrangement is exemplified by the well-characterized VibF protein. In VibF, Cy2 is primarily responsible for condensation and Cy1 for cyclodehydration of the Thr substrate (22). Furthermore, the tandem Cy arrangement can be maintained even when the module is split between two proteins, as in serratiochelin synthetase SchF1 and SchF2, which also cyclodehydrates Thr (40, 41). Exceptions to the trend are mycobactin synthetase (42) (Thr-specific module with a single Cy domain in MbtB) and anguibactin synthetase (Cys-specific module with Cy-Cy domains in AngN) (27). The tandem Cy domains in anguibactin synthetase could be a remnant of evolution from a Thr-using ancestor, because, other than containing a Cys in place of a Thr, anguibactin is identical to acinetobactin, and the two synthetases share identical domain configurations (43). Furthermore, the Cy domains in AngN can each perform both condensation and cyclodehydration reactions and are largely redundant with one another (27).
The Cys thiol is a better nucleophile than the Ser or Thr hydroxyl (44). This difference in reactivity may be why thiazoline-forming NRPS modules are ∼10× more prevalent than (methyl)oxazoline-forming modules and may explain the trend to dedicate two domains in tandem in (methyl)oxazoline-forming modules. Tandem domains may increase the probability of cyclodehydration before the peptide is donated in the downstream module’s condensation reaction. The downstream C domains are likely not completely selective for the heterocycle-containing peptide, as shown with BmdB (Fig. 3) and VibF (28), so efficiency of cyclodehydration would be important for cognate heterocycle production.
Insight into Catalysis and Model of the Cyclodehydration Intermediate.
The first reaction performed by Cy domains is condensation. The N1114A and T1116A mutants of BmdB-Cy2 exhibited the greatest effects on condensation without fully abolishing it (Fig. 3), and neither residue is highly conserved (SI Appendix, Fig. S6A). Likewise, all mutations that were reported to decrease condensation in Cy domains are shown by the BmdB-Cy2 structure to be too distal to (residues 900, 988, 1089, 1114, 1120) or turned away from (residues 980, 985 of the DxxxxD motif) the atoms participating in condensation (SI Appendix, Table S1) (10, 12, 23, 27, 28). Unless the DxxxxD motif radically reorients (which is possible, although there are no supporting data), there do not appear to be any critical, conserved residues that could act to abstract or donate a proton in the condensation reaction in Cy domains. We have argued that substrate orientation is the principal source of catalytic power for condensation in C domains (21), and this hypothesis appears to be even more likely in Cy domains. Comparison of the kinetic parameters of uncatalyzed peptide bond formation and ribosome-catalyzed peptide bond formation shows that the rate enhancement provided by the ribosome comes predominantly from the lowering of activation entropy via positioning of ester substrates (and perhaps water molecules) for reaction (45). Thus, Cy domains should be able to rely principally on substrate positioning to catalyze condensation of their more reactive thioester substrates.
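The entropic argument can be made explicit with the Eyring equation of transition-state theory. This is a standard textbook relation included here for illustration, not an equation from the original analysis; the transmission coefficient $\kappa$ is assumed to be the same for the catalyzed and uncatalyzed reactions.

```latex
k = \kappa\,\frac{k_{\mathrm{B}}T}{h}\,
    \exp\!\left(\frac{\Delta S^{\ddagger}}{R}\right)
    \exp\!\left(-\frac{\Delta H^{\ddagger}}{RT}\right),
\qquad
\frac{k_{\mathrm{cat}}}{k_{\mathrm{uncat}}}
  = \exp\!\left(\frac{\Delta\Delta S^{\ddagger}}{R}\right)
    \exp\!\left(-\frac{\Delta\Delta H^{\ddagger}}{RT}\right)
```

Here $\Delta\Delta S^{\ddagger} = \Delta S^{\ddagger}_{\mathrm{cat}} - \Delta S^{\ddagger}_{\mathrm{uncat}}$ and $\Delta\Delta H^{\ddagger}$ is defined analogously. If positioning the substrates leaves the activation enthalpy essentially unchanged ($\Delta\Delta H^{\ddagger} \approx 0$), the rate enhancement reduces to $\exp(\Delta\Delta S^{\ddagger}/R)$, so a less unfavorable activation entropy alone can account for catalysis without any residue acting as a general acid or base.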
The second reaction performed by Cy domains, cyclodehydration, is a more complex, two-step reaction. Cy domains and at least two other types of enzymes perform this reaction to form five-membered rings in independent natural product biosynthesis systems. YcaO- and TruD-type enzymes processively modify ribosomally synthesized and posttranslationally modified peptides by using cyclodehydration reactions to introduce thiazoline and (methyl)oxazoline rings in the synthesis of peptides such as microcin and the trunkamides (46). YcaO and TruD, which are structurally unrelated to Cy domains, covalently attach a phosphate or adenylate from ATP to the carbonyl oxygen to promote both cyclization and dehydration (46). In Cy domains, which do not use ATP, the energetics of cyclodehydration is presumably linked to the high-energy thioester bond that is broken in the subsequent step of bacillamide synthesis. Cy domains share this characteristic with the newly described specialty C domain, NocB-C5, that catalyzes β-lactam ring formation (47). NocB-C5 is proposed to dehydrate a serine to dehydroalanine via base catalysis by a histidine directly upstream of its HHxxxD motif, prior to cyclization. Dehydration before cyclization is the reaction order shared with lantibiotic cyclodehydratases, but not with Cy domains (48). The position of the upstream histidine is occupied by a hydrophobic residue in Cy domains, and this V979 in BmdB-Cy2 is somewhat recessed from the active site. Thus, Cy domains rely on a completely different mechanism for catalysis than YcaO, TruD, NocB-C5, or the lantibiotic cyclodehydratases.
To help integrate our structural, mutagenic, and conservation data, we created a model of the intermediate of the BmdB-catalyzed cyclodehydration reaction (Fig. 4B). The configuration of the model and the present data are consistent with the absolutely critical catalytic action of D1226 in the first step of the cyclodehydration reaction, orienting and abstracting a proton from the thiol (or hydroxyl in Ser- or Thr-specific Cy domains) (SI Appendix, Fig. S7). T1196, 3.0 Å from D1226, appears well positioned to aid in catalysis by interacting with D1226, and may donate a proton to the former carbonyl oxygen to form the hydroxyl-thiazolidine. D1226 may protonate the same oxygen again to allow it to leave as water in the dehydration reaction (Fig. 4 B and C and SI Appendix, Fig. S7). Upon dehydration, the pKa of the amino proton is lowered such that it can be facilely lost to solvent to produce the final thiazoline moiety.
Our mutational results indicate that D1226 is critical for cyclodehydration, and T1196 is nearly as important (Fig. 3A). However, T1196 is a threonine or serine in only 88% of Cy sequences, and appears as an alanine in 7% of cases, including in the functionally characterized PchE-Cy1 (39). How can its apparent importance in the reaction and its relative lack of conservation be reconciled? Homology modeling of PchE-Cy1 onto BmdB-Cy2 provides a potential answer: The residue of PchE-Cy1 at the nearby position equivalent to 1228 is a glutamine, which could compensate for the missing T1196. A double mutant of T1196A/V1228Q did not restore cyclization activity to BmdB (SI Appendix, Fig. S8), but this is not surprising given that BmdB-Cy2 has evolved to perform cyclodehydration with T1196 and not Q1228. In the analyzed Cy domain sequences that have an alanine at position 1196, all but two have a glutamine at 1228 (with the remaining two having glutamate or asparagine). In the majority of Cy domains, all three residues are present to form a T-D-Q triad. A similar S/T-D-Q catalytic triad in Escherichia coli thioesterase II activates water for nucleophilic attack (49). We suggest that mutation of the threonine or glutamine in the Cy domain triad may be tolerated in some instances, because chemistry is not rate-limiting in the overall slow synthetic cycle of NRPSs, and at least in thiazoline-forming Cy domains, it would not be difficult to deprotonate the Cys side chain (pKa of ∼8.2 in aqueous solution). Deprotonation of Ser and Thr side chains is more difficult and could require the full catalytic triad. Variation in catalytic residues is not abnormal in nature, as exemplified by serine protease active sites (50). Analogous to the observed variation of T1196, thioesterase II enzymes feature hydrophobic residues in place of the catalytic serine/threonine in more than 5% of proteins.
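The difference in how readily the two nucleophiles are deprotonated can be estimated with the Henderson-Hasselbalch relation. The following is a rough sketch for free side chains in aqueous solution: the Cys pKa of 8.2 and the assay pH of 7.5 come from the text, the hydroxyl pKa of about 13 is an assumed approximate value, and the active-site environment may shift either number considerably.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSAY_PH = 7.5            # pH of the synthesis assay buffer (from Methods)
CYS_THIOL_PKA = 8.2       # value quoted in the text for aqueous solution
SER_HYDROXYL_PKA = 13.0   # assumption: approximate value for an aliphatic hydroxyl

if __name__ == "__main__":
    print(f"Cys thiolate fraction: {fraction_deprotonated(CYS_THIOL_PKA, ASSAY_PH):.3f}")
    print(f"Ser alkoxide fraction: {fraction_deprotonated(SER_HYDROXYL_PKA, ASSAY_PH):.1e}")
```

Under these assumptions, a substantial fraction of the Cys thiol is already deprotonated at neutral pH, whereas only a few molecules per million of a hydroxyl side chain are, which is in line with the suggestion that Ser- and Thr-specific Cy domains may depend more strongly on the full triad.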
Finally, parallels can be drawn between the role of D1226 in deprotonating the substrate side chain for nucleophilic attack and the mechanisms of microbial transglutaminase, cytosolic phospholipase A2, patatin, and TEM-1 β-lactamase, in which the aspartate of a C-D or S-D dyad deprotonates the cysteine or serine for nucleophilic attack on a carbonyl carbon (51–53).
During the review process for this study, a structure and mutational analysis of the heterocyclization domain of epothilone synthase (EpoB-Cy) was reported by Dowling et al. (54). The structures of EpoB-Cy and BmdB-Cy2 are similar. Their mutational data are consistent with those presented here, in particular the identification of D1226 (D449 in EpoB-Cy) as important for catalysis, although without differentiation between defects in condensation and heterocyclization (SI Appendix, SI Results and Discussion and Table S1). Their study and ours complement one another and provide a greater understanding of NRPS heterocyclization domains.
Conclusion
In summary, we have presented the crystal structure of an NRPS heterocyclization domain, solved to a resolution of 2.3 Å. We cloned, expressed, and purified the entire bacillamide synthetase containing this Cy domain, and used it to assay effects of Cy domain mutations on peptide production. We were able to identify two residues, T1196 and D1226, which are important for the cyclodehydration reaction. Finally, we presented a putative mechanism for cyclodehydration in which D1226 acts as general acid/base catalyst.
Methods
Bacillamide Synthetase Cy2 Crystallography.
The Thermoactinomyces vulgaris BmdB-Cy2 construct, with an N-terminal octahistidine tag, tobacco etch virus (TEV) protease cleavage site, and BmdB residues 844–1287, was synthesized by DNA 2.0. BmdB-Cy2 was heterologously expressed in E. coli and purified to homogeneity (SI Appendix, SI Methods). BmdB-Cy2 was crystallized by using 0.88% Tween 20, 1.62 M ammonium sulfate, 0.1 M Hepes pH 7.5, 2.67% (wt/vol) PEG 400, and 3% (wt/vol) 6-aminohexanoic acid, with 20% (vol/vol) ethylene glycol, added before flash-cooling. Diffraction data were collected at CLS beamline 08ID-1 (SI Appendix, Table S2) and the structure of BmdB-Cy2 was determined by molecular replacement.
Peptide Synthesis Assay for Full-Length BmdB.
bmdB was cloned from T. vulgaris F-5595 genomic DNA into a pET21-derived vector containing an N-terminal TEV-cleavable calmodulin binding peptide tag and a C-terminal TEV-cleavable octahistidine tag. Mutations were introduced via site-directed mutagenesis (SI Appendix, Table S3). Wild-type and mutant proteins were expressed in E. coli cells and purified to homogeneity. Each 1-mL reaction contained 50 mM Tris⋅HCl pH 7.5, 100 mM NaCl, 10 mM MgCl2, 1 mM Ala, 1 mM Cys, 10 mM tryptamine, 2 mM ATP, and 100 nM BmdB, and was incubated for 2 h at 37 °C. Reactions were analyzed by HPLC using C18 media and a gradient of water/0.1% TFA to acetonitrile/0.1% TFA. High-resolution MS and fragmentation MS analyses were performed at the Proteomics Platform at the Research Institute of the McGill University Health Centre, and NMR analysis was performed at the Quebec/Eastern Canada High Field NMR Facility (QANUC), McGill University.
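As a quick check on the reaction composition, the stated final concentrations can be converted into absolute amounts per 1-mL reaction and into molar excess over the enzyme. The following is a minimal sketch that uses only the concentrations listed above; the variable and function names are illustrative and not part of the original protocol.

```python
# Final concentrations taken from the assay description, expressed in nM.
FINAL_CONC_NM = {
    "Ala": 1_000_000,          # 1 mM
    "Cys": 1_000_000,          # 1 mM
    "tryptamine": 10_000_000,  # 10 mM
    "ATP": 2_000_000,          # 2 mM
    "BmdB": 100,               # 100 nM
}
REACTION_VOLUME_ML = 1  # 1-mL reaction


def amount_pmol(conc_nm: int, volume_ml: int) -> int:
    """Amount in pmol; 1 nM equals 1 pmol/mL, so nM x mL gives pmol."""
    return conc_nm * volume_ml


# Absolute amount of each component in one reaction (pmol).
amounts = {
    name: amount_pmol(conc, REACTION_VOLUME_ML)
    for name, conc in FINAL_CONC_NM.items()
}

# Molar excess of each small molecule over the enzyme.
excess = {
    name: conc // FINAL_CONC_NM["BmdB"]
    for name, conc in FINAL_CONC_NM.items()
    if name != "BmdB"
}

for name in excess:
    print(f"{name}: {amounts[name] / 1e6:g} umol, {excess[name]:,}-fold over BmdB")
```

Under these conditions each reaction holds 100 pmol of BmdB, and every amino acid substrate is present in at least a 10,000-fold molar excess over the enzyme.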
Bioinformatic Analysis.
Sequences of known C domains were queried against the nr90 database (55) by using a low-threshold search. Matched sequences were themselves filtered to a maximum pairwise sequence identity of 90%, which gave 36,853 C domain superfamily sequences. These sequences were classified into subtypes by using NaPDoS (36), and included 1,790 Cy domains. To look for trends in Cy domains by substrate, all full-length protein sequences from which the 1,790 Cy domains originate were retrieved. Sequences were discarded if there was no A domain adjacent to the Cy domain, or if they started as a single Cy domain, leaving a set of 505 proteins. The Cy domain acceptor/cyclodehydration substrates were inferred from the substrates of the A domains in the same module, as predicted by the program ANTISMASH (56).
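The discard step for full-length proteins can be illustrated with a small sketch. It is not the authors' pipeline: the domain labels, the toy proteins, and the function names are hypothetical, and two interpretations are assumed here, namely that "adjacent" means an A domain directly before or after a Cy domain in the annotated domain order, and that a protein annotated as a lone Cy domain is discarded.

```python
from typing import Dict, List


def has_cy_next_to_a(domains: List[str]) -> bool:
    """Return True if at least one Cy domain has an A domain as a direct neighbour."""
    for i, dom in enumerate(domains):
        if dom != "Cy":
            continue
        # Collect the domains immediately before and after position i.
        neighbours = domains[max(i - 1, 0):i] + domains[i + 1:i + 2]
        if "A" in neighbours:
            return True
    return False


def keep_protein(domains: List[str]) -> bool:
    """Apply both discard rules (assumed interpretation, see lead-in)."""
    if domains == ["Cy"]:
        # Stand-alone Cy domain; stated explicitly for clarity, although
        # the adjacency check below would reject it as well.
        return False
    return has_cy_next_to_a(domains)


# Hypothetical domain architectures, listed N- to C-terminus.
toy_proteins: Dict[str, List[str]] = {
    "toy_1": ["Cy", "A", "T"],            # Cy followed by A: kept
    "toy_2": ["Cy"],                      # lone Cy domain: discarded
    "toy_3": ["C", "A", "T", "Cy", "T"],  # Cy flanked by T domains: discarded
    "toy_4": ["A", "Cy", "T"],            # A directly before Cy: kept
}

kept = [name for name, doms in toy_proteins.items() if keep_protein(doms)]
print(kept)
```

In a real analysis the domain order would come from an annotation tool's output rather than hand-written lists, and the adjacency rule might need to be restricted to A domains within the same module.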
Acknowledgments
We thank Chaitan Khosla and James Kuo for the kind gift of BAP1 cells; Vikram Alva and Johannes Soeding for providing the current nr90 database; Christopher Boddy and Adrian Keatinge-Clay for conversations; Asfarul Haque for performing the ITC experiments; Janice Reimer for schematic figure design; the Proteomics Platform at the RI-MUHC (Montreal, Canada) for LC-MS analysis; members of the T.M.S. laboratory for helpful discussions; Alexander Wahba for MS analysis; Varoujan Yaylayan for fragmentation MS assignment; and Robin Stein for NMR analysis.
Footnotes
- ¹To whom correspondence should be addressed. Email: martin.schmeing@mcgill.ca.
Author contributions: K.B. and T.M.S. designed research; K.B. and T.M.S. performed research; K.B., C.D.F., M.A.M., and T.M.S. analyzed data; and K.B. and T.M.S. wrote the paper with comments from C.D.F. and M.A.M.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 5T3E).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1614191114/-/DCSupplemental.
References
- 7. Marshall CG, Burkart MD, Keating TA, Walsh CT
- 16. Snow GA
- 17. Drechsel H, et al.
- 21. Bloudoff K, Alonzo DA, Schmeing TM
- 28. Marshall CG, Burkart MD, Meray RK, Walsh CT
- 33. Yuwen L, et al.
- 37. Du L, Chen M, Zhang Y, Shen B
- 38. Silakowski B, et al.
- 41. Wang H, Fewer DP, Holm L, Rouhiainen L, Sivonen K
- 42. McMahon MD, Rush JS, Thomas MG
- 45. Beringer M, et al.
- 48. Zhang Q, Yu Y, Vélasquez JE, van der Donk WA
- 54. Dowling DP, et al.
- 55. Biegert A, Söding J
- 56. Weber T, et al.
- 57. Crooks GE, Hon G, Chandonia JM, Brenner SE
Article Classifications
- Biological Sciences
- Biochemistry