Deciphering the evolution of flavin-dependent monooxygenase stereoselectivity using ancestral sequence reconstruction

Edited by Janet Thornton, European Bioinformatics Institute, Cambridge, United Kingdom; received October 25, 2022; accepted March 6, 2023
April 4, 2023
120 (15) e2218248120


Biocatalysts provide powerful tools for target-oriented synthesis. However, it is difficult for biocatalytic reactions to achieve complementary selectivity and afford different stereoisomers given each biocatalyst’s inherent preference to form one isomer over another. A complete understanding of protein sequence–selectivity relationships is thus necessary to identify, engineer, and design useful biocatalysts. Here, we employ ancestral sequence reconstruction (ASR) to navigate a historical sequence space of enantiocomplementary biocatalysts, enabling us to overcome traditional limitations of direct residue substitutions to study changes in biocatalyst stereoselectivity. Through the case study demonstrated here, we uncover how evolution allows stereoselectivity to be tuned to afford stereocomplementary natural products. We demonstrate the application of ASR to find protein sequence-stereoselectivity relationships to guide future protein engineering efforts.


Controlling the selectivity of a reaction is critical for target-oriented synthesis. Accessing complementary selectivity profiles enables divergent synthetic strategies, but is challenging to achieve in biocatalytic reactions given enzymes’ innate preferences of a single selectivity. Thus, it is critical to understand the structural features that control selectivity in biocatalytic reactions to achieve tunable selectivity. Here, we investigate the structural features that control the stereoselectivity in an oxidative dearomatization reaction that is key to making azaphilone natural products. Crystal structures of enantiocomplementary biocatalysts guided the development of multiple hypotheses centered on the structural features that control the stereochemical outcome of the reaction; however, in many cases, direct substitutions of active site residues in natural proteins led to inactive enzymes. Ancestral sequence reconstruction (ASR) and resurrection were employed as an alternative strategy to probe the impact of each residue on the stereochemical outcome of the dearomatization reaction. These studies suggest that two mechanisms are active in controlling the stereochemical outcome of the oxidative dearomatization reaction: one involving multiple active site residues in AzaH and the other dominated by a single Phe to Tyr switch in TropB and AfoD. Moreover, this study suggests that the flavin-dependent monooxygenases (FDMOs) adopt simple and flexible strategies to control stereoselectivity, which has led to stereocomplementary azaphilone natural products produced by fungi. This paradigm of combining ASR and resurrection with mutational and computational studies showcases sets of tools for understanding enzyme mechanisms and provides a solid foundation for future protein engineering efforts.
Azaphilones are fungal polyketide natural products characterized by an oxygenated pyranoquinone bicyclic core containing a tetrasubstituted chiral center with either the R- or S-configuration (Fig. 1A). Through 2019, more than six hundred unique azaphilone natural products have been reported, with many of these molecules displaying potent biological activities including antitumor (1), antiviral (2), antimicrobial (3), and antiinflammatory (4) properties that are tied to specific structural features (5, 6). In addition to the conserved azaphilone core, the biological properties of these natural products are tied to the stereocenter present in the bicyclic core. For example, (R)-berkchaetoazaphilone A (1) shows antitumor and antiinflammatory activity (7), while the structurally similar (S)-lenormandin E (6) demonstrates cytotoxic and antimicrobial activities (8). Given the potential of azaphilones as tool compounds in chemical biology (9) and as pharmaceutical agents (10), an efficient, stereocontrolled method for their synthesis is desirable (11, 12).
Fig. 1.
Chemoenzymatic syntheses of azaphilone natural products through a stereodivergent oxidative dearomatization step empowered by comprehensive studies on the sequence-structure-stereoselectivity relationship of FDMOs. (A) Azaphilone natural products feature a critical chiral center, which can be accessed through enzymatic oxidative dearomatization catalyzed by TropB, AfoD, or AzaH. (B) Sequence alignment of TropB, AfoD, and AzaH focused on the residues within 5 Å of the substrate-binding site of TropB reveals five highlighted and bolded residues with potential contributions to the stereocontrol of the enzymes. (C) Horizontal and vertical approaches based on sequence alignments provide two strategies to study enzyme mechanisms. (D) Overlapping of AfoD (PDB ID: 7LO1) and TropB cocrystallized with its native substrate 8 (PDB ID: 6NET). AfoD is shown as a cartoon model, and TropB as a transparent surface model. Zoom-in of the active site of TropB overlaid with that of AfoD highlights two major residue differences. AfoD and TropB are shown in yellow and blue, respectively, with bolded AfoD labels. (E) Overlapping of an AzaH homology model predicted by AlphaFold and TropB cocrystallized with its native substrate 8 (PDB ID: 6NET). AzaH is shown as a cartoon model, and TropB as a transparent surface model. Zoom-in of the active site of TropB overlaid with that of AzaH highlights three major residue differences. AzaH and TropB are shown in green and blue, respectively, with bolded AzaH labels. (F) Stereodivergent oxidative dearomatization of substrate 10 catalyzed by AfoD, AzaH, or one of their variants. (G) Mutagenesis studies of AfoD and AzaH. Reaction conditions: 10 μM purified AfoD WT, AzaH WT, or protein variants with 2.5 mM substrate 10 in 50 mM potassium phosphate (KPi) buffer pH 8.0 and 30 °C for 1 h in the presence of 1.0 mM NADP+, 2.0 U/mL glucose-6-phosphate dehydrogenase (G6PDH), and 5.0 mM glucose-6-phosphate (G6P). 
Oxidative dearomatization reactions were run in triplicate or quadruplicate, and values reported are percentage conversions of substrate 10. Ratios of (R)-11 to (S)-11 of each enzyme are shown with (R)- and (S)-11 in blue and yellow, respectively.
In nature, oxidative dearomatization is the stereodivergent step in azaphilone synthesis, installing the conserved stereocenter (see 1 to 2 and 6 to 7, Fig. 1A) (12, 13). This oxidative dearomatization approach provides control over both the site-selectivity and stereoselectivity that can be challenging to match with established small molecule methods for oxidative dearomatization utilizing hypervalent iodine (14–16) or Pb(IV) reagents (17, 18). The specific tools for this transformation in azaphilone biosynthetic pathways are flavin-dependent monooxygenases (FDMOs), which have evolved as highly stereoselective catalysts. FDMOs can operate at low catalyst loadings and mild reaction conditions, and utilize molecular oxygen as the stoichiometric oxidant (Fig. 1A) (19).
As the selective formation of either enantiomer is required to access the breadth of azaphilone natural products, a chemoenzymatic strategy demands biocatalysts that can deliver either enantiomer on demand. One common approach toward stereocomplementary biocatalysts is protein engineering to identify enzyme variants that provide access to a desired enantiomer (20). For instance, an old yellow enzyme (OYE) variant was engineered using site-saturation mutagenesis to possess stereoselectivity opposite to the wild-type enzyme (21). An alternative approach to accessing stereocomplementary biocatalysts is to identify them directly from nature (22). Our group previously demonstrated chemoenzymatic syntheses of azaphilone natural products including trichoflectin, deflectin-1a, and lunatoic acid A (23). In that study, enantiocomplementary enzymes were identified and leveraged for the divergent synthesis of azaphilone natural products (23). Specifically, from a sequence similarity network of FDMOs, we identified two FDMOs in clusters neighboring the well-studied TropB (24, 25): AzaH in azanigerone A biosynthesis (26) and AfoD in the asperfuranone (27) pathway (23). Interestingly, although all three FDMOs have conserved site-selectivity, providing C3-hydroxylated products, the stereoselectivity differs, with TropB and AzaH affording the (R)-product, whereas AfoD yields the enantiomeric (S)-product (Fig. 1A). Site-selectivity and stereoselectivity in reactions with this class of enzymes are dictated by the substrate-binding pose within the active site, generally producing a single enantiomer of the product (28). Our previous work revealed that in TropB, a two-point binding interaction of the substrate with active site residues Arg206 and Tyr239 is responsible for the observed site-selectivity and stereochemical outcome (Fig. 1B) (29).
The effectiveness of these biocatalyst engineering and identification approaches relies on prior knowledge of the natural selectivity of each enzyme. Thus, identifying structural elements that control the selectivity of a given enzyme class is fundamental to altering the selectivity. Here, we uncover structural and mechanistic features used by nature to precisely control stereoselectivity in FDMO-mediated oxidative dearomatization reactions. Additionally, through this case, we demonstrate a systematic method to study these kinds of selectivity questions.
Swapping key active site residues using sequence-based or structural analysis rarely leads to a successful switch in the function of one enzyme to that of another. This is likely due to epistasis, a context-dependent effect leading to different mutational outcomes for a single change made in different backgrounds (30). To overcome potential limitations from epistasis, understand how these stereochemical variations occurred within the family of FDMOs, and identify the key elements which provide stereocontrol, we applied ancestral sequence reconstruction (ASR) (31–33) and resurrection to explore the historic sequence space of FDMOs. ASR allows us to recapitulate the evolutionary trajectories of these catalysts (Fig. 1C) (30). Compared to the horizontal approach in comparative studies of homologous modern enzymes using multiple sequence alignments (MSAs) with site-directed mutagenesis, ASR is a “vertical” approach that commences with statistical inference of ancestral protein sequences, followed by reconstruction of the ancient protein sequences (Fig. 1C) (30). Previous studies have shown that ancestral proteins are likely more catalytically promiscuous, soluble, and stable than their descendants, all desirable properties for structural and mechanistic studies (34–40). Most importantly, ASR can illuminate historic substitutions that cause functional changes that otherwise might be easily obscured by epistasis and a focus only on extant proteins (41–43). Based on these advantages, ASR has been successfully applied to answer fundamental questions on topics such as the evolution of hormone receptors (44), apicomplexan lactate dehydrogenases (45), coral pigments (46), and thermophilicity in RNases (47). To date, a few examples of enzyme pairs have been shown to afford two complementary stereoisomers such as (R)- and (S)-citronellal from OYE pairs (48) and (1′S,2′S)- and (1′R,2′R)-3-(2-nitrocyclopropyl)alanine from nonheme iron enzyme pairs (49, 50).
However, it is not well understood how stereoselectivity evolves quickly to produce different stereoisomers. Herein, utilizing the FDMO family as an example, we demonstrate the utility of ASR to study the evolution of enzyme stereoselectivity, providing an opportunity to leverage nature’s evolutionary strategies in future protein engineering campaigns.

Results and Discussion

Structure Analysis of TropB, AfoD, and AzaH.

To begin to build a structural understanding of the origins of stereoselectivity, we sought to obtain additional structures of enzymes in this class to elucidate the basis for selectivity. To this end, we solved a 2.0 Å crystal structure of AfoD with a bound flavin adenine dinucleotide (FAD) cofactor (PDB ID: 7LO1, Fig. 1D) and created a model of AzaH with AlphaFold (51) (Fig. 1E). In contrast to the dimeric structure we previously observed for TropB (29), AfoD was monomeric in the crystal form and in solution (SI Appendix, Fig. S12). Typical of class A FDMOs, AfoD is composed of a FAD-binding domain with a site for the adenosine diphosphate moiety and a catalytic domain, and also employs a “wavin’ flavin” mechanism for cofactor reduction in which the FAD isoalloxazine ring rotates between an “in” position for substrate engagement and an “out” position for reduction by NAD(P)H (52). As expected for proteins with greater than 37% sequence identity (TropB/AzaH: 37%, TropB/AzaH: 46%, and AfoD/AzaH: 42%, SI Appendix, Table S1), the overall architectures of TropB, AzaH, and AfoD are similar. Interestingly, the FAD isoalloxazine ring is in the out position where it π-stacks with Tyr297 in the AfoD structure (Fig. 1D and SI Appendix, Fig. S13), whereas the “in” position occurs in the TropB structure (PDB ID: 6NES).
In the active site cavity, many residues differ across TropB, AfoD, and AzaH (Fig. 1B). Notably, AfoD Phe237 occupies the position of TropB Tyr239, whose phenolic group is essential to substrate binding and catalysis (Fig. 1D) (29). On the opposite side of the AfoD active site, Tyr118 (Phe119 in TropB) is near the FAD isoalloxazine ring, where it could fulfill a similar role to that of Tyr239 in TropB (Fig. 1D). Based on the location of these Tyr residues, we hypothesized that the opposite positions of these phenolic functional groups control the face of the substrate presented for hydroxylation, providing a simple way for nature to design enantiocomplementary catalysts. However, the related enzyme, AzaH, does not possess such a Tyr residue around the active site, implying that AzaH might adopt a different stereocontrol mechanism (Fig. 1E). To investigate the R-selectivity observed in reactions with AzaH, we identified three additional positions with major amino acid differences, including Gly48, Phe80, and Arg219 (Ala55, Leu96, and Leu226 in TropB, Fig. 1E). Based on the AzaH model, these amino acids align with the analogous amino acids in both TropB and AfoD, providing a solid foundation for mutagenesis experiments to test the impact of these residues on stereoselectivity.

Mutagenesis to Interrogate Features Dictating FDMO Stereoselectivity.

Based on the structures and sequence alignment of TropB, AzaH, and AfoD, we designed mutagenesis experiments targeting positions 118 and 237 in AfoD (119 and 239 in TropB), and positions 80 and 219 in AzaH (96 and 226 in TropB) and hypothesized that these residues are potential determinants of substrate positioning in the active site (Fig. 1B). As each enzyme has different residue numbering for equivalent positions, here we will use TropB numbering for consistency. The role of TropB Y239 was previously probed; however, the Y239F variant did not convert 3-methylorcinaldehyde (8) to any detectable product (29). In AfoD and AzaH, a set of single- and double-substitution variants was generated and tested using a model substrate with structural similarities to the AzaH and AfoD native substrates but with a shorter C6 aliphatic sidechain to measure activity (see 10, Fig. 1F). Each variant was assayed in reactions with the achiral substrate 10 by monitoring substrate consumption and enantiomeric excess of the product (SI Appendix, Figs. S23 and S24 and Table S14). AfoD wild type (WT) had an average conversion of 36%, and AfoD variants Y119F, F239Y, and Y119F/F239Y provided 23%, 35%, and 22% conversion, respectively (Fig. 1G). AfoD WT was highly enantioselective, producing (S)-11 in >99% e.e., whereas AfoD variants Y119F, F239Y, and Y119F/F239Y showed 98%, 51%, and 5% e.e., respectively. Unexpectedly, the Y119F substitution alone had minimal impact on the stereoselectivity. AfoD became less selective only after the F239Y substitution was introduced. With the same substrate, 10, AzaH WT demonstrated >99% conversion to provide (R)-11 in >99% e.e. (Fig. 1G). The F96L substitution led to decreased activity, resulting in a variant that provided 31% conversion to (R)-11 in >99% e.e. (Fig. 1G).
The R226Q substitution was detrimental to AzaH activity, leading to no conversion with either the R226Q or the F96L/R226Q variant and precluding determination of the impact of these residues on stereoselectivity (Fig. 1G). These data support the hypothesis that the residue at position 239 (Phe or Tyr) in AfoD plays an important role in determining the stereoselectivity and that Arg226 is critical to the reactivity of AzaH.
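The e.e. values quoted above can be related to the (R)-11 to (S)-11 ratios plotted in Fig. 1G with simple arithmetic. The sketch below uses the standard definition of enantiomeric excess; the function names are illustrative and are not taken from the study's analysis workflow, and the loop assumes that (S)-11 remains the major enantiomer for each AfoD variant, which the text implies but does not state explicitly.

```python
def enantiomeric_excess(r_amount, s_amount):
    """Return (e.e. in percent, label of the major enantiomer)."""
    total = r_amount + s_amount
    if total <= 0:
        raise ValueError("total product must be positive")
    ee = 100.0 * abs(r_amount - s_amount) / total
    if r_amount > s_amount:
        major = "R"
    elif s_amount > r_amount:
        major = "S"
    else:
        major = "racemic"
    return ee, major


def major_fraction_from_ee(ee_percent):
    """Return the percentage of product that is the major enantiomer."""
    return (100.0 + ee_percent) / 2.0


# e.e. values reported in the text for the AfoD variants.
# Assumption: (S)-11 is the major enantiomer in every case.
for name, ee in [("Y119F", 98.0), ("F239Y", 51.0), ("Y119F/F239Y", 5.0)]:
    s_pct = major_fraction_from_ee(ee)
    print(f"AfoD {name}: {s_pct:.1f}% (S)-11, {100.0 - s_pct:.1f}% (R)-11")
```

Read this way, a 5% e.e. corresponds to an almost even split between the two enantiomers, which is why the double variant is described as having nearly lost its selectivity.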

Ancestral FDMO Reconstruction.

These variants revealed that introducing our targeted substitutions within TropB, AfoD, or AzaH did not result in a switch of facial selectivity, but rather yielded nonfunctional enzymes or those with decreased activity and/or selectivity. This is a hallmark of epistatic effects commonly occurring when substitutions are directly introduced into extant proteins (modern proteins compared to their ancestors) to study differential enzyme functions (53). To overcome this limitation, we adopted an alternative strategy and employed phylogenetic analysis and ASR. We generated an ancestral FDMO library containing 64 ancestral sequences by deeply sampling the ancestral nodes in the phylogenetic tree built from 276 fungal FDMO sequences and one bacterial root sequence, p-hydroxybenzoate hydroxylase (PHBH) (54). The tree was constructed using maximum likelihood (ML) methods (55, 56), and the ancestral sequences were then inferred from this tree (32). To test the robustness of the reconstruction, alternative ancestral sequences of each node were also inferred (57). These analyses revealed that TropB and AfoD have a relatively recent common ancestor (see Anc311, Fig. 2A) and share a more distant ancestor with AzaH (see Anc302). Using this approach, we successfully added another dimension (time) to the comparative analyses of evolutionarily related extant proteins.
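To make the distinction between a maximum likelihood ancestor and its alternative reconstruction concrete, the following sketch assumes that each ancestral site comes with posterior probabilities for candidate residues. The most probable residue gives the ML sequence; the alternative sequence swaps in the second most probable residue wherever that residue is itself plausible. The 0.2 threshold, the function name, and the toy posteriors are all assumptions for illustration and are not values from this study.

```python
def ml_and_alt_sequences(site_posteriors, alt_threshold=0.2):
    """Build ML and alternative ancestral sequences from per-site posteriors.

    site_posteriors: list of dicts mapping residue letter -> posterior probability.
    alt_threshold: hypothetical cutoff above which the runner-up residue is used
                   in the alternative sequence.
    """
    ml_seq, alt_seq = [], []
    for posteriors in site_posteriors:
        # Rank residues at this site from most to least probable.
        ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
        best = ranked[0][0]
        ml_seq.append(best)
        if len(ranked) > 1 and ranked[1][1] >= alt_threshold:
            alt_seq.append(ranked[1][0])  # ambiguous site: take the runner-up
        else:
            alt_seq.append(best)          # confident site: keep the ML residue
    return "".join(ml_seq), "".join(alt_seq)


# Toy three-site example (made-up probabilities).
toy_sites = [
    {"F": 0.62, "Y": 0.38},
    {"A": 0.97, "G": 0.03},
    {"L": 0.55, "F": 0.45},
]
ml, alt = ml_and_alt_sequences(toy_sites)
print(ml, alt)
```

If an alternative ancestor built this way behaves like the ML ancestor in the assay, the functional conclusion is less likely to be an artifact of uncertainty in the reconstruction.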
Fig. 2.
ASR and resurrection of the FDMO family affords soluble, FAD-incorporated, and thermostable ancestral FDMOs. (A) Phylogenetic tree of the FDMOs is shown in a polar tree layout and a zoom-in focus on the TropB and AfoD clades. The ancestral nodes (sequences) in the tree are numbered from 278 to 553. (B) SDS–PAGE demonstrates that the ancestral FDMOs were successfully overproduced and remained soluble in E. coli BL21 (DE3). (C) Ancestral FDMOs incorporated moderate amounts of FAD compared with three extant FDMOs. (D) Melting temperatures of three extant and four ancestral FDMOs determined by differential scanning fluorimetry. The thermostability of the ancestral FDMOs was enhanced over their extant FDMOs.
To validate that the ancestral FDMOs were successfully resurrected, we first analyzed the soluble fraction of each cell lysate using SDS–PAGE and saw that 60 out of 64 ancestors showed prominent bands at the expected molecular weights (~50 kDa, Fig. 2B and SI Appendix, Figs. S1 and S2), indicating productive overexpression and the production of soluble protein. FAD incorporation was quantified for four ancestral FDMOs, Anc311, Anc321, Anc370, and Anc374. Compared with three extant FDMOs with 72% average incorporation, ancestral FDMOs had lower but acceptable percent FAD incorporation from 38 to 48% (Fig. 2C). Protein thermostability was also measured as many previous studies reported higher stability of ancestral proteins than that of their extant forms (37–39). Differential scanning fluorimetry was carried out using SYPRO-orange dye to determine the melting temperature of the ancestral enzymes and variants thereof as well as wild-type TropB, AzaH, and AfoD. Among the three extant proteins, AzaH was the least stable protein with a melting temperature (Tm) of 36 °C (Fig. 2D). TropB and AfoD showed similar thermal stability, Tm = 41 and 45 °C, respectively (Fig. 2D). Impressively, the thermostability of the ancestors was greatly enhanced with the highest Tm of 73 °C and an average of 67 °C (Fig. 2D and SI Appendix, Table S5). Together, these results showcase that the ancestral FDMOs were successfully reconstructed with many desirable properties for research targets, including solubility, stability, and cofactor incorporation.
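One common way to extract a Tm from a differential scanning fluorimetry trace is to take the temperature at which the fluorescence rises most steeply. The sketch below illustrates that idea on a synthetic two-state melt curve; it is not necessarily the fitting procedure used in this study, and the curve shape, slope factor, and function names are assumptions. The synthetic curve is centered at 67 °C only to echo the average ancestral Tm reported above.

```python
import math


def boltzmann(temp_c, tm_c, slope=2.0):
    """Idealized sigmoidal unfolding signal, normalized to the range 0..1."""
    return 1.0 / (1.0 + math.exp((tm_c - temp_c) / slope))


def estimate_tm(temps, signal):
    """Return the temperature where the central-difference slope is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic melt curve sampled every 0.5 degrees C from 25 to 95 degrees C.
temps = [25.0 + 0.5 * i for i in range(141)]
curve = [boltzmann(t, tm_c=67.0) for t in temps]
tm = estimate_tm(temps, curve)
print(f"Estimated Tm: {tm:.1f} C")
```

Real traces are noisier and often need smoothing or a fitted model before the derivative is taken.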

Substrate Screening and Enantioselectivity Determination of the Ancestral FDMO Library.

With the ancestral library in hand, we next identified an optimal substrate for studying enzymes across the library. First, to quantify the yield and stereoselectivity of the reaction, we designed a high-throughput workflow that was compatible with biocatalytic reactions in clarified cell lysates in a 96-well plate format and with reaction analysis using UPLC-DAD-CD (SI Appendix, Figs. S16–S20 and Tables S8–S12). Experiments with the ancestral FDMO library included one negative control, three extant proteins, and 64 ancestral proteins. The reactivity of each enzyme in the library was assessed with four substrates (see 9, 10, 13, and 15, Fig. 3). The reactivity of each enzyme with each of the four substrates was determined by measuring substrate consumption compared to the negative control to account for any background conversion from other components in cell lysates.
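The background correction described above can be sketched as follows, assuming that conversion is computed from the substrate signal remaining in an enzyme-containing well relative to the empty-vector negative control. The exact formula used in the study is given in the SI Appendix; the expression, the clamping to 0 to 100%, and the peak-area numbers here are illustrative assumptions.

```python
def percent_conversion(substrate_signal, control_signal):
    """Percent substrate consumed relative to the negative-control well.

    substrate_signal: substrate peak area remaining in the enzyme reaction.
    control_signal:   substrate peak area in the negative-control reaction.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    conversion = 100.0 * (1.0 - substrate_signal / control_signal)
    # Clamp so that measurement noise cannot yield values outside 0..100%.
    return max(0.0, min(100.0, conversion))


# Hypothetical peak areas (arbitrary units), not measured data.
print(percent_conversion(640.0, 1000.0))
```

Normalizing to the control well rather than to the nominal substrate loading means that any substrate lost to lysate components is not counted as enzymatic turnover.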
Fig. 3.
Scope of oxidative dearomatization reactions catalyzed by the ancestral and extant FDMOs in the library. The FDMOs exhibited substrate promiscuity and maintained the stereoselectivity with resorcinols having varying substituents, enabling the utility of 10 as a representative substrate for the following enantioselectivity determination. Clarified cell lysates in pH 7.8 50 mM Tris-HCl buffer with 300 mM NaCl and 10% (v/v) glycerol were used for the screening. Reactions were carried out using the clarified cell lysates in mixed Tris-HCl and 50 mM potassium phosphate (KPi) buffer pH 8 at 30 °C for 1 h in the presence of 1.0 mM substrate (9, 10, 13, or 15), 0.4 mM NADP+, 0.8 U/mL glucose-6-phosphate dehydrogenase (G6PDH), and 2.0 mM glucose-6-phosphate (G6P). A negative control (cells with empty pMCSG7 vector) was included in the library to correct for background reactions. Percentage conversions calculated based on the substrate consumption are displayed as a gradient heatmap from 0 to 100%, and the stereochemistry as a ternary color map: Enzymes affording products with the same positive CD signal peak as AzaH are assigned as R shown in blue, and with a negative CD signal as S shown in orange. Enzymes affording dearomatized products but without any detectable CD signal were assigned as racemic (R/S) and shown in gray. Anc311a and AfoD are marked with asterisks since their percentage conversions and stereochemistry data were acquired using purified enzymes instead of clarified cell lysates (see SI Appendix, Figs. S26–S33 and Tables S16–S20 for details).
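The ternary assignment described in this caption amounts to a sign test on the product's CD peak. A minimal sketch follows, assuming a single signed CD peak value per reaction and a detection limit below which the signal is treated as absent; the `noise_floor` parameter and its default are hypothetical and would need to be set from the instrument's actual baseline.

```python
def assign_stereochemistry(cd_peak, noise_floor=1.0):
    """Map a signed CD peak to the labels used in the heatmap.

    Positive peak (same sign as the AzaH product) -> "R"
    Negative peak                                 -> "S"
    No detectable peak                            -> "R/S" (racemic)
    noise_floor is a hypothetical detection limit in the same units as cd_peak.
    """
    if abs(cd_peak) < noise_floor:
        return "R/S"
    return "R" if cd_peak > 0 else "S"


# Made-up CD peak values for illustration only.
for value in (12.5, -8.0, 0.3):
    print(value, assign_stereochemistry(value))
```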
Overall, out of 64 ancestral proteins, 32 displayed reactivity at sufficient levels to determine enantioselectivity, demonstrating robustness and reliability of this reconstruction (Fig. 3). In the TropB clade from Anc312 to 354, the ancestors of TropB preferred substrates with a longer substituent at C6. Following this trend, the reactivity gradually diminished along the tree toward TropB. Several closely related TropB ancestors, including Anc334, 346, 354, and their alternative sequences (suboptimal reconstructions) only had mild or moderate reactivity toward 10 (Fig. 3). In the AfoD clade, Anc366 to 382 showed higher levels of substrate consumption and promiscuity compared to the ancestors closer to AfoD. Anc370 and 371 showed unexpectedly low conversion, and Anc373 and 374 restored the reactivity when evolving closer to AfoD (Fig. 3). All other distantly related ancestors from Anc400 to 488 had no or low activity with the four substrates tested, implying that they may have a different substrate scope than what we tested. Based on our limited substrate screening, we found that the stereochemical outcome was not specific to a single substrate but instead was conserved across similar substrates, enabling us to study the stereoselectivity using one representative substrate. Therefore, according to the generality of the reactivity across the whole ancestral sequence space and the stability of the substrate itself, 10 was chosen as a model substrate for the experiments with variants designed to probe the role of individual residues on stereoselectivity.
We combined the experimental results and the phylogenetic tree of FDMOs to probe the mechanism of stereoselectivity. Interestingly, the common ancestor of the stereocomplementary enzymes TropB and AfoD, Anc311, affords a nearly racemic mixture of products with a slight preference for (S)-11 (Fig. 4). Progressing from the common ancestor, Anc311, to either TropB or AfoD, several R to S or S to R switches are observed. For example, the stereoselectivity switches once from S to R in the TropB clade and twice, from S to R and back to S, in the AfoD clade (Fig. 4). Intriguingly, these results reveal that at each node the stereoselectivity favors the formation of one isomer over the other, rather than progressing gradually along the evolutionary pathway. To contextualize the relationship between sequence and function, we used TropB’s substrate-bound structure (PDB ID: 6NET) and selected 19 active site residues inside a 5 Å sphere centered on the substrate. Tracking the change of these residues across proteins enabled us to infer key active site substitutions resulting in the stereoselectivity switch. The enantiomeric excess increased quite dramatically in the first few descendants of Anc311 along the evolution toward TropB, interestingly still favoring a facial selectivity opposite to TropB. The facial selectivity switched from S in Anc332 to R in Anc333, which corresponds with the F239Y substitution in Anc333 (Fig. 4). A very similar observation is found along the AfoD evolutionary pathway, further supporting the hypothesis that position 239 is critical for the facial selectivity of the oxidation event. The F239Y substitution resulted in Anc367, 368, 369, and 382 that favor the production of (R)-11 compared to the common ancestor that produces (S)-11 in slight excess. This trend continued in the next transition from Anc369 to 373, when Tyr239 was switched back to Phe. Again, (S)-11 was formed at the end of the branch (Fig. 4).
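Selecting active site residues by distance from the bound substrate can be sketched as a simple cutoff test. The example below assumes that a residue counts as part of the active site if any of its atoms lies within 5 Å of any substrate atom; this criterion, the function name, and the toy coordinates are assumptions for illustration and are not derived from PDB 6NET.

```python
import math


def residues_near_ligand(residue_atoms, ligand_atoms, cutoff=5.0):
    """Return names of residues with any atom within `cutoff` Å of the ligand.

    residue_atoms: dict mapping residue name -> list of (x, y, z) tuples.
    ligand_atoms:  list of (x, y, z) tuples for the substrate.
    """
    near = []
    for name, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms):
            near.append(name)
    return near


# Hypothetical toy coordinates in Å (NOT taken from any crystal structure).
ligand = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
residues = {
    "ResA": [(3.0, 0.0, 0.0), (8.0, 0.0, 0.0)],
    "ResB": [(0.0, 4.5, 0.0)],
    "ResC": [(0.0, 9.0, 0.0)],
}
selected = residues_near_ligand(residues, ligand)
print(selected)
```

With real data, the coordinate lists would be read from the structure file, and the resulting residue set would then be mapped onto the ancestral alignment to track substitutions at those positions.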
Fig. 4.
The evolution of the stereoselectivity in the TropB and AfoD clades. The evolutionary trajectories toward TropB and AfoD are highlighted. Some of the alternative ancestral FDMOs (suboptimal reconstructions) are shown if they were reactive to demonstrate the robustness of the sequence reconstruction and the consistency of stereoselectivity. Enzymes producing dearomatized product (R)-11 are shown in blue, product (S)-11 producing enzymes in orange, and product (R/S)-11 producing enzymes in gray. Ancestral sequences from both the TropB and AfoD lineages were aligned, and 19 residues within 5 Å of the active site were identified and summarized in the table on the right. The stereochemistry data are from the oxidative dearomatization of substrate 10 using either clarified cell lysates for the screening or purified enzymes if available. Reaction conditions using cell lysates: 50 μL cell lysates, 50 mM potassium phosphate (KPi) buffer pH 8.0, 1.0 mM substrate 10, 0.4 mM NADP+, 0.8 U/mL glucose-6-phosphate dehydrogenase (G6PDH), 2.0 mM glucose-6-phosphate (G6P), 2.25 μL dimethylsulfoxide, and deionized water to a final volume of 75 μL. Reaction conditions using purified enzymes: 10 μM purified enzyme with 1.0 (or 2.5) mM substrate 10 in 50 mM KPi buffer pH 8.0 and 30 °C for 1 h in the presence of 0.4 (or 1.0) mM NADP+, 0.8 (or 2.0) U/mL G6PDH, and 2.0 (or 5.0) mM G6P. The key residue at position 239 is highlighted and bolded, and ratios of (R)-11 to (S)-11 are shown in a horizontal bar graph style. The stereochemistry data of TropB, Anc370, and Anc371 were not available due to low or no conversion of substrate 10.
Our preliminary hypothesis was that the Tyr119 in AfoD has a role in determining the stereoselectivity analogous to that of Tyr239 in TropB. However, based on experiments with ancestral FDMOs covering four 119/239 combinations (F/F, F/Y, Y/Y, Y/F), we found that position 239 dictates the stereochemical outcome of the reaction (Fig. 5A). These results were further supported by introducing substitutions at positions 119 and 239 in Anc373 and Anc374. Analogous to what we observed in AfoD, the F/Y or Y/F substitutions at position 119 only had a minor effect on the stereoselectivity, while the F239Y mutation in Anc374 yielded a racemic mixture of products (SI Appendix, Fig. S25 and Table S15). These results demonstrate that the interaction between Tyr239 and the substrate in TropB- or AfoD-related FDMOs strongly favors a binding pose consistent with the formation of (R)-product. Interestingly, we also observed that a V250F substitution appeared concomitantly with the Y239F in the AfoD clade (Fig. 4), which sterically hinders Arg206 from forming a hydrogen bond with the hydroxyl group of the substrate and fully annihilates the possibility of forming the two-point binding interaction seen in R-selective TropB (Fig. 5B) (29). To understand this on an atomic level, substrate 10 was docked in an AfoD AlphaFold model that incorporated FAD in the “in” state using fast Fourier transform docking (FFTDock) (58). The energetically favored top docking cluster had a flipped binding pose where 10 forms hydrogen bonds with Tyr119 and Gln226 (Fig. 5B). The docking model again aligns with the proposed outcome of the Y239F and V250F substitutions. Encouraged by the results from the TropB and AfoD clades, we further explored the stereocontrol mechanism in the AzaH clade.
Fig. 5.
A few active site residues control the stereoselectivity in the FDMO family. (A) Cladogram overviewing the evolution of TropB, AfoD, and AzaH with the critical stereocontrol residues labeled in the evolutionary trajectories. (B) Substrate 10 in the FAD-incorporated AfoD AlphaFold model deduced from the docking result of the top pose using FFTDock. The surface within 1.4 Å of substrate 10 (orange) is depicted as transparent with critical residues in yellow and FAD in white. (C) Mutagenesis studies of Anc311 and 321 reveal three critical substitutions for the stereocontrol of AzaH, involving two R-favoring substitutions, A55G and Q226R, and one S-favoring substitution, L96F. Reaction conditions: 10 μM purified Anc311, Anc321, or one of their variants with 1.0 mM substrate 10 in 50 mM potassium phosphate buffer pH 8.0 and 30 °C for 1 h in the presence of 0.4 mM NADP+, 0.8 U/mL glucose-6-phosphate dehydrogenase, and 2.0 mM glucose-6-phosphate. Oxidative dearomatization reactions were run in triplicate, and values reported are percentage conversions of substrate 10. Ratios of the product (R)-11 to product (S)-11 of each enzyme are shown with (R)- and (S)-11 in blue and yellow, respectively. (D) Substrate 10 in the FAD-incorporated AzaH AlphaFold model deduced from the docking result of the top pose using FFTDock. The surface within 1.4 Å of substrate 10 (green) is depicted as transparent and critical residues in yellow and FAD in white.

Mutagenesis on Historical Background Reveals the Stereocontrol Mechanism of AzaH.

AzaH and its common ancestor with TropB and AfoD, Anc284, appear inconsistent with the Tyr control mechanism (Figs. 3 and 5A). Both have Phe239 but afford (R)-11 instead of (S)-11. To further investigate the distinct mechanism for stereocontrol seen in AzaH, we introduced a set of substitutions in two S-favoring catalysts, Anc311 and Anc321, to circumvent potential epistasis and low AzaH stability (Figs. 2D and 4). Introducing mutations into the more stable ancestral FDMOs enabled us to investigate the effects of functionally critical but destabilizing substitutions (59). Tracking the predicted evolutionary pathway from Anc302 to Anc311, we hypothesized that substituting the critical active site residues in Anc311 and Anc321 in a reverse direction would transform the enzymes from the S-preference back to the R-preference seen in AzaH (Fig. 5A). Three substitutions, A55G, L96F, and Q226R, were targeted, and several singly and doubly substituted variants of Anc311 and Anc321 were prepared and analyzed in reactions with 10 (Fig. 5C). Intriguingly, only A55G and Q226R favored the formation of (R)-11, whereas L96F conversely favored (S)-11 (Fig. 5C). According to the FAD-incorporated AzaH model, in contrast to Gln226 in AfoD, Arg226 was modeled to be pointing out of the binding pocket, where it can form an interaction with a proximal Asp (Fig. 5D). Together with Q226R, A55G, which is located on the other side of the pocket above the substrate, also provides a more spacious region for substrate binding. We propose that these two substitutions afford space for the bulky C6-substituent to be accommodated. This hypothesis was further supported through the docking of substrate 10 in the AzaH model (Fig. 5D). In addition to the mechanistic study of AzaH, we found that the L96F mutation improved the reactivity and further increased the preference for (S)-11 in Anc311 and Anc321 to an extent comparable to AfoD (Fig. 5C), which demonstrates that we serendipitously acquired an enzyme for making (S)-11 that is superior to AfoD, with higher thermal stability and reactivity and comparable stereoselectivity. This suggests that ancestral enzymes can provide better starting points for protein engineering. Mechanistically, all ancestral enzymes in the TropB, AfoD, and AzaH clades should possess catalytic machinery highly similar to that of TropB, based on the evidence from the sequence alignment (Fig. 4 and SI Appendix, Table S2). Multiple critical active site residues including Arg206, Pro329, and His331 are fully conserved across the ancestral nodes along the predicted evolutionary pathways to TropB, AfoD, or AzaH (Fig. 4 and SI Appendix, Table S2) (60). Moreover, similar to what we have seen in TropB, our investigation revealed that the model substrate 10, with an experimentally determined pKa of 7, binds to the enzymes in the phenolate form at pH 8 (SI Appendix, Figs. S14 and S15), and that substrate binding is a prerequisite for stimulating FAD reduction to FADH2 in the catalytic cycle, as in other class A FDMOs (19), as suggested by the NADPH consumption assay (SI Appendix, Table S6). Altogether, our findings demonstrate that the FDMO ancestors are more suitable for mutagenesis than their extant forms for the purpose of either mechanistic study or protein engineering, and that the stereoselectivity of FDMOs likely evolved from a cumulative mechanism controlled by multiple residues, as observed in AzaH, and gradually transformed into a mechanism dominated by a single F to Y or Y to F substitution in the TropB and AfoD clades.
This focused study highlights the delicate control of FDMO stereoselectivity by only a few substitutions, which was illustrated through an evolutionary analysis. This constitutes an example of applying ASR to investigate stereoselectivity along an evolutionary pathway. Due to the existence of so many structurally similar but stereocomplementary azaphilones, we anticipate that natural selection must have led to fungal populations that contained FDMOs of a particular stereochemistry.


Guided by the sequence alignment and structures of FDMOs, we identified the critical residues implicated in the control of stereoselectivity. However, limited by the horizontal method based on site-directed mutagenesis within the set of extant proteins, TropB, AfoD, and AzaH, further studies were necessary to fully understand the stereocontrol mechanism in the FDMO family. Thus, we adopted an alternative strategy based on ASR to investigate the evolution of stereoselectivity of the family, providing an example of investigating the mechanistic origin of stereoselectivity. The platform of resurrecting a library of ancestral enzymes from relevant but stereodivergent enzyme pairs, visualizing the evolutionary pathways, and identifying the key residue exchanges in conjunction with the development of a higher throughput method for enantioselectivity determination, enabled an evolutionary study of FDMO stereoselectivity. Experiments indicated that the facial selectivity of this FDMO family can be controlled primarily by a small number of substitutions around the active site and the selectivity remains tunable, as evidenced by multiple switches observed along the evolutionary pathways to TropB/AfoD, to afford products with similar scaffolds but opposite absolute configuration.
In conclusion, we demonstrated the utility of ASR in the mechanistic study of enzyme stereoselectivity, which can be further expanded to other investigations involving chemoselectivity, site-selectivity, atroposelectivity, and different enzymatic functions. We envision that the approach herein will be useful in deciphering more sequence–structure–function relationships, providing a robust strategy and a solid foundation for future protein engineering.

Materials and Methods

Site-Directed Mutagenesis.

Substitutions were generated by site-directed mutagenesis on respective templates. Of note, 25 μL PCR reaction mixtures contained 5 μL 5× Phusion® HF Reaction Buffer, 2 ng/μL template plasmid, 0.2 μM primer, 400 μM dNTPs, 1 unit Phusion® HF DNA Polymerase, and 1% dimethylsulfoxide (DMSO). Amplification was accomplished with the following PCR protocol: 96 °C denaturation for 5 min, 10 cycles of 96 °C for 30 s, Tm (−0.5 °C/cycle) for 30 s, 72 °C for 4 min, 35 cycles of 96 °C for 30 s, Tm −5 °C for 30 s, 72 °C for 4 min, and a final 72 °C extension for 15 min. This was followed by a 10 μL digestion containing 1 μL New England Biolabs (NEB) CutSmart or rCutSmart buffer, 8 μL PCR mixture and 20 units DpnI. The reaction was incubated at 37 °C overnight and was then used to transform chemically competent Escherichia coli DH5α cells. The templates and primers are specified and listed in SI Appendix.
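The annealing temperatures implied by the two-stage PCR protocol can be tabulated with a short sketch. It assumes that the first of the 10 touchdown cycles anneals at Tm itself and that each later cycle drops by 0.5 °C, so that the touchdown stage leads directly into the fixed Tm −5 °C stage; the function name and arguments are hypothetical and not part of the published protocol.

```python
def touchdown_annealing_schedule(tm_c, touchdown_cycles=10, step_c=0.5,
                                 fixed_cycles=35, fixed_offset_c=5.0):
    """Return the annealing temperature (deg C) for every PCR cycle.

    Assumption: cycle 1 anneals at Tm and each subsequent touchdown cycle
    is 0.5 deg C lower; the remaining cycles anneal at Tm - 5 deg C.
    """
    touchdown = [tm_c - step_c * i for i in range(touchdown_cycles)]
    fixed = [tm_c - fixed_offset_c] * fixed_cycles
    return touchdown + fixed


# Example with a hypothetical primer Tm of 60 deg C.
schedule = touchdown_annealing_schedule(60.0)
print(len(schedule), schedule[0], schedule[9], schedule[10])
```

Under this reading, the last touchdown cycle anneals at Tm −4.5 °C and the following 35 cycles at Tm −5 °C.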

Small-Scale Protein Production in 96-Well Plates.

Chemically competent E. coli strain BL21(DE3) cells were transformed with respective plasmids using standard protocols. The transformed cells were used to prepare glycerol stocks and directly inoculate 400 μL LB media containing 100 μg/mL ampicillin or 50 μg/mL kanamycin in a 96-well plate and grown overnight at 37 °C, 350 rpm. The overnight cultures (50 μL each) were used to inoculate 400 μL TB media containing 100 μg/mL ampicillin or 50 μg/mL kanamycin. The cultures were incubated at 37 °C, 350 rpm for 3 h, cooled to 16 °C, 350 rpm for ~1 h, induced with 0.1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG) at 16 °C, and then incubated at 16 °C, 350 rpm for 18 h. After overnight expression, the cultures in the 96-well plate were harvested by centrifugation at 2,000 rpm at 4 °C for 10 min. Plates containing harvested cells were stored at −80 °C before lysis (see SI Appendix for details).

Large-Scale Protein Production and Purification.

Chemically competent E. coli strain BL21(DE3) cells were transformed with respective plasmids using standard protocols and grown on LB agar plates containing 100 μg/mL ampicillin or 50 μg/mL kanamycin overnight at 37 °C. A single colony was used to inoculate 10 mL LB media containing 100 μg/mL ampicillin or 50 μg/mL kanamycin. The culture was incubated overnight at 37 °C, 200 rpm. The overnight culture was used to prepare a glycerol stock and inoculate a 2.8-L flask with 1 L TB media containing 4% glycerol (v/v) and 100 μg/mL ampicillin or 50 μg/mL kanamycin. The 1-L culture was incubated at 37 °C, 200 rpm until reaching an optical density at 600 nm (OD600) of ~0.8 and cooled to 16 °C, 200 rpm for 1 h. IPTG was added to 0.1 mM to induce overexpression. The culture was incubated at 16 °C, 200 rpm for 18 h and harvested by centrifugation at 13,881 × g for 30 min. Harvested cells were stored at −80 °C before lysis and purification. Harvested cell pellets from overexpression were resuspended in 40 mL lysis buffer [50 mM Tris-HCl pH 7.8, 300 mM NaCl, 10 mM imidazole, and 10% (v/v) glycerol] containing 1 mg/mL lysozyme, 100 units of DNase I, 0.1 mM FAD, and 1 mM phenylmethylsulfonyl fluoride (PMSF), incubated on a rocker at 4 °C for 45 min, and lysed by sonication for 100 s total in cycles of 20 s on and 50 s off. Insoluble material was removed by centrifugation (46,413 × g for 30 min). The clarified lysate was combined with equilibrated Ni-NTA resin (3 mL bed volume) and incubated on a rocker at 4 °C for 1 h. The resin was collected in a gravity-flow column and washed with 50 mL lysis buffer containing 10 to 40 mM imidazole. Enriched His-tagged proteins were eluted from the resin with up to 10 mL lysis buffer containing 400 mM imidazole. 
The eluted proteins were concentrated and exchanged into storage buffer [50 mM Tris-HCl pH 7.8, 100 mM NaCl, 10% (v/v) glycerol] using a Cytiva PD-10 desalting column, flash frozen in liquid nitrogen, and stored at −80 °C for future use. For SDS–PAGE sample preparation, the 400 mM imidazole and storage buffer fractions were diluted 20- and 40-fold, respectively, before being mixed with 2× Laemmli sample buffer. For the purification of AfoD for crystallography, the clarified cell lysate was first purified by loading onto a 5 mL HisTrap column (GE Healthcare) on an ÄKTA FPLC (GE Healthcare) and further purified by size exclusion chromatography (HiLoad 16/60 Superdex S200, GE Healthcare). Total protein concentration was quantified by the Pierce™ 660 nm protein assay using Pierce™ 660 nm Protein Assay Reagent purchased from Thermo Fisher Scientific (see SI Appendix for details).

Oligomeric State Determination of AfoD.

Separations were conducted as described in the respective purification procedures using a Sephacryl S-200 HR gel filtration column. Approximate sizes were determined relative to a calibration curve (Gel Filtration Standards Kit, Sigma). Based on the calibration curve, $y = 10^{-0.7496(x/39.88) + 3.068}$, the apparent molecular weight of AfoD was 48 kDa, indicating a monomer in solution (SI Appendix, Fig. S12).
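The calibration curve quoted above can be evaluated numerically as in the following minimal sketch. The interpretation of the symbols is an assumption: x is taken to be the elution volume in mL, 39.88 the column void volume in mL, and y the apparent molecular weight in kDa. The function names are hypothetical.

```python
import math

SLOPE = -0.7496      # slope of log10(MW) versus x/39.88, from the text
INTERCEPT = 3.068    # intercept of log10(MW), from the text
VOID_VOLUME_ML = 39.88  # assumed to be the void volume (mL)


def apparent_mw_kda(elution_volume_ml):
    """Apparent molecular weight (kDa, assumed unit) from the calibration curve."""
    return 10 ** (SLOPE * (elution_volume_ml / VOID_VOLUME_ML) + INTERCEPT)


def elution_volume_for_mw(mw_kda):
    """Invert the calibration curve: elution volume (mL) for a given MW."""
    return VOID_VOLUME_ML * (math.log10(mw_kda) - INTERCEPT) / SLOPE


# Elution volume that would correspond to the reported 48 kDa for AfoD.
ve_48 = elution_volume_for_mw(48.0)
print(round(ve_48, 1), round(apparent_mw_kda(ve_48), 1))
```

Under these assumptions, a 48 kDa species elutes at roughly 1.85 void volumes.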

Crystallization, Structure Determination, and Refinement of AfoD.

Diffraction-quality crystals of WT AfoD were grown by sitting-drop vapor diffusion at 20 °C from a solution containing 1 μL protein stock and 1 μL reservoir solution [0.2 M (NH4)2SO4, 0.1 M MES:NaOH pH 6.5, 30% w/v PEG 5,000 MME]. Crystals were flash cooled in liquid nitrogen without additional cryoprotection. Diffraction data were recorded at GM/CA beamline 23-ID-B at the Advanced Photon Source. Raw data were integrated with XDS (61) and scaled in Aimless (62) in the CCP4 suite (62). The structure was solved by molecular replacement using Phaser (63) with flavin-dependent TropB (PDB ID 6NES) (29) as a search model. Refinement was done using Phenix (64) and model building with Coot (65). The final model includes amino acids 10 to 437 in the AfoD monomer. The monomer model also includes one FAD cofactor and 279 water molecules. Figures were made with PyMol (66). The stereochemical quality of the structure was validated with MolProbity (67). See SI Appendix, Table S7, for the crystallographic summary of AfoD.

Flavin Incorporation Assay.

A 50 μL sample of each protein was mixed with 450 μL storage buffer and centrifuged using 30-kDa MW cutoff centrifuge tubes. The centrifugation step was repeated three times to remove free flavins in solution. The samples were assayed by the Pierce™ 660 nm protein assay and diluted to ~20 μM in storage buffer. An 8 μL aliquot of fresh 10% (w/v) sodium dodecyl sulfate in storage buffer was added to each 200 μL solution and mixed as the denatured group. An 8 μL aliquot of storage buffer was added to another 200 μL solution and mixed as the control group. The samples were incubated at room temperature for 10 min and transferred to a 96-well plate for absorbance measurement in a SpectraMax M5 UV-Vis microplate reader using the pathlength check. Absorbance spectra were recorded from 300 to 700 nm in 2-nm increments (SI Appendix, Fig. S8). The absorbance at 450 nm for the denatured enzymes and the extinction coefficient of free FAD (11,300 M−1 cm−1) were used to calculate the concentration of FAD in each protein sample using Beer’s law. FAD incorporation was determined as the ratio of the FAD concentration to the protein concentration (SI Appendix, Table S4) (68).
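The Beer's law calculation described above can be sketched as follows. The extinction coefficient is the value given in the text; the absorbance and protein concentration in the example are made-up numbers, the function names are hypothetical, and the optional dilution factor (for example 208/200 for the added 8 μL) is an assumption, since the text does not state whether such a correction was applied.

```python
FAD_EXTINCTION_PER_M_PER_CM = 11_300.0  # free FAD at 450 nm, from the text


def fad_concentration_um(a450, pathlength_cm=1.0, dilution_factor=1.0):
    """FAD concentration (uM) from A450 via Beer's law, c = A / (eps * l).

    pathlength_cm defaults to 1.0, assuming the plate reader's pathlength
    check has already normalized the absorbance to a 1 cm path.
    """
    molar = a450 / (FAD_EXTINCTION_PER_M_PER_CM * pathlength_cm)
    return molar * 1e6 * dilution_factor


def fad_incorporation(a450, protein_um, pathlength_cm=1.0, dilution_factor=1.0):
    """Ratio of FAD concentration to protein concentration."""
    return fad_concentration_um(a450, pathlength_cm, dilution_factor) / protein_um


# Hypothetical reading: A450 = 0.113 for a sample containing 20 uM protein.
example_ratio = fad_incorporation(0.113, protein_um=20.0)
print(round(example_ratio, 2))
```

With these illustrative inputs the FAD concentration is about 10 μM, so roughly half of the protein carries a flavin.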

Protein Thermal Stability Determined by Differential Scanning Fluorimetry.

Protein samples were diluted to 4,000 μg/mL using MOPS buffer [50 mM 3-(N-morpholino)propanesulfonic acid (MOPS) pH 7.8, 100 mM NaCl, 10% (v/v) glycerol]. SYPRO™ orange protein gel stain (5,000× concentrate in DMSO) was diluted to 500× in MOPS buffer. Then, 0.8 μL of the 500× dye was mixed either with 79.2 μL MOPS buffer as the control group or with 77.2 μL MOPS buffer and 2 μL of the 4,000 μg/mL protein sample as the test group, giving a final protein concentration of 100 μg/mL. Subsequently, 20 μL of each mixture was aliquoted into PCR tubes in a 96-well rack. Triplicate samples were analyzed by fluorescence in a StepOnePlus™ RT-PCR instrument operated in standard melt curve mode with ROX as reporter and no quencher. The heating program consisted of a 2-min hold at 25 °C followed by stepwise heating with a 1 °C/min ramp from 25 to 95 °C (SI Appendix, Fig. S9). The melting temperature (Tm) was calculated as the temperature corresponding to the maximum of the first derivative of the normalized fluorescence signal using DSFworld (SI Appendix, Table S5) (69).
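The Tm definition used here, the temperature at the maximum of the first derivative of the normalized fluorescence, can be illustrated with a minimal sketch. This is not the DSFworld implementation; it uses a simple central-difference derivative without smoothing, and the melt curve is synthetic (a two-state sigmoid with a hypothetical midpoint of 52 °C).

```python
import math


def melting_temperature(temps_c, fluorescence):
    """Temperature at the maximum first derivative of min-max normalized signal."""
    lo, hi = min(fluorescence), max(fluorescence)
    norm = [(f - lo) / (hi - lo) for f in fluorescence]
    best_temp, best_slope = None, float("-inf")
    # Central differences on interior points only.
    for i in range(1, len(temps_c) - 1):
        slope = (norm[i + 1] - norm[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps_c[i], slope
    return best_temp


# Synthetic melt curve sampled every 1 deg C from 25 to 95 deg C.
temps = list(range(25, 96))
signal = [1.0 / (1.0 + math.exp((52 - t) / 2.0)) for t in temps]
tm_estimate = melting_temperature(temps, signal)
print(tm_estimate)
```

Real DSF traces usually need smoothing and exclusion of the post-transition fluorescence decay before the derivative is taken; both are omitted here.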

NADPH Consumption Assay.

A fluorescence detection method for NADPH was developed. The NADPH consumption rates of the samples with only NADPH and NADPH/enzyme mixtures were acquired and compared with the theoretical NADPH consumption rate deduced from the substrate consumption under the condition of the normal oxidative dearomatization reaction (see SI Appendix for details).


A set of 500 sequences was obtained by performing three iterations of PSI-BLAST on the National Center for Biotechnology Information (NCBI) site using the AfoD sequence as a query. Sequences were clustered and filtered using CD-HIT (70), resulting in a set of 276 sequences in which no sequence is related to another by over 90% sequence identity. The sequences were initially aligned using MUSCLE (71). Since the bacterial root sequence, PHBH, is distantly related to the fungal species and 3D structural information is available, the PHBH sequence was aligned to the rest of the 276 sequences using the structural alignment program MultiSeq (72). Multiple sequence alignments (MSAs) often contain errors, which increase in magnitude as the sequence identity decreases. These errors can then be propagated to the phylogenetic tree and the inferred ancestral protein sequences. Thus, we utilized the program MEME (73), an expectation maximization algorithm, to identify recurring, fixed-length patterns (motifs) in unaligned sequences that can then be visualized in GeneDoc to make several minor, yet crucial, corrections to the MSA (74, 75). The MSA and the identified motifs are useful in their own right when annotated on a representative crystal structure, revealing the degree of conservation in critical areas such as the active site and cofactor-binding sites. The ML phylogenetic tree was constructed using a trimmed alignment that consisted primarily of the top-15 scoring motifs described above and the LG (76) +I+Γ8 substitution model (I = invariant, Γ8 = 8 rate categories approximating a gamma distribution), which ProtTest (77, 78) identified as the most appropriate evolutionary model. Ten starting phylogenetic trees were constructed, each using a different random seed. The tree with the best ML score was used for subtree pruning and regrafting moves implemented in the PhyML 3.0 package (79). Branch supports were estimated using the approximate likelihood ratio test.
The ML ancestral flavin-dependent monooxygenase sequences were reconstructed with the complete multiple sequence alignment and ML phylogenetic tree as input to PAML version 4.8 (80). Within insertion/deletion regions, the maximum parsimony Fitch algorithm was used to determine whether a gap should be placed at a specific position. The “AltAll” procedure (57) was used to infer an alternative ancestral sequence. This procedure replaces the most likely residue with the second most likely residue when the posterior probability difference between the two possibilities is 0.2 or less. Classifying the AltAll sequence establishes the robustness of the ASR.
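The AltAll rule as stated above can be sketched in a few lines. The sketch assumes that per-site posterior probabilities are already available as dictionaries mapping residues to probabilities (parsing of PAML output is not shown), it ignores gap handling, and the example posteriors and the function name are made up for illustration.

```python
def ml_and_altall(site_posteriors, max_difference=0.2):
    """Return (ML sequence, AltAll sequence) from per-site posteriors.

    At each site the most likely residue goes into the ML sequence. In the
    AltAll sequence it is replaced by the second most likely residue when
    the posterior probability difference between the two is <= max_difference.
    """
    ml_residues, alt_residues = [], []
    for posteriors in site_posteriors:
        ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
        best_residue, best_prob = ranked[0]
        ml_residues.append(best_residue)
        if len(ranked) > 1 and best_prob - ranked[1][1] <= max_difference:
            alt_residues.append(ranked[1][0])
        else:
            alt_residues.append(best_residue)
    return "".join(ml_residues), "".join(alt_residues)


# Hypothetical three-site ancestor: sites 1 and 3 are ambiguous, site 2 is not.
example_sites = [
    {"F": 0.55, "Y": 0.40, "L": 0.05},
    {"A": 0.90, "G": 0.10},
    {"Q": 0.50, "R": 0.45, "K": 0.05},
]
ml_seq, alt_seq = ml_and_altall(example_sites)
print(ml_seq, alt_seq)
```

The AltAll sequence therefore differs from the ML sequence only at ambiguously reconstructed sites, which is what makes it a useful worst-case control.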

Model Substrate pKa Determination.

UV-Vis spectra of the model substrate 10 at different pH values were acquired to determine the pKa of the substrate, exploiting the fact that the protonated and deprotonated forms (phenol and phenolate) have different absorption. The model substrate was dissolved in 50 mM potassium phosphate buffers from pH 4.6 to 9.7. The absorbance of the phenol at 292 nm and of the phenolate at 336 nm was plotted against the pH, and the data were fitted to a sigmoidal function to calculate the pKa of the model substrate (SI Appendix, Figs. S14 and S15).
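One way to carry out such a sigmoidal fit is sketched below, assuming a single-site Henderson-Hasselbalch model, A(pH) = A_acid·(1 − f) + A_base·f with f = 1/(1 + 10^(pKa − pH)). The exact fitting function used in the study is not given here, so this model, the grid-search approach, and the synthetic data (generated with a pKa of 7.0) are all assumptions for illustration.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch fraction of phenolate at a given pH."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


def fit_pka(ph_values, absorbances, pka_min=4.0, pka_max=10.0, step=0.01):
    """Grid-search pKa; plateau absorbances are solved by linear least squares."""
    best = None
    n_steps = int(round((pka_max - pka_min) / step))
    for k in range(n_steps + 1):
        pka = pka_min + k * step
        f = [fraction_deprotonated(ph, pka) for ph in ph_values]
        g = [1.0 - x for x in f]
        # Normal equations for A = a_acid * g + a_base * f.
        sgg = sum(x * x for x in g)
        sff = sum(x * x for x in f)
        sgf = sum(x * y for x, y in zip(g, f))
        sga = sum(x * a for x, a in zip(g, absorbances))
        sfa = sum(x * a for x, a in zip(f, absorbances))
        det = sgg * sff - sgf * sgf
        if abs(det) < 1e-12:
            continue
        a_acid = (sga * sff - sfa * sgf) / det
        a_base = (sfa * sgg - sga * sgf) / det
        sse = sum((a - (a_acid * gi + a_base * fi)) ** 2
                  for a, gi, fi in zip(absorbances, g, f))
        if best is None or sse < best[0]:
            best = (sse, pka, a_acid, a_base)
    return best[1], best[2], best[3]


# Synthetic phenolate absorbance (e.g. at 336 nm) across the buffer pH range.
ph_points = [4.6, 5.2, 5.8, 6.4, 7.0, 7.6, 8.2, 8.8, 9.7]
a336 = [0.1 + 0.8 * fraction_deprotonated(ph, 7.0) for ph in ph_points]
pka_fit, a_acid_fit, a_base_fit = fit_pka(ph_points, a336)
print(round(pka_fit, 2))
```

The same routine applies to the phenol band at 292 nm, where the absorbance falls rather than rises with pH; the fitted plateaus simply swap order.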

Analytical-Scale Enzymatic Reactions.

Analytical-scale reactions were performed using either purified enzymes or clarified cell lysates. For reactions using the purified enzymes, each reaction contained 10 μM enzyme, 50 mM potassium phosphate (KPi) buffer pH 8.0, 1.0 (or 2.5) mM substrate 10, 0.4 (or 1.0) mM NADP+, 0.8 (or 2.0) U/mL glucose-6-phosphate dehydrogenase (G6PDH), 2.0 (or 5.0) mM glucose-6-phosphate (G6P), and deionized water to a final volume of 75 (or 100) μL. Reactions were carried out at 30 °C for 1 h and quenched with the addition of 225 (or 300) μL methanol containing 5 mM pentamethylbenzene as an internal standard (IS). For reactions using the clarified cell lysates, cell pellets were thawed after removal from a −80 °C freezer. Three more freeze–thaw cycles were applied to induce cell lysis. The cell pellets in each well were resuspended in 450 μL lysis buffer [50 mM Tris-HCl pH 7.8, 300 mM NaCl, and 10% (v/v) glycerol] containing 1 mg/mL lysozyme, 100 units DNase I, and 1 mM PMSF, and were incubated at 25 °C for 2 h. The cell suspensions were centrifuged at 2,000 rpm at 4 °C for 20 min before use. Each reaction contained 50 μL cell lysate, 50 mM KPi buffer pH 8.0 (3.75 μL, 1 M stock), 1.0 mM substrate (1.5 μL of a 50 mM stock solution in DMSO), 0.4 mM NADP+ (0.3 μL, 100 mM), 0.8 U/mL G6PDH (0.6 μL, 100 U/mL), 2.0 mM G6P (0.3 μL, 500 mM), 2.25 μL DMSO, and deionized water to a final volume of 75 μL. Reactions were carried out at 30 °C for 1 h and quenched with the addition of 225 μL methanol containing 5 mM pentamethylbenzene as an IS. Quenched reaction mixtures were filtered through Pall AcroPrep Advance 350 μL 0.2 μm PTFE 96-well filter plates by centrifugation at 2,000 rpm for 15 min, and the filtrates were analyzed by validated methods on UPLC-DAD-CD (SI Appendix, Figs. S16–S20 and Tables S8–S12). Percentage conversions were quantified by comparing the ratio Area@300 nm(substrate)/Area@273 nm(IS) for each enzyme with that of the no-enzyme control. The enantioselectivity was determined from peaks of positive or negative activity at 230 nm and was quantified by comparing the ratio Height@230 nm(product)/Area@300 nm(product) with that of AzaH or AfoD, taken as 99.9% (see SI Appendix for details).
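The two quantification steps can be sketched as below. The conversion formula follows directly from the description (substrate remaining relative to the no-enzyme control, normalized to the internal standard). The enantioselectivity function is only one possible reading: it assumes the CD-height-to-UV-area ratio scales linearly with enantiopurity and that the reference enzyme defines 99.9%; the validated method is described in SI Appendix and is not reproduced here. All function names and peak values are hypothetical.

```python
def percent_conversion(sub_area_300, is_area_273,
                       ctrl_sub_area_300, ctrl_is_area_273):
    """Percent substrate consumed relative to the no-enzyme control."""
    sample_ratio = sub_area_300 / is_area_273
    control_ratio = ctrl_sub_area_300 / ctrl_is_area_273
    return (1.0 - sample_ratio / control_ratio) * 100.0


def scaled_enantiopurity(cd_height_230, uv_area_300,
                         ref_cd_height_230, ref_uv_area_300,
                         ref_value=99.9):
    """Scale the sample CD/UV ratio to a reference enzyme set at ref_value.

    Assumption: linear scaling. A negative result means the sample favors
    the enantiomer opposite to the reference enzyme's product.
    """
    sample_ratio = cd_height_230 / uv_area_300
    reference_ratio = ref_cd_height_230 / ref_uv_area_300
    return ref_value * sample_ratio / reference_ratio


# Hypothetical peak data.
conversion = percent_conversion(25.0, 100.0, 100.0, 100.0)
purity = scaled_enantiopurity(0.5, 1.0, 1.0, 1.0)
print(conversion, round(purity, 2))
```

Normalizing to the internal standard cancels well-to-well differences in injection volume and filtration losses, which is why both ratios, not raw areas, are compared.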

Protein Structural Model Generation and Substrate Docking.

The standard AlphaFold (51) pipeline, with the full-size genetic database and the default number of template hits, was used to generate structural models for the given input sequences. The model generation step used the monomer model preset and standard Amber relaxation constants, and the top-ranked AlphaFold model out of the five generated models was used. The AlphaFold models were then superposed using TM-align (81) onto the QM/MM (DFTB3/MM)-refined chain A of RCSB PDB 6NES docked with 3-methylorcinaldehyde (3MO) (29). The superposed structures were represented with the CHARMM36 (82) general force field. The structures were then minimized in vacuum with harmonic restraints on all heavy atoms and 1,000 steps of steepest descent. Next, the FAD cofactor present in the QM/MM (DFTB3/MM)-relaxed structure of TropB was added to the models, with FAD parameterized with the CGenFF force field (83). The model structure with FAD had harmonic restraints placed on all backbone heavy atoms and was minimized for 200 steps using the steepest descent minimizer in CHARMM (82). Next, all heavy atoms except FAD and side chains within 5 Å were minimized with 200 steps of steepest descent. Lastly, harmonic restraints were placed on all heavy atoms excluding FAD, and 200 steps of steepest descent minimization were performed, followed by 1,000 steps of Adopted Basis Newton-Raphson (ABNR) minimization. All minimizations had a TOLENR of 0.001, and all harmonic restraints used a force constant of 50 kcal/mol/Å2. In-house FAD-incorporated AfoD and AzaH models were generated and compared with their entries in the AlphaFold structure database (SI Appendix, Table S21). The models are freely available on GitHub (84).
Ligands used in the assays were docked into the structural models generated as described above using FFTDock (58). The grid for FFTDock was centered at the average coordinates of 3MO docked in the QM/MM-refined structure of TropB using flexible CDOCKER (29). The grid max length was 6 Å, creating a cube with sides of 12 Å. The grid was generated with soft potentials, with an EMAX of 2 kcal/mol, a MINE of −20 kcal/mol, and a MAXE of 40 kcal/mol. Probe atoms with van der Waals radii ranging from 0.225 to 2.300 Å, a grid spacing of 0.5 Å, a distance-dependent dielectric constant of 3, and protein and FAD atoms were used in grid generation. For docking with the grid, the ligand protomer was represented with the CGenFF force field (83), and a set of 36,000 quaternions was used in the rotational search. From FFTDock, the top 500 docked poses were used for protein-based minimization. Protein-based minimization used the explicit all-atom representation of the model and FAD cofactor with 50 steps of steepest descent (SD) followed by 1,000 steps of ABNR, a TOLENR of 0.001, and a distance-dependent dielectric of 1. The lowest-energy protein-minimized pose was chosen as the docked conformation.
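The grid geometry and the final pose selection can be illustrated with a minimal sketch. It is not FFTDock or CHARMM code: it only shows how a cube with 12 Å sides and 0.5 Å spacing could be laid out around a center, and how the lowest-energy pose would be picked from a list of minimized poses. Whether the docking program counts grid points inclusive of both faces is an assumption here, and the center, pose labels, and energies are invented.

```python
def grid_axis(center, half_length=6.0, spacing=0.5):
    """Grid coordinates along one axis, including both cube faces (assumed)."""
    n_points = int(round(2.0 * half_length / spacing)) + 1
    return [center - half_length + i * spacing for i in range(n_points)]


def build_grid(center_xyz, half_length=6.0, spacing=0.5):
    """All (x, y, z) grid points of a cube centered on center_xyz."""
    xs, ys, zs = (grid_axis(c, half_length, spacing) for c in center_xyz)
    return [(x, y, z) for x in xs for y in ys for z in zs]


def lowest_energy_pose(minimized_poses):
    """Pick the (pose_id, energy) pair with the lowest energy."""
    return min(minimized_poses, key=lambda pose: pose[1])


# Hypothetical grid center and three hypothetical minimized poses (kcal/mol).
grid = build_grid((0.0, 0.0, 0.0))
best_pose = lowest_energy_pose([("pose_1", -3.2), ("pose_2", -7.5), ("pose_3", -1.0)])
print(len(grid), best_pose[0])
```

With inclusive faces this gives 25 points per axis, so the grid holds 25³ points per probe type.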

Data, Materials, and Software Availability

AfoD and AzaH model data have been deposited in GitHub (84). All other data are included in the manuscript and/or SI Appendix.


This research was supported by funds from the University of Michigan Life Sciences Institute, the University of Michigan Department of Chemistry, the NIH R35 GM130587 (C.L.B.), R35 GM124880 (A.R.H.N.), and R01 DK042303 (J.L.S.), and the Margaret J. Hunter Professorship (J.L.S.). C.-H.C. was supported by the Rackham International Student Fellowship/Chia-Lun Lo Fellowship. A.R.B. was supported by the NIH Chemistry Biology Interface Training Grant (T32 GM008597). This work was supported by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, which is supported by the Office of Science of the U.S. Department of Energy operated under Contract No. DE-AC02-05CH11231. We thank Prof. Yi Tang from the University of California, Los Angeles, for providing a plasmid containing azaH, Dr. Joshua Pyser and Dr. Ye Wang for assistance with the synthesis of substrates, and Tessa Epstein for help with the acquisition of protein thermostability data using differential scanning fluorimetry.

Author contributions

C.-H.C., J.L.S., C.L.B., and A.R.H.N. designed research; C.-H.C., T.W., A.R.B., and A.H. performed research; C.-H.C., T.W., A.R.B., A.H., J.L.S., C.L.B., and A.R.H.N. analyzed data; and C.-H.C. and A.R.H.N. wrote the paper.

Competing interests

The authors declare no competing interest.

Supporting Information

Appendix 01 (PDF)


H. Yu et al., Azaphilone derivatives from the fungus Coniella fragariae inhibit NF-κB activation and reduce tumor cell migration. J. Nat. Prod. 81, 2493–2500 (2018).
K. Matsuzaki et al., New brominated and halogen-less derivatives and structure-activity relationship of azaphilones inhibiting gp120-CD4 binding. J. Antibiot. (Tokyo) 51, 1004–1011 (1998).
D. Chen et al., Sclerotiorin inhibits protein kinase G from Mycobacterium tuberculosis and impairs mycobacterial growth in macrophages. Tuberculosis (Edinb) 103, 37–43 (2017).
J. L. Tang et al., Azaphilone alkaloids with anti-inflammatory activity from fungus Penicillium sclerotiorum cib-411. J. Agric. Food Chem. 67, 2175–2182 (2019).
J. M. Gao, S. X. Yang, J. C. Qin, Azaphilones: Chemistry and biology. Chem. Rev. 113, 4755–4811 (2013).
C. Chen et al., Recent advances in the chemistry and biology of azaphilones. RSC Adv. 10, 10197–10220 (2020).
A. A. Stierle et al., Azaphilones from an acid mine extremophile strain of a Pleurostomophora sp. J. Nat. Prod. 78, 2917–2923 (2015).
E. Kuhnert et al., Lenormandins A–G, new azaphilones from Hypoxylon lenormandii and Hypoxylon jaklitschii sp. nov., recognised by chemotaxonomic data. Fungal Divers. 71, 165–184 (2015).
S. Yi, S. Wei, Q. Wu, H. Wang, Z. J. Yao, Azaphilones as activation-free primary-amine-specific bioconjugation reagents for peptides, proteins and lipids. Angew. Chem. Int. Ed. Engl. 61, e202111783 (2022).
K. Kaur et al., The fungal natural product azaphilone-9 binds to HuR and inhibits HuR-RNA interaction in vitro. PLoS One 12, e0175471 (2017).
J. Zhu, N. P. Grigoriadis, J. P. Lee, J. A. Porco Jr., Synthesis of the azaphilones using copper-mediated enantioselective oxidative dearomatization. J. Am. Chem. Soc. 127, 9342–9343 (2005).
S. P. Roche, J. A. Porco Jr., Dearomatization strategies in the synthesis of complex natural products. Angew. Chem. Int. Ed. Engl. 50, 4068–4093 (2011).
S. A. B. Dockrey, A. L. Lukowski, M. R. Becker, A. R. H. Narayan, Biocatalytic site- and enantioselective oxidative dearomatization of phenols. Nat. Chem. 10, 119–125 (2018).
J. Zhu, A. R. Germain, J. A. Porco Jr., Synthesis of azaphilones and related molecules by employing cycloisomerization of o-alkynylbenzaldehydes. Angew. Chem. Int. Ed. Engl. 43, 1239–1243 (2004).
M. Makrerougras et al., Total synthesis and structural revision of chaetoviridins A. Org. Lett. 19, 4146–4149 (2017).
H. Kang, C. Torruellas, J. Liu, M. C. Kozlowski, Total synthesis of chaetoglobin A via catalytic, atroposelective oxidative phenol coupling. Org. Lett. 20, 5554–5558 (2018).
R. Chong, R. W. Gray, R. R. King, W. B. Whalley, The synthesis of (±) mitorubrin. J. Chem. Soc. D: Chem. Commun., 101a (1970).
K. C. Nicolaou et al., Biomimetic total synthesis of bisorbicillinol, bisorbibutenolide, trichodimerol, and designed analogues of the bisorbicillinoids. J. Am. Chem. Soc. 122, 3071–3079 (2000).
M. M. Huijbers, S. Montersino, A. H. Westphal, D. Tischler, W. J. van Berkel, Flavin dependent monooxygenases. Arch. Biochem. Biophys. 544, 2–17 (2014).
E. D. Amato, J. D. Stewart, Applications of protein engineering to members of the old yellow enzyme family. Biotechnol. Adv. 33, 624–631 (2015).
S. K. Padhi, D. J. Bougioukou, J. D. Stewart, Site-saturation mutagenesis of tryptophan 116 of Saccharomyces pastorianus old yellow enzyme uncovers stereocomplementary variants. J. Am. Chem. Soc. 131, 3271–3280 (2009).
M. Hall, C. Stueckler, W. Kroutil, P. Macheroux, K. Faber, Asymmetric bioreduction of activated alkenes using cloned 12-oxophytodienoate reductase isoenzymes OPR-1 and OPR-3 from Lycopersicon esculentum (tomato): A striking change of stereoselectivity. Angew. Chem. Int. Ed. Engl. 46, 3934–3937 (2007).
J. B. Pyser et al., Stereodivergent, chemoenzymatic synthesis of azaphilone natural products. J. Am. Chem. Soc. 141, 18551–18559 (2019).
J. Davison et al., Genetic, molecular, and biochemical basis of fungal tropolone biosynthesis. Proc. Natl. Acad. Sci. U.S.A. 109, 7642–7647 (2012).
A. Abood et al., Kinetic characterisation of the FAD dependent monooxygenase TropB and investigation of its biotransformation potential. RSC Adv. 5, 49987–49995 (2015).
A. O. Zabala, W. Xu, Y. H. Chooi, Y. Tang, Characterization of a silent azaphilone gene cluster from Aspergillus niger ATCC 1015 reveals a hydroxylation-mediated pyran-ring formation. Chem. Biol. 19, 1049–1059 (2012).
Y. M. Chiang et al., A gene cluster containing two fungal polyketide synthases encodes the biosynthetic pathway for a polyketide, asperfuranone, in Aspergillus nidulans. J. Am. Chem. Soc. 131, 2965–2970 (2009).
S. A. B. Dockrey et al., Positioning-group-enabled biocatalytic oxidative dearomatization. ACS Cent. Sci. 5, 1010–1016 (2019).
A. R. Benitez et al., Structural basis for selectivity in flavin-dependent monooxygenase-catalyzed oxidative dearomatization. ACS Catal. 9, 3633–3640 (2019).
M. J. Harms, J. W. Thornton, Analyzing protein structure and function using ancestral gene reconstruction. Curr. Opin. Struct. Biol. 20, 360–366 (2010).
G. K. A. Hochberg, J. W. Thornton, Reconstructing ancient proteins to understand the causes of structure and function. Annu. Rev. Biophys. 46, 247–269 (2017).
J. W. Thornton, Resurrecting ancient genes: Experimental analysis of extinct molecules. Nat. Rev. Genet. 5, 366–375 (2004).
R. Merkl, R. Sterner, Ancestral protein reconstruction: Techniques and applications. Biol. Chem. 397, 1–21 (2016).
T. Zou, V. A. Risso, J. A. Gavira, J. M. Sanchez-Ruiz, S. B. Ozkan, Evolution of conformational dynamics determines the conversion of a promiscuous generalist into a specialist enzyme. Mol. Biol. Evol. 32, 132–143 (2015).
T. Devamani et al., Catalytic promiscuity of ancestral esterases and hydroxynitrile lyases. J. Am. Chem. Soc. 138, 1046–1056 (2016).
V. A. Risso, J. M. Sanchez-Ruiz, S. B. Ozkan, Biotechnological and protein-engineering implications of ancestral protein resurrection. Curr. Opin. Struct. Biol. 51, 106–115 (2018).
L. C. Wheeler, S. A. Lim, S. Marqusee, M. J. Harms, The thermostability and specificity of ancient proteins. Curr. Opin. Struct. Biol. 38, 37–43 (2016).
Y. Gumulya et al., Engineering highly functional thermostable proteins using ancestral sequence reconstruction. Nat. Catal. 1, 878–888 (2018).
K. Schriever et al., Engineering of ancestors as a tool to elucidate structure, mechanism, and specificity of extant terpene cyclase. J. Am. Chem. Soc. 143, 3794–3807 (2021).
C. R. Nicoll et al., Ancestral-sequence reconstruction unveils the structural basis of function in mammalian FMOs. Nat. Struct. Mol. Biol. 27, 14–24 (2020).
T. N. Starr, J. W. Thornton, Epistasis in protein evolution. Protein Sci. 25, 1204–1218 (2016).
C. M. Miton, N. Tokuriki, How mutational epistasis impairs predictability in protein evolution and design. Protein Sci. 25, 1260–1272 (2016).
F. Baier et al., Cryptic genetic variation shapes the adaptive evolutionary potential of enzymes. eLife 8 (2019).
J. T. Bridgham, S. M. Carroll, J. W. Thornton, Evolution of hormone-receptor complexity by molecular exploitation. Science 312, 97–101 (2006).
J. I. Boucher, J. R. Jacobowitz, B. C. Beckett, S. Classen, D. L. Theobald, An atomic-resolution view of neofunctionalization in the evolution of apicomplexan lactate dehydrogenases. eLife 3, e02304 (2014).
J. A. Ugalde, B. S. Chang, M. V. Matz, Evolution of coral pigments recreated. Science 305, 1433 (2004).
K. M. Hart et al., Thermodynamic system drift in protein evolution. PLoS Biol. 12, e1001994 (2014).
D. Ribeaucourt et al., Tunable production of (R)- or (S)-citronellal from geraniol via a bienzymatic cascade using a copper radical alcohol oxidase and old yellow enzyme. ACS Catal. 12, 1111–1116 (2022).
X. Li, R. Shimaya, T. Dairi, W. C. Chang, Y. Ogasawara, Identification of cyclopropane formation in the biosyntheses of hormaomycins and belactosins: Sequential nitration and cyclopropanation by metalloenzymes. Angew. Chem. Int. Ed. Engl. 61, e202113189 (2022).
S. Shimo et al., Stereodivergent nitrocyclopropane formation during biosynthesis of belactosins and hormaomycins. J. Am. Chem. Soc. 143, 18413–18418 (2021).
J. Jumper et al., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
G. R. Moran, B. Entsch, B. A. Palfey, D. P. Ballou, Evidence for flavin movement in the function of p-hydroxybenzoate hydroxylase from studies of the mutant Arg220Lys. Biochemistry 35, 9278–9285 (1996).
J. A. Gerlt, P. C. Babbitt, Enzyme (re)design: Lessons from natural evolution and computation. Curr. Opin. Chem. Biol. 13, 10–18 (2009).
B. Entsch, W. J. van Berkel, Structure and mechanism of para-hydroxybenzoate hydroxylase. FASEB J. 9, 476–483 (1995).
Z. Yang, S. Kumar, M. Nei, A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141, 1641–1650 (1995).
J. M. Koshi, R. A. Goldstein, Probabilistic reconstruction of ancestral protein sequences. J. Mol. Evol. 42, 313–320 (1996).
V. Hanson-Smith, B. Kolaczkowski, J. W. Thornton, Robustness of ancestral sequence reconstruction to phylogenetic uncertainty. Mol. Biol. Evol. 27, 1988–1999 (2010).
X. Ding, Y. Wu, Y. Wang, J. Z. Vilseck, C. L. Brooks III, Accelerated CDOCKER with GPUs, parallel simulated annealing, and fast Fourier transforms. J. Chem. Theory Comput. 16, 3910–3919 (2020).
J. D. Bloom, S. T. Labthavikul, C. R. Otey, F. H. Arnold, Protein stability promotes evolvability. Proc. Natl. Acad. Sci. U.S.A. 103, 5869–5874 (2006).
L. Ridder, A. J. Mulholland, I. M. C. M. Rietjens, J. Vervoort, A quantum mechanical/molecular mechanical study of the hydroxylation of phenol and halogenated derivatives by phenol hydroxylase. J. Am. Chem. Soc. 122, 8728–8738 (2000).
W. Kabsch, XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132 (2010).
M. D. Winn et al., Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67, 235–242 (2011).
A. J. McCoy et al., Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).
D. Liebschner et al., Macromolecular structure determination using X-rays, neutrons and electrons: Recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
P. Emsley, K. Cowtan, Coot: Model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004).
Schrödinger, LLC, The PyMOL Molecular Graphics System (Version 2.0).
V. B. Chen et al., MolProbity: All-atom structure validation for macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21 (2010).
A. Aliverti, B. Curti, M. A. Vanoni, Identifying and quantitating FAD and FMN in simple and in iron-sulfur-containing flavoproteins. Methods Mol. Biol. 131, 9–23 (1999).
T. Wu et al., Three essential resources to improve differential scanning fluorimetry (DSF) experiments (2022). Accessed 22 September 2020.
Y. Huang, B. Niu, Y. Gao, L. Fu, W. Li, CD-HIT Suite: A web server for clustering and comparing biological sequences. Bioinformatics 26, 680–682 (2010).
R. C. Edgar, MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
E. Roberts, J. Eargle, D. Wright, Z. Luthey-Schulten, MultiSeq: Unifying sequence and structure data for evolutionary analysis. BMC Bioinformatics 7, 382 (2006).
T. L. Bailey, C. Elkan, Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2, 28–36 (1994).
A. Szarecka, K. R. Lesnock, C. A. Ramirez-Mondragon, H. B. Nicholas Jr., T. Wymore, The Class D beta-lactamase family: Residues governing the maintenance and diversity of function. Protein Eng. Des. Sel. 24, 801–809 (2011).
T. Wymore, B. Y. Chen, H. B. Nicholas Jr., A. J. Ropelewski, C. L. Brooks III, A mechanism for evolving novel plant sesquiterpene synthase function. Mol. Inform. 30, 896–906 (2011).
S. Q. Le, O. Gascuel, An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320 (2008).
F. Abascal, R. Zardoya, D. Posada, ProtTest: Selection of best-fit models of protein evolution. Bioinformatics 21, 2104–2105 (2005).
D. Darriba, G. L. Taboada, R. Doallo, D. Posada, ProtTest 3: Fast selection of best-fit models of protein evolution. Bioinformatics 27, 1164–1165 (2011).
W. Hordijk, O. Gascuel, Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood. Bioinformatics 21, 4338–4347 (2005).
Z. Yang, PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Y. Zhang, J. Skolnick, TM-align: A protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).
J. Huang et al., CHARMM36m: An improved force field for folded and intrinsically disordered proteins. Nat. Methods 14, 71–73 (2017).
K. Vanommeslaeghe et al., CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 31, 671–690 (2010).
C.-H. Chiang, PNAS_model. GitHub. Deposited 27 January 2023.

Information & Authors


Published in

Proceedings of the National Academy of Sciences
Vol. 120 | No. 15
April 11, 2023
PubMed: 37014851


Data, Materials, and Software Availability

AfoD and AzaH model data have been deposited in GitHub (84). All other data are included in the manuscript and/or SI Appendix.

Submission history

Received: October 25, 2022
Accepted: March 6, 2023
Published online: April 4, 2023
Published in issue: April 11, 2023


  1. flavin-dependent monooxygenases
  2. biocatalysis
  3. ancestral sequence reconstruction
  4. oxidative dearomatization


This research was supported by funds from the University of Michigan Life Sciences Institute; the University of Michigan Department of Chemistry; NIH grants R35 GM130587 (C.L.B.), R35 GM124880 (A.R.H.N.), and R01 DK042303 (J.L.S.); and the Margaret J. Hunter Professorship (J.L.S.). C.-H.C. was supported by the Rackham International Student Fellowship/Chia-Lun Lo Fellowship. A.R.B. was supported by the NIH Chemistry Biology Interface Training Grant (T32 GM008597). This work was also supported by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility that is supported by the Office of Science of the U.S. Department of Energy operated under Contract No. DE-AC02-05CH11231. We thank Prof. Yi Tang from the University of California, Los Angeles, for providing a plasmid containing azaH; Dr. Joshua Pyser and Dr. Ye Wang for assistance with the synthesis of substrates; and Tessa Epstein for help with the acquisition of protein thermostability data using differential scanning fluorimetry.
Author Contributions
C.-H.C., J.L.S., C.L.B., and A.R.H.N. designed research; C.-H.C., T.W., A.R.B., and A.H. performed research; C.-H.C., T.W., A.R.B., A.H., J.L.S., C.L.B., and A.R.H.N. analyzed data; and C.-H.C. and A.R.H.N. wrote the paper.
Competing Interests
The authors declare no competing interest.


This article is a PNAS Direct Submission.



Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
Department of Chemistry, Stony Brook University, Stony Brook, NY 11794
Attabey Rodríguez Benítez
Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
Azam Hussain
Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor, MI 48109
Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109
Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109
Department of Biophysics, University of Michigan, Ann Arbor, MI 48109
Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
Program in Chemical Biology, University of Michigan, Ann Arbor, MI 48109


To whom correspondence may be addressed. Email: [email protected].
