A protein constructed de novo enables cell growth by altering gene regulation

Edited by Richard E. Lenski, Michigan State University, East Lansing, MI, and approved January 29, 2016 (received for review January 14, 2016)
February 16, 2016
113 (9) 2400-2405


We describe a novel regulatory protein that was discovered in a library of de novo-designed proteins never sampled by nature. This de novo protein, SynSerB3, rescues a conditionally lethal mutation in Escherichia coli cells that prevents growth in the absence of serine. We found that SynSerB3 does not catalyze the biosynthesis of serine but rather upregulates an endogenous protein, histidinol phosphate phosphatase, which can synthesize serine via a promiscuous catalytic activity. This regulatory function of SynSerB3 sustains life in E. coli cells under conditions that are otherwise lethal and is a synthetic addition to natural gene regulation.


Recent advances in protein design rely on rational and computational approaches to create novel sequences that fold and function. In contrast, natural systems selected functional proteins without any design a priori. In an attempt to mimic nature, we used large libraries of novel sequences and selected for functional proteins that rescue Escherichia coli cells in which a conditionally essential gene has been deleted. In this way, the de novo protein SynSerB3 was selected as a rescuer of cells in which serB, which encodes phosphoserine phosphatase, an enzyme essential for serine biosynthesis, was deleted. However, SynSerB3 does not rescue the deleted activity by catalyzing hydrolysis of phosphoserine. Instead, SynSerB3 up-regulates hisB, a gene encoding histidinol phosphate phosphatase. This endogenous E. coli phosphatase has promiscuous activity that, when overexpressed, compensates for the deletion of phosphoserine phosphatase. Thus, the de novo protein SynSerB3 rescues the deletion of serB by altering the natural regulation of the His operon.
One of the key goals of synthetic biology is to design novel proteins that fold and function in vivo. A particularly challenging objective would be to produce nonnatural proteins that do not merely generate interesting phenotypes but actually provide essential functions necessary for the growth of living cells. The successful design of such life-sustaining proteins would represent an initial step toward constructing artificial “proteomes” of nonnatural sequences.
We initiated work toward artificial proteomes by constructing large combinatorial libraries of novel sequences designed to fold into stable four-helix structures (1, 2). Our libraries were based on a strategy for protein design that assumes that the overall fold of a simple structure can be specified by the pattern of polar and nonpolar residues in the linear sequence. Because only the type of residue (polar versus nonpolar) is specified, this strategy has been called a “binary code” for protein design (1–3). At the same time, because the exact identities of the side chains at each polar and nonpolar position are not specified explicitly, this strategy is well suited for constructing large combinatorial libraries of novel sequences (3). To express these libraries of binary-patterned sequences in vivo, we construct collections of synthetic genes using degenerate DNA codons. For example, the degenerate codon NTN (N = A,T,C,G) is used to encode five nonpolar residues, and the degenerate codon VAN (V = A,C,G) is used to encode six polar residues.
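The residue counts quoted for the two degenerate codons can be checked by expanding each codon against the standard genetic code. The following is a minimal sketch for illustration only; the helper names `expand` and `encoded_residues` are hypothetical and are not part of the original work, and the table used is the standard genetic code rather than anything specific to these libraries.

```python
from itertools import product

# Degenerate nucleotide codes used in the text (N and V), plus the plain bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "V": "ACG"}

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NTN'."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in degenerate_codon))]


def encoded_residues(degenerate_codon):
    """Return the sorted one-letter residues encoded by a degenerate codon."""
    return sorted({CODON_TABLE[c] for c in expand(degenerate_codon)})


nonpolar = encoded_residues("NTN")  # Phe, Ile, Leu, Met, Val
polar = encoded_residues("VAN")     # Asp, Glu, His, Lys, Asn, Gln
print(nonpolar)
print(polar)
```

Excluding T from the first position of VAN keeps the TAA and TAG stop codons (and tyrosine) out of the polar set, so neither degenerate codon can introduce a premature stop.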
We have shown previously that several proteins from these binary patterned libraries fold into stable four-helix bundles, and both crystallographic and NMR structures have been determined (4–6). Moreover, in initial steps probing the potential for functional activity, proteins from these libraries were shown to bind small molecules, including cofactors, and to catalyze rudimentary reactions (7, 8). Although those experiments screened a subset of library proteins for activity in vitro, they did not probe for function in vivo.
To mount an unbiased search for proteins that provide life-sustaining functions in vivo, we used life-or-death selections. Specifically, we selected for proteins that rescue the deletion of conditionally essential genes in E. coli. We transformed a library of 1.5 × 10⁶ binary patterned de novo sequences into strains of E. coli that contain deletions of a gene encoding a protein required for survival on minimal medium (9). Such strains, called auxotrophs, grow on rich medium but are unable to grow on minimal medium because a protein involved in the biosynthesis or absorption of an essential metabolite has been deleted. Our initial studies tested numerous auxotrophic strains from the Keio collection, which contains all viable single-deletion strains of E. coli (10). Although most of the auxotrophs were not rescued by proteins from our library, we found four auxotrophic strains that were reproducibly rescued by novel sequences from our binary-patterned library (9).
This report describes the mechanism by which one of our novel proteins supports cell growth. We show that the de novo protein SynSerB3 rescues the deletion of the native E. coli gene serB (which encodes phosphoserine phosphatase), not by catalyzing the phosphoserine phosphatase reaction but rather by performing a regulatory function. [This sequence, first isolated by Fisher et al. (9), was named “SynSerB3” because it was the third sequence isolated in a selection for synthetic sequences that rescue the ΔserB auxotroph.] SynSerB3 causes elevated expression of hisB, which encodes histidinol phosphate phosphatase. This phosphatase has promiscuous activity and, when expressed at high levels, is capable of hydrolyzing enough phosphoserine to enable the growth on minimal medium of otherwise moribund ΔserB cells. These results demonstrate that a nonnatural de novo protein enables cell growth by altering gene regulation.


The de Novo Protein SynSerB3 Enables Growth of ΔserB Cells on Minimal Medium.

The serB gene in E. coli encodes phosphoserine phosphatase, which catalyzes the final step in serine biosynthesis (Fig. 1A). Because serine is an essential metabolite, ΔserB cells are auxotrophs and cannot grow on minimal medium. We reported previously that several de novo proteins from our binary patterned libraries rescue the serB deletion on minimal medium (9). We chose one of these proteins, SynSerB3, for detailed characterization because it rescues ΔserB in fewer days than the other SynSerB proteins.
Fig. 1.
SynSerB3 enables the growth of ΔserB cells in minimal medium. (A) The phosphoserine phosphatase reaction catalyzed by the enzyme encoded by serB. (B) An overview of the auxotroph screen: A strain of E. coli in which serB is deleted cannot grow on minimal medium. A plasmid encoding the negative control (LacZ) fails to support growth, whereas both the native E. coli SerB (positive control), and the de novo protein SynSerB3 support growth on minimal medium. (C) The de novo protein SynSerB3 allows the auxotrophic strain ΔserB to grow in liquid minimal medium. Strains expressing the natural E. coli enzyme encoded by serB grow more rapidly, whereas cells expressing the control protein LacZ failed to grow over the course of 6 d.
Before setting out to determine the mechanism of rescue, we sought to confirm that SynSerB3 is indeed responsible for the rescue of the ΔserB auxotroph. As depicted in Fig. 1B, ΔserB cells were transformed with plasmids expressing SynSerB3, LacZ (negative control), or WT E. coli SerB (positive control). Transformants were plated on M9/glucose minimal plates containing isopropyl-β-D-1-thiogalactopyranoside (IPTG) to induce expression. As expected, ΔserB cells expressing LacZ failed to grow, even after 14 d, whereas those expressing the native E. coli SerB formed colonies in 2 d. The same cells expressing SynSerB3 formed colonies in 4 d, confirming that SynSerB3 rescues the ΔserB deletion. Not surprisingly, growth sustained by the de novo protein is slower than that of the natural protein, which had been selected by billions of years of evolution.
To confirm the results observed on plates and to quantify the growth rates of the rescued cells more accurately, we assayed growth in liquid cultures. As shown in Fig. 1C, the controls produce the expected growth curves in minimal medium: ΔserB cells expressing the natural SerB gene showed exponential growth with a relatively short lag time, and cells expressing LacZ did not grow at all. As expected from our results on solid medium, cells expressing the de novo protein SynSerB3 grew in minimal medium but did not grow as rapidly as cells expressing the native E. coli protein.
To further confirm that the sequence of SynSerB3 is responsible for the rescue, we performed the following additional experiments:
The DNA fragment encoding SynSerB3 was recloned into the expression vector and retransformed in ΔserB cells to confirm that the SynSerB3 sequence, and no other sequences on the plasmid or in the host strain, is responsible for rescue.
Many other de novo sequences from our binary patterned libraries were shown to be unable to rescue ΔserB, thereby demonstrating that expression of an arbitrary binary patterned protein does not induce a generic response responsible for rescue.
One particular binary patterned protein, SynGltA, which rescues the deletion of citrate synthase (essential for glutamate biosynthesis) was used as a control for many experiments in this study and was shown to be unable to rescue ΔserB.
Single base changes causing either stop codons or frameshifts were introduced into SynSerB3 and were shown to knock out activity, thereby demonstrating that expression of the SynSerB3 protein, not merely its mRNA, is required for rescue.
A codon-optimized gene encoding the SynSerB3 amino acid sequence was synthesized and shown to rescue ΔserB cells with the same growth rate as the original SynSerB3 gene, further confirming that the SynSerB3 protein, not its mRNA, mediates rescue.
Single amino acid changes introduced into SynSerB3 were shown to prevent rescue, thereby demonstrating that the phenotype conferred by SynSerB3 depends upon its exact amino acid sequence.

The SynSerB3 Protein Has No Detectable Phosphoserine Phosphatase Activity.

In principle SynSerB3 could rescue ΔserB cells either by enabling the conversion of phosphoserine to serine (the reaction deleted by ΔserB) or by facilitating a novel pathway that bypasses this step. If SynSerB3 facilitated a bypass pathway, it would be expected to rescue the deletion of other enzymes in the serine biosynthesis pathway. However, as reported by Fisher et al. (9), SynSerB3 does not rescue ΔserC, which encodes the enzyme that catalyzes the step before SerB. Thus, SynSerB3 does not bypass the natural serine biosynthesis pathway; it rescues ΔserB cells by enabling the same step catalyzed by the deleted enzyme, phosphoserine phosphatase.
Because expression of SynSerB3 enables the conversion of phosphoserine to serine, we began our studies by testing whether the de novo protein accomplished this conversion by direct action, i.e., by catalyzing this reaction. We purified the SynSerB3 protein using affinity chromatography followed by size-exclusion chromatography (SI Materials and Methods). To ensure there was no possibility of contamination by the natural E. coli phosphoserine phosphatase enzyme, all purifications were done from ΔserB cells. The purified SynSerB3 protein was incubated with phosphoserine in a variety of buffers, and liberation of phosphate was assayed using a standard malachite green assay (11). The positive control, SerB from E. coli, showed high levels of activity. However, the de novo SynSerB3 protein displayed no detectable activity under any of the conditions tested.
Because the purified SynSerB3 protein was not enzymatically active, we considered the possibility that SynSerB3 might require an endogenous molecular partner (either another protein or a cofactor) to catalyze the reaction. To test this possibility, we assayed for phosphoserine phosphatase activity in cell lysates. Because the malachite green assay is unreliable in lysates, we used ¹³C NMR to assay the reaction. As expected, lysates containing the positive control protein, native E. coli SerB, showed rapid conversion of phosphoserine to serine. Unfortunately, the negative control containing LacZ also showed low levels of activity. We attribute this activity to nonspecific E. coli phosphatases in the lysate. For example, alkaline phosphatase can catalyze the phosphoserine phosphatase reaction but does not rescue ΔserB in vivo because it resides in the periplasm. Lysates from cells expressing SynSerB3 showed activity similar to those from cells expressing LacZ, and we presume this activity is also caused by endogenous nonspecific phosphatases in the lysate. Not surprisingly, all lysates also showed the eventual disappearance of serine, which can be attributed to downstream metabolic processes that use serine to synthesize other metabolites (Fig. S1) (12).
Fig. S1.
(A) ¹³C NMR time course of lysates shows the appearance, and then the disappearance, of serine over the course of 42 h. This particular series is a cell lysate of ΔserB cells expressing SynSerB3 incubated with 100 mM phosphoserine in buffer containing 25 mM Tris (pH 7.4), 150 mM NaCl, 5 mM MgCl2, 3 mM EDTA, and 1 mM DTT. (B) Data extracted from multiple such time courses by integrating the serine and phosphoserine peaks and dividing the relative areas.

SynSerB3 Alters Gene-Expression Profiles in E. coli.

Our finding that purified SynSerB3 is not active as a phosphoserine phosphatase suggests that this de novo protein rescues ΔserB cells by an indirect mechanism involving regulation and/or activation of endogenous E. coli genes or proteins. To probe for altered gene regulation in a model-independent and unbiased way, we performed quantitative RNA sequencing (RNAseq). RNAseq profiles of cells expressing SynSerB3 were compared with those of controls expressing WT E. coli SerB from the same vector. These experiments were performed in two cellular backgrounds, the ΔserB auxotroph and the pseudo-WT Keio parent strain, BW25113. Cells were grown in minimal medium, and samples were collected during mid-logarithmic growth. See Fig. S2 for all-sample quality control.
Fig. S2.
RNAseq quality-control plots showing the similarity of replicates. A principal components analysis plot (A) and a Spearman correlation matrix (B) are shown for the samples that were run in this analysis. “Keio” refers to the BW25113 pseudo-WT strain used in this study.
In the parental (nondeletion) strain BW25113 we observed dramatic differences between cells expressing SynSerB3 and control cells expressing E. coli SerB. Because the chromosome of this pseudo-WT strain encodes all the genes required for growth in minimal medium, the observed differences must be caused by the expression of different proteins on the plasmid. Analysis of the RNAseq data revealed that BW25113 cells expressing the de novo protein SynSerB3 contain 10-fold higher levels of mRNAs from the histidine biosynthetic operon (Fig. 2A). Expression of SynSerB3 also led to the down-regulation of aromatic amino acid biosynthetic genes (Dataset S1, table 2). [When comparing cells expressing SynSerB3 with those expressing E. coli SerB, it can be difficult to determine whether one protein causes down-regulation or the other causes up-regulation. In the current example, we state that SynSerB3 alters regulation because separate control experiments showed that overexpressing E. coli SerB in BW25113 cells has little effect on the expression profile (Dataset S1, table 1).]
Fig. 2.
The de novo protein SynSerB3 increases expression of hisB, which encodes histidinol phosphate phosphatase. (A) Bars show the top 30 up-regulated transcripts for amino acid biosynthesis genes in pseudo-WT strain BW25113 expressing SynSerB3, relative to the same cells expressing native E. coli SerB (using RNAseq). (B) The abundance of hisB transcripts measured using both RNAseq and qPCR in both the pseudo-WT strain BW25113 and in the ΔserB auxotroph. The ratio of abundance is shown for cells expressing SynSerB3 relative to the same cells expressing native E. coli SerB. The dotted line represents a fold change of 1, indicating no change. Error bars represent the SE.
Next we examined expression profiles in the deletion strain, ΔserB, and compared cells expressing SynSerB3 with the same cells expressing native E. coli SerB. In this strain, many more genes were differentially expressed, presumably because of the added complexities associated with the chromosomal deletion of the serB gene. Analysis of the RNAseq data revealed that expression of 633 genes was altered fourfold or more in ΔserB cells expressing SynSerB3 versus the control. Some genes, such as those involved in fermentation and other carboxylic acid biosynthetic processes, were down-regulated; others, including those involved in the SOS response, were up-regulated. These results are discussed further below and in SI Discussion.
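A fourfold cutoff is equivalent to an absolute log2 fold change of at least 2. The sketch below shows that filter on made-up read counts; the gene labels, the counts, and the pseudocount of 1 are all assumptions for illustration, and the sketch ignores the normalization and statistical testing that a real RNAseq analysis would include.

```python
import math


def log2_fold_change(test, control, pseudocount=1.0):
    """log2 ratio of test to control; the pseudocount (an assumption) avoids log(0)."""
    return math.log2((test + pseudocount) / (control + pseudocount))


def altered_genes(test_counts, control_counts, min_abs_log2fc=2.0):
    """Return {gene: log2FC} for genes changed at least 2**min_abs_log2fc-fold."""
    hits = {}
    for gene, test in test_counts.items():
        lfc = log2_fold_change(test, control_counts[gene])
        if abs(lfc) >= min_abs_log2fc:
            hits[gene] = lfc
    return hits


# Hypothetical counts, not data from this study.
test_counts = {"gene_up": 799, "gene_down": 99, "gene_flat": 299}
control_counts = {"gene_up": 199, "gene_down": 399, "gene_flat": 199}

hits = altered_genes(test_counts, control_counts)
print(hits)
```

Both up-regulated and down-regulated genes pass the filter because the threshold is applied to the absolute value of the log ratio.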
An advantage of the RNAseq method is its ability to probe the entire transcriptome in an experiment that is not biased by models or expectations. On the other hand, a disadvantage of RNAseq is that it can be difficult to determine which of the observed changes are responsible for the biological phenotype. In the current situation, several orthogonal experiments described in the next sections suggest that up-regulation of the his operon is responsible for the rescue of ΔserB cells by SynSerB3. In particular, we hypothesized that overexpression of hisB, which encodes histidinol phosphate phosphatase, might rescue the deletion of phosphoserine phosphatase. Therefore it was gratifying to see that expression of the his operon was enhanced 10-fold by expression of SynSerB3 in BW25113 cells (Fig. 2A). However, in the ΔserB strain, enhanced expression of the his operon was not observed by RNAseq. Therefore, we used a second method, quantitative PCR (qPCR), to measure changes in the expression of hisB in both strains. qPCR showed that hisB transcripts were increased in response to SynSerB3 in both cellular backgrounds: fivefold in ΔserB cells and 25-fold in BW25113 cells (Fig. 2B).
In summary, both model-free assays (RNAseq) and focused experiments (qPCR) demonstrated that the de novo protein SynSerB3 leads to an increase in the transcription of hisB, which encodes histidinol phosphate phosphatase.

SynSerB3 Activates Transcription of the His Operon.

The RNAseq and qPCR results described in the preceding section demonstrate that SynSerB3 increases the abundance of mRNA transcripts from the his operon. In principle, abundance could be increased either by stimulating the transcription of the mRNA or by slowing its degradation. To distinguish between these possibilities, we replaced the sequence of the long polycistronic his mRNA with GFP. As shown in Fig. 3A, this construct contains all the his operon regulatory signals for transcription and translation but does not contain the his structural genes. Because these latter sequences presumably dictate the rate of his operon mRNA degradation, the induction of increased GFP fluorescence by SynSerB3 would indicate an increase in the rate of transcription initiated at the his regulatory region. As shown in the center pair of bars in Fig. 3B, SynSerB3 increased GFP fluorescence fivefold above controls, demonstrating that the de novo protein increases transcription of the his operon.
Fig. 3.
SynSerB3 increases transcription of the his operon. (A) Fusion of GFP to the regulatory region of the histidine biosynthesis operon. In His reg.GFP, the entire regulatory region is retained; in His reg.ΔattenGFP, the attenuator sequence is deleted. In both constructs, the structural genes of the his operon are replaced by GFP. (B) GFP fluorescence normalized for cell density (OD600) from overnight cultures. The pair of bars on the left shows results for a control plasmid containing none of the his operon regulatory region. For this vector, minimal GFP is expressed in the presence of either SynSerB3 or the control protein SynGltA. The pair of bars in the middle corresponds to the His reg.GFP construct shown in A, which contains the entire his operon regulatory region. In this case, the presence of SynSerB3 increases GFP levels dramatically above those in the control. The pair of bars on the right corresponds to the His reg.ΔattenGFP construct, which lacks the attenuator sequence. In this case, the presence of SynSerB3 does not increase GFP levels above those in the control. Error bars represent SE.
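The normalization used in panel B (GFP fluorescence divided by OD600) and the resulting fold change over a control can be sketched as follows. The function names and the plate-reader values are hypothetical and chosen only to illustrate the arithmetic; they are not measurements from this study.

```python
def normalized_fluorescence(gfp, od600):
    """GFP signal per unit of cell density (arbitrary units)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive to normalize fluorescence")
    return gfp / od600


def fold_over_control(sample, control):
    """Ratio of normalized fluorescence; each argument is a (gfp, od600) pair."""
    return normalized_fluorescence(*sample) / normalized_fluorescence(*control)


# Hypothetical (gfp, od600) readings for one reporter construct.
with_synserb3 = (2500.0, 0.5)
with_control_protein = (500.0, 0.5)

fold = fold_over_control(with_synserb3, with_control_protein)
print(fold)
```

Dividing by OD600 before taking the ratio keeps differences in culture density from being mistaken for differences in reporter expression.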
Transcription of the his operon in E. coli is natively regulated in two ways. First, transcription is stimulated in the stringent response (induced by amino acid starvation) by allosteric binding of ppGpp-DksA to RNA polymerase. Second, transcription is attenuated by high intracellular concentrations of histidine. Both mechanisms are regulated by sequences upstream of the his structural genes in a regulatory region that includes the promoter, his leader peptide (hisL), and attenuator sequence (13).
We performed several experiments to determine whether SynSerB3 affects regulation via the stringent response or the attenuation mechanism. First, to assess whether SynSerB3 activates the stringent response, we compared the known gene-expression signatures of this response, which include increased expression of amino acid biosynthetic genes and decreased expression of stable RNAs (rRNA and tRNA) (14–16), with the RNAseq data from the current study (Fig. S3). Although there are some correlations between the stringent response and the RNAseq results for ΔserB cells expressing SynSerB3, these correlations are not observed when SynSerB3 is expressed in pseudo-WT BW25113 cells. Thus, the genetic background of the cells, ΔserB, which leads to a shortage of serine, rather than expression of the de novo protein per se, is responsible for the observed correlation. (See further discussion in SI Discussion.)
Fig. S3.
The ΔserB background induces a stringent-like response, not expression of SynSerB3. Comparison of differential gene expression in this study (Top Row) and those performed by others (Left Column); “LexA regulated” is a complete list of genes thought to be regulated by LexA as determined by Fernandez de Henestrosa, et al. (20); “Serine hydroxamate treatment” lists genes determined at three time points after treatment for cells with serine hydroxamate as performed by Durfee, et al. (15); “ppGpp-mediated” lists were obtained by looking up affected genes on the EcoCyc database (14); and “Isoleucine starved” lists are those determined by Traxler, et al. (16). “Total genes” sums the total number of genes in each list that were altered by |log2foldchange|>2. (A) The number of genes that are present in both the list of genes in the top row and the list of genes in the left column. (B) The percentage of genes from the top row of genes also present in the left column of genes. Data are color-coded largest (green) to smallest (white).
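The two panels described in this legend reduce to a set intersection (panel A) and that intersection expressed as a percentage of the study's own gene list (panel B). A minimal sketch follows; the function name `overlap` is hypothetical, and the short gene lists are placeholders for illustration rather than the lists used in the figure.

```python
def overlap(study_genes, reference_genes):
    """Return (shared gene count, percent of study genes found in the reference list)."""
    study, reference = set(study_genes), set(reference_genes)
    shared = study & reference
    percent = 100.0 * len(shared) / len(study) if study else 0.0
    return len(shared), percent


# Placeholder lists; the real lists come from the RNAseq data and the cited studies.
study_list = ["recA", "sulA", "hisB", "hisD"]
reference_list = ["recA", "sulA", "lexA"]

count, percent = overlap(study_list, reference_list)
print(count, percent)
```

The percentage is taken relative to the study list (the top row in the figure), so the same pair of lists would give a different percentage if the roles were swapped.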
For further confirmation that the stringent response is not regulated by SynSerB3, we constructed several different fusions of the his regulatory region to GFP. As shown in the center pair of bars in Fig. 3B, SynSerB3 has a substantial impact on expression of GFP from the full-length his regulatory sequence. However, deletion of the attenuator sequence equalizes GFP fluorescence between cells expressing SynSerB3 and those expressing the control protein (Fig. 3B, right pair of bars). These results indicate that SynSerB3 alters regulation via the attenuation mechanism and not by the ppGpp-mediated stringent response. (In constructs where the attenuation sequences are deleted, GFP fluorescence is high in cells with or without SynSerB3 because the his promoter is a strong promoter, and there is no stem-loop to attenuate transcription.) From these results we conclude that SynSerB3 activates transcription of the his operon by deattenuation.
In principle, there are several ways that SynSerB3 could cause deattenuation. For example, SynSerB3 could reduce the abundance of histidine, which is required for the attenuation mechanism. To address this possibility, we measured the concentration of histidine in cells expressing SynSerB3, relative to controls expressing WT SerB. As shown in Fig. 4, LC/MS showed that cells expressing SynSerB3 contain higher amounts of histidine, presumably because of the overexpression of the histidine biosynthesis genes. Therefore we can rule out a mechanism by which SynSerB3 reduces the abundance of histidine.
Fig. 4.
Histidine counts from LC/MS in pseudo-WT and ΔserB cells expressing either native E. coli SerB (WT SerB) or SynSerB3. There is more histidine in cells expressing SynSerB3 (purple bar) than in cells expressing WT SerB. Error bars represent the SE.
Another mechanism by which SynSerB3 could cause deattenuation would be by decreasing the amount of the single his tRNA, encoded by hisR. To address this possibility, we used RT-qPCR to measure the amount of his tRNA in both ΔserB cells and pseudo-WT BW25113 cells. We found no significant difference in abundance between cells expressing SynSerB3 and controls expressing native SerB, thereby demonstrating that the de novo protein does not function by altering his tRNA levels.
Although SynSerB3 does not reduce the intracellular abundance of histidine or alter the abundance of his tRNA, in principle it could alter the functional pool of his tRNA by inhibiting successful charging of the tRNA. Alternatively, SynSerB3 could disrupt attenuation by binding the attenuator RNA.

Histidinol Phosphate Phosphatase Encoded by HisB Is Promiscuous and Catalyzes Hydrolysis of Serine Phosphate.

Our finding that SynSerB3 rescues the ΔserB auxotroph by causing overexpression of HisB suggests that histidinol phosphate phosphatase might have a promiscuous activity capable of catalyzing the hydrolysis of serine phosphate. This suggestion can be tested both genetically and biochemically. The genetic studies were first reported by Patrick et al. (17), who conducted a large-scale study to test whether chromosomal deletions of single genes in E. coli could be rescued by other E. coli genes overexpressed from a plasmid. In the case of the ΔserB auxotroph, Patrick et al. (17) found that overexpression of HisB rescued the deletion. We repeated this experiment and observed the same result. In related experiments, Blank et al. (18) searched for chromosomal mutations that rescue deletions. In the case of ΔserB, they also found rescuing mutations that either enhanced expression of HisB or relaxed its specificity.
The ability of histidinol phosphate phosphatase to hydrolyze serine phosphate was confirmed biochemically by Yip and Matsumura (19), who reported a kcat/Km value of 7.6 M⁻¹ s⁻¹. This promiscuous off-target activity of the HisB-encoded enzyme is 10,000-fold lower than the native activity of the E. coli SerB enzyme for the same substrate. This dramatic difference in catalytic activity is consistent with the requirement that HisB must be overexpressed for it to rescue ΔserB. Indeed, the auxotrophy of the ΔserB mutant shows explicitly that endogenous chromosomal expression of HisB is not sufficient to enable the growth of ΔserB cells on minimal medium.
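The two numbers in this paragraph imply a catalytic efficiency for SerB of roughly 7.6 × 10⁴ M⁻¹ s⁻¹. The sketch below works through that arithmetic and compares initial rates in the low-substrate limit, where v ≈ (kcat/Km)[E][S]. The enzyme and substrate concentrations are arbitrary placeholders, and the low-substrate approximation is an assumption made for illustration; the sketch does not estimate how much overexpression is needed in vivo.

```python
HISB_KCAT_KM = 7.6             # M^-1 s^-1, reported for phosphoserine hydrolysis (19)
FOLD_LOWER_THAN_SERB = 10_000  # fold difference stated in the text

# Catalytic efficiency of SerB implied by the two values above.
serb_kcat_km = HISB_KCAT_KM * FOLD_LOWER_THAN_SERB


def rate_low_substrate(kcat_over_km, enzyme_molar, substrate_molar):
    """Initial rate (M/s) when [S] << Km: v ~ (kcat/Km) * [E] * [S]."""
    return kcat_over_km * enzyme_molar * substrate_molar


# Placeholder concentrations (assumptions, not measured values).
enzyme = 1e-6     # 1 uM enzyme
substrate = 1e-4  # 100 uM phosphoserine

v_hisb = rate_low_substrate(HISB_KCAT_KM, enzyme, substrate)
v_serb = rate_low_substrate(serb_kcat_km, enzyme, substrate)
print(serb_kcat_km, v_serb / v_hisb)
```

At equal enzyme and substrate concentrations the rate ratio equals the ratio of catalytic efficiencies, which is why the promiscuous activity contributes little at native HisB levels.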

The E. coli HisB Gene Is Required for Rescue of ΔserB Cells by SynSerB3.

The experiments described above demonstrate that the de novo protein SynSerB3 increases expression of the E. coli gene HisB. Because HisB encodes a phosphatase, these results suggested that rescue by SynSerB3 of the phosphoserine phosphatase deletion in ΔserB is mediated by the overexpression of an alternative E. coli phosphatase with promiscuous activity.
However, other transcripts also were increased or decreased in response to SynSerB3, and the observation of enhanced transcription of HisB does not explicitly prove that it is responsible for the rescue phenotype. To confirm that increased expression of HisB is essential for the rescue, it is crucial to show that rescue of the ΔserB auxotroph by SynSerB3 cannot occur in the absence of HisB. Therefore, we constructed the double-deletion strain, ΔserBΔhisB, and tested whether the de novo protein still could rescue the serine auxotroph in this genetic background. Because the deletion of hisB causes a requirement for histidine, which is tangential to the current study, we added histidine to the medium. The plasmid encoding SynSerB3 was transformed into the double-knockout strain, and no growth was observed after 14 d. This finding and appropriate controls are summarized in Table 1. Thus, in agreement with the hypothesis that SynSerB3 rescues ΔserB by enhancing expression of histidinol phosphate phosphatase, the presence of the HisB gene is indeed required for auxotroph rescue by SynSerB3.
Table 1.
Growth of auxotrophic strains of E. coli in minimal medium
Strain | Plasmid expressing | No amino acid | +His
As highlighted in bold and underlined, the de novo protein SynSerB3 rescues the ΔserB auxotroph only in the presence of a chromosomal copy of the endogenous E. coli HisB gene. Test tubes were monitored for 10 d, and plates were monitored for colonies with diameters >1 mm for 14 d. In the experiments described, growth occurred within 4 d. G, growth; X, no growth; –, experiment not performed.

The SOS Response Is Induced by Expressing SynSerB3 in ΔserB Cells Grown in Minimal Medium.

Analysis of RNAseq data for ΔserB cells expressing SynSerB3 showed enhanced expression of genes associated with the SOS response (Fig. 5A) (20). We further assayed this response using qPCR to measure the induction of two proteins associated with SOS: RecA, a central regulator of SOS, and SulA, a known inhibitor of cell division. We found that recA and sulA are induced when ΔserB cells expressing SynSerB3 are grown in minimal medium. Likewise a filamentous cell shape, also associated with SOS (21), was observed for ΔserB cells expressing SynSerB3 in minimal medium (Fig. S4). Both the filamentous morphology and the induction of recA and sulA were observed only in ΔserB cells expressing SynSerB3 in minimal medium. They were not observed when stress was relieved by using the pseudo-WT strain (in minimal medium) or by expressing SynSerB3 (in either strain) in rich medium (Fig. 5B and Fig. S4). Thus, it appears that overall stress, rather than the expression of SynSerB3, induces SOS.
Fig. 5.
The SOS response is turned on in ΔserB cells but not in the pseudo-WT strain. (A) Many genes in the SOS regulon are up-regulated (above the dotted line) in ΔserB cells expressing SynSerB3 in minimal medium. (B) Transcript levels of sulA and recA as measured by RNAseq and qPCR in both the pseudo-WT strain BW25113 and in ΔserB cells. In both A and B, bars show the abundance of transcript in cells expressing SynSerB3 relative to the same cells expressing native E. coli SerB. Error bars represent the SE.
Fig. S4.
Phase-contrast microscopy showing E. coli cells with filamentous or normal morphologies. (Scale bars, 10 μm.) Cells were imaged when cultures reached OD600 ∼0.3. For rich medium, this point was reached 4 h after induction. Cells were washed in PBS, immobilized on an agar pad, and imaged. (Magnification, 100×.) ΔserB cells expressing SynSerB3 in minimal medium form long filamentous structures, but relieving stress by expressing the native SerB, by using pseudo-WT cells, or by growing in rich medium restores cells to normal morphology. (A) ΔserB cells expressing SynSerB3 in minimal medium form long filamentous structures. (B) Pseudo-WT cells expressing SynSerB3 in minimal medium do not form filaments. (C) ΔserB cells expressing native E. coli SerB in minimal medium do not form filaments. (D) ΔserB cells expressing SynSerB3 in rich medium do not form filaments.
Nonetheless, we considered the possibility that the SOS response might be responsible for rescuing the ΔserB auxotroph. To test this possibility, we plated ΔserB cells on minimal medium spiked with sublethal concentrations (2–10 μg/mL) of nalidixic acid, a known inducer of SOS (22, 23). These cells did not grow on minimal medium, thereby confirming that induction of the SOS response is not responsible for rescue of the ΔserB auxotroph.
Further evidence that induction of the his operon by SynSerB3, and not the SOS response, is responsible for the rescue of ΔserB cells comes from several of the results described above. (i) Both RNAseq and qPCR (Fig. 2) show that SynSerB3 induces expression of the his operon in the BW25113 strain; however, SynSerB3 does not induce filamentation, recA, or sulA in this strain (Fig. 5). (ii) GFP fusions to the regulatory region of the his operon show that SynSerB3 alters the regulation of this operon (Fig. 3) under conditions in which SOS and filamentation are not induced in the BW25113 strain. Thus, we conclude that the de novo protein SynSerB3 rescues ΔserB cells by inducing the promiscuous phosphatase encoded by hisB and that the SOS response results from the overall stress these cells experience while growing in minimal medium using a rescue mechanism that is not robust but is just sufficient to sustain life.

SI Materials and Methods


Oligonucleotide primers and the codon-optimized SynSerB3 gene sequence were purchased from Integrated DNA Technologies. Cultures were grown in LB medium or M9-glucose minimal medium (1×M9 salts, 0.4% glucose, 2 mM MgSO4, 100 μM CaCl2). Selective agents and inducers were used at the following concentrations: kanamycin (kan, 30 μg/mL), chloramphenicol (30 μg/mL), IPTG (50 μM). Histidine was added to cultures at a final concentration of 12 μg/mL, and serine was added at a final concentration of 200 μg/mL. The pUA66 plasmid was a gift from A. J. Link, Princeton University, Princeton.


Keio parent cells are strain BW25113, with genotype [Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, rph-1, Δ(rhaD-rhaB)568, hsdR514], and ΔserB cells are [ΔserB784::kan] in the BW25113 background; both strains were obtained from the E. coli Genetic Stock Center (cgsc.biology.yale.edu/). ΔserBΔhisB cells are [ΔhisB720, ΔserB784::kan] in the BW25113 background; this strain was made using standard P1 transduction methods (32). Briefly, P1vir lysate was made using the donor strain ΔserB::kan. The kanamycin cassette in ΔhisB::kan cells was excised using plasmid pCP20 (33). ΔhisB cells then were transduced with ΔserB::kan P1vir lysate and were plated on selective medium. Confirmation of the strain was performed with the locus-specific primers listed in Table S1.

DNA Methods.

Electroporation and heat shock transformations were done according to standard protocols (34). For rescue experiments, after transformation, cells recovered in SOC medium (34) for 1 h with shaking and then were washed twice with 1×M9 before plating. LacZ, WT-SerB, SynSerB3, and its mutants were expressed from vector p3Glar, a derivative of pCA24N. Point mutations and the nucleotide-insert mutants were made using standard QuikChange PCR (Agilent). GFP promoter fusions were made by performing PCR on Keio parent colonies using primers surrounding the selected regions and cloning into the BamHI-XhoI sites of pUA66. (Primers are listed in Table S1.)

Growth Curves.

Overnight cultures were normalized to the same OD600, washed twice with 1×M9, and then were inoculated in 96-well plates in a 1:1,000 dilution into minimal medium containing the appropriate antibiotics and IPTG. Mineral oil was added to the top of the inoculum to minimize evaporation. Cells were grown at 37 °C in a shaking Varioskan plate reader (Thermo Scientific), and the optical density was measured at 600 nm every 10 min.
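As a rough illustration of how plate-reader time series of this kind might be summarized, the sketch below estimates a maximum specific growth rate from OD600 readings taken every 10 min. The function name, the six-point sliding window, and the synthetic readings are illustrative assumptions and are not part of the analysis reported here.

```python
import math

def max_growth_rate(od, interval_min=10.0, window=6):
    """Largest least-squares slope of ln(OD600) vs. time (per hour)
    over any sliding window of `window` consecutive readings."""
    if len(od) < window:
        raise ValueError("need at least `window` readings")
    times_h = [i * interval_min / 60.0 for i in range(len(od))]
    logs = [math.log(v) for v in od]
    best = float("-inf")
    for start in range(len(od) - window + 1):
        t = times_h[start:start + window]
        y = logs[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        sxx = sum((ti - t_mean) ** 2 for ti in t)
        sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        best = max(best, sxy / sxx)
    return best

# Synthetic data (hypothetical): a 2-h lag, then growth with a 1-h doubling time.
readings = [0.01] * 12 + [0.01 * 2 ** (i / 6) for i in range(1, 25)]
rate = max_growth_rate(readings)          # per hour
doubling_time_h = math.log(2) / rate      # hours
```

A sliding-window fit of this sort ignores the lag phase automatically, which may be convenient for slow-growing rescued strains, but the window length would need to be chosen for the data at hand.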

Protein Expression and Purification.

ΔserB cells containing plasmid-borne SynSerB3 or control plasmids were used to inoculate a 5-mL starter culture overnight. The starter culture was diluted 1:100 into 1 L of LB medium, and the culture was induced with 50 mM IPTG when the OD600 ∼0.5. The cultures were grown for an additional 8 h to allow protein expression. Cells were harvested and lysed using an EmulsiFlex homogenizer (Avestin). Proteins were purified using a HisTrap HP column (GE) with 100 mM Tris (pH 7.2) + 500 mM NaCl binding buffer and 100 mM Tris (pH 7.2) + 500 mM NaCl + 500 mM imidazole elution buffer. Eluates were loaded onto a HiLoad 16/600 Superdex 75 pg column (GE) running in 50 mM Tris (pH 7.2) + 150 mM NaCl.

Phosphatase Assay.

The malachite green assay was performed using the published protocol (11) in the buffers below with purified protein. NMR assays were performed with 100 μL cell lysate, 100 μL 500 mM phosphoserine, 50 μL D2O, and 250 μL 50 mM Tris buffer (pH 7.4) containing some or all of the following components: 150 mM NaCl, 5 mM MgCl2, 3 mM EDTA, 1 mM DTT. Cell lysates were prepared as above but with a 25-mL culture. 13C NMR was performed on a 500-MHz (125 MHz 13C) Bruker Avance III instrument with a cryoprobe with a 2.09-s acquisition time and 1.90-s relaxation delay.
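The final composition of the NMR sample follows from the listed volumes by simple dilution arithmetic. The short sketch below works this through; the variable and function names are illustrative, and any contribution of the lysate itself to the final concentrations is ignored.

```python
# Volumes (in μL) of the NMR assay components as listed in the text.
volumes_ul = {
    "cell lysate": 100,
    "phosphoserine stock": 100,
    "D2O": 50,
    "Tris buffer": 250,
}
total_ul = sum(volumes_ul.values())

def final_conc(stock_conc, volume_ul, total_volume_ul):
    """Concentration after dilution, in the same units as `stock_conc`."""
    return stock_conc * volume_ul / total_volume_ul

# 500 mM phosphoserine stock diluted into the full sample volume.
phosphoserine_mM = final_conc(500, volumes_ul["phosphoserine stock"], total_ul)
# Fraction of D2O (v/v) available for the NMR lock signal.
d2o_percent = 100 * volumes_ul["D2O"] / total_ul
```

Under these assumptions the sample volume is 500 μL, with phosphoserine at 100 mM and D2O at 10% (vol/vol).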

RNAseq and Differential Expression Analysis.

Cells were grown in an overnight culture, washed twice with 1×M9 salts, and used to inoculate minimal M9-glucose medium. When cells reached midlog phase, they were incubated with two volumes of RNAlater reagent (Thermo Fisher Scientific). Cells were harvested and stored at −20 °C until all samples could be prepped concurrently. Total RNA was prepared using an RNeasy Mini Kit (Qiagen), and RNA quality was assayed using a NanoDrop and Agilent 2100 Bioanalyzer. Ribosomal RNA was removed using the Ribo-Zero rRNA removal kit for bacteria (Illumina). The library was prepared using the Illumina TruSeq protocol, and sequencing was performed on an Illumina HiSeq instrument. Reads were mapped to the E. coli genome (National Center for Biotechnology Information reference number NC_000913.2) using TopHat2 (35), and read counts for each gene were obtained using htseq-count (36) on a Galaxy server (37–39). A standard differential expression analysis was performed with the DESeq2 package (40). The design formula for the all-sample quality control was ∼population + treatment (Fig. S2). The design formula for pairwise differential expression analysis was ∼batch + treatment. Using a P value cutoff of P < 0.05 and a magnitude cutoff of |log2(fold change)| > 2, we analyzed genes using DAVID (https://david.ncifcrf.gov).
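The gene-selection step described above can be sketched as a simple filter on a table of differential expression results. This is a minimal illustration rather than the pipeline actually used: the function name, the input layout, and the example values are hypothetical, and the text does not specify whether the P value was adjusted for multiple testing, so the sketch simply takes whichever P value is supplied.

```python
def select_genes(results, p_cutoff=0.05, lfc_cutoff=2.0):
    """Split genes passing both cutoffs into up- and down-regulated lists.

    `results` maps gene name -> (log2 fold change, P value).
    """
    up, down = [], []
    for gene, (log2fc, p_value) in sorted(results.items()):
        if p_value is None or p_value >= p_cutoff:
            continue  # not significant, or not tested
        if abs(log2fc) <= lfc_cutoff:
            continue  # change too small in magnitude
        (up if log2fc > 0 else down).append(gene)
    return up, down

# Hypothetical values for illustration only (not data from this study).
example = {
    "geneA": (3.1, 0.001),    # passes both cutoffs, up-regulated
    "geneB": (1.5, 0.0001),   # significant but below the magnitude cutoff
    "geneC": (-2.4, 0.02),    # passes both cutoffs, down-regulated
    "geneD": (4.0, 0.20),     # large change but not significant
}
up, down = select_genes(example)
```

The resulting up- and down-regulated lists are the kind of input that would then be submitted to a functional annotation tool such as DAVID.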


RNA was isolated as described above. DNA was removed using the TURBO DNA-free kit (Ambion). cDNA was prepared from 2 μg RNA using the SuperScript First-Strand Synthesis System (Life Technologies). RT-qPCR reactions were made in a final volume of 20 μL containing 10 μL Power SYBR Green PCR master mix (Applied Biosystems), 2.4 μL of 10-μM stocks of each amplicon primer, and 4 μL cDNA. Primers used are listed in Table S1. qPCR was performed using an Applied Biosystems ABI 7900 instrument with conditions as follows: 50 °C for 2 min, 95 °C for 10 min, [95 °C 15 s, 60 °C 1 min] × 40, 95 °C for 15 s, 60 °C 15 s, and 95 °C 15 s. Each sample had three technical replicates, and one plate contained three biological samples grown from three separate cultures. Transcript levels were normalized to the rrs (16S rRNA) transcript to determine relative expression levels using the 2^(−ΔΔCq) method (41).
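For readers unfamiliar with the 2^(−ΔΔCq) calculation (41), a minimal sketch is given below. It averages technical replicates, normalizes the target to a reference transcript (rrs in this study), and compares a test sample with a control sample. The function name and the Cq values are hypothetical, and the calculation assumes that both amplicons amplify with approximately 100% efficiency.

```python
from statistics import mean

def relative_expression(target_test, ref_test, target_ctrl, ref_ctrl):
    """Fold change of a target transcript (test vs. control) by 2^(-ddCq).

    Each argument is a list of Cq values from technical replicates.
    """
    d_cq_test = mean(target_test) - mean(ref_test)   # dCq in the test sample
    d_cq_ctrl = mean(target_ctrl) - mean(ref_ctrl)   # dCq in the control sample
    dd_cq = d_cq_test - d_cq_ctrl
    return 2.0 ** (-dd_cq)

# Hypothetical Cq values for illustration only (not data from this study).
fold = relative_expression(
    target_test=[22.1, 22.0, 21.9],   # target gene, cells expressing SynSerB3
    ref_test=[10.0, 10.1, 9.9],       # reference transcript, same cells
    target_ctrl=[25.0, 25.1, 24.9],   # target gene, control cells
    ref_ctrl=[10.0, 10.0, 10.0],      # reference transcript, control cells
)
```

With these example values the target reaches threshold three cycles earlier in the test sample after normalization, which corresponds to an eightfold increase in relative expression.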

GFP-Reporter Assay.

BW25113 cells containing two complementary plasmids, a pUA66-derivative containing the his promoter region fused to GFP or pUA66 (42) and a p3Glar-derivative expressing either SynGltA or SynSerB3, were cultured on agar plates, and a single colony was used to inoculate LB medium. IPTG was added to the overnight culture to induce protein expression, or glucose was added to repress leaky protein expression from the p3Glar plasmid. After 12 h, the overnight LB cultures were washed twice with PBS, and an aliquot was placed in a 96-well plate to be read in a Varioskan plate reader. The OD600 and fluorescence at 509 nm were measured. The experiment was performed in quadruplicate.

LC/MS Metabolomic Analysis.

Cells were grown in minimal medium exactly as for RNAseq profiling. To extract cellular metabolites, 3.0 mL of cells at OD600 = 0.3–0.5 were quickly vacuum-filtered onto nylon membranes with a 0.45-μm pore size (Millipore). Membranes were transferred into 60-mm Petri dishes containing 1.2 mL of cold (−20 °C) extraction solvent (40:40:20 methanol/acetonitrile/H2O, HPLC grade) and were extracted at −20 °C for 15 min. Membranes then were washed with cold extraction solvent in the dish. The extracts were collected in microcentrifuge tubes and centrifuged at 15,000 × g at 4 °C. A portion of the supernatant (500 μL) was transferred to a new tube, and supernatants were dried under N2 gas. Metabolites were resuspended in HPLC-grade H2O and analyzed by reversed-phase ion-pairing LC coupled to a stand-alone Orbitrap mass spectrometer by negative-ion mode electrospray ionization (43). Metabolite peaks that matched the retention times and mass-to-charge ratios of authenticated standards were quantified using MAVEN (44). Abundances of metabolites were normalized to cell density (OD600). Three biological replicates for each sample were grown and extracted.

SI Discussion

One possible mechanism by which SynSerB3 may induce expression of the his operon is by inducing the stringent response. Other studies have shown that the stringent response induces a wide variety of changes in gene expression, including induction of the his operon. We probed whether there was any correlation between the expression of SynSerB3 and induction of the stringent response. Fig. S3 shows the number of genes found both on the lists of altered genes in our study and in studies that explicitly induce the stringent response.
Genes differentially expressed in pseudo-WT + SynSerB3 are 0–30% similar to other studies. The greatest overlap is between pseudo-WT + SynSerB3 up-regulated genes and genes positively regulated by ppGpp. Upon close examination, the only genes shared between those up-regulated in pseudo-WT + SynSerB3 and those on the list of genes positively regulated by ppGpp are the genes in the his operon, excluding the leader peptide hisL. This finding indicates that in the pseudo-WT cells, SynSerB3 functions specifically at the his operon. None of the genes known to be negatively regulated by ppGpp are down-regulated in pseudo-WT + SynSerB3 cells. Thus, in pseudo-WT cells we do not observe a correlation between the expression of SynSerB3 and the induction of the stringent response.
When SynSerB3 is expressed in ΔserB cells, expression profiles show a small correlation with those reported previously for the stringent response. About 11% of genes up-regulated after serine hydroxamate treatment and 22% of genes up-regulated after isoleucine starvation were also up-regulated in ΔserB + SynSerB3 cells. In addition, 19–27% of genes down-regulated after serine hydroxamate treatment and 13% of genes down-regulated after isoleucine treatment were down-regulated in ΔserB + SynSerB3 cells. However, 22% of the amino acid genes known to be up-regulated by ppGpp were down-regulated in ΔserB + SynSerB3 cells (argI, leuO, and thrABCL). None of the ribosomal proteins known to be negatively regulated by ppGpp was down-regulated in ΔserB + SynSerB3. Because correlations with the stringent response are observed only in ΔserB + SynSerB3 cells, not in pseudo-WT + SynSerB3 cells, we attribute these effects to the ΔserB genetic background and not to the expression of SynSerB3.
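The percentages quoted in this comparison express the share of a published gene list that also appears in a list from the current study. A minimal sketch of that calculation is shown below; the function name and the placeholder gene identifiers are hypothetical, and the sketch assumes that gene names have already been harmonized between studies.

```python
def percent_overlap(reference_genes, study_genes):
    """Percentage of `reference_genes` also present in `study_genes`,
    together with the sorted list of shared genes."""
    reference = set(reference_genes)
    if not reference:
        raise ValueError("reference gene list is empty")
    shared = reference & set(study_genes)
    return 100.0 * len(shared) / len(reference), sorted(shared)

# Placeholder identifiers for illustration only (not data from this study).
reference_list = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9"]
study_list = ["g2", "g5", "g10"]
pct, shared = percent_overlap(reference_list, study_list)
```

Because the denominator is the published list, the same pair of lists gives a different percentage if the roles are swapped, which is worth keeping in mind when comparing overlaps across studies of different sizes.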
We can analyze these correlations by their intended metabolic effect on cells. The serine hydroxamate treatment blocks the seryl-tRNA synthetase and thereby mimics serine starvation (15), which also is present in ΔserB cells. Isoleucine starvation was induced by growing cells in minimal medium with an excess of all 19 amino acids except isoleucine; valine toxicity was a confounding metabolic effect (16). Therefore, because of the complexities of amino acid starvation present in all these systems, it is not surprising that there are similarities between these two studies that induced a stringent response and the results obtained in the current study. We propose that a stringent-like response is induced in ΔserB cells because of the general inefficiency of SynSerB3’s rescue mechanism in ΔserB cells. In fact, we see a correlation between the induction of the stringent response and SynSerB3 expression only in the ΔserB background and not the pseudo-WT background. Because the stringent response is not activated in pseudo-WT cells expressing SynSerB3, we conclude that the expression of SynSerB3 does not induce a generic stringent response but instead induces expression of the his operon.
We also observe similarities between genes up-regulated in ΔserB + SynSerB3 cells and genes thought to be regulated by LexA, the transcription factor that regulates the SOS response (20). We observed this similarity using the DAVID database and confirmed it using the gene list of Fernández de Henestrosa et al. (20). The SOS response was activated in the study by Durfee et al. (15) when the stringent response was activated, so these cellular stresses have several overlaps.
The SOS response often leads to filamentous phenotypes in E. coli (21). As described in the main text, expression of SynSerB3 enables ΔserB cells to grow in minimal medium. Because these cells grow slowly and with a significant lag phase (Fig. 1C), we questioned whether they also grow with altered cell morphology. Therefore, we examined these cells and several controls using phase-contrast microscopy. As shown in Fig. S4, ΔserB cells expressing SynSerB3 in minimal medium form long filamentous structures. This morphology likely explains the apparent lag in the growth curve. To probe the importance of (i) the strain, (ii) the expressed protein, and (iii) the growth medium in causing this phenotype, we changed each of these variables independently. We tested the strain by growing pseudo-WT cells (BW25113) expressing SynSerB3 in minimal medium; we tested the protein by expressing native E. coli SerB in ΔserB cells growing in minimal medium; and we tested the medium by growing ΔserB cells expressing SynSerB3 in rich medium. As shown in Fig. S4, changing any of these variables restored the cells to normal morphology. Thus, expressing SynSerB3 per se does not cause filamentation.
Although the exact mechanism responsible for the observed filamentation is not clear, this is not the first example of a de novo protein causing aberrant cellular morphology. Stomel et al. (45) reported that expression in E. coli of an artificial ATP-binding protein also caused extensive filamentation. ATP sequestration by an artificial protein (45) or by excessive histidine biosynthesis (this study) [which requires 41 ATP molecules per molecule of histidine (46)] could cause filamentation.
There also is evidence that overexpression of the histidine biosynthetic enzymes HisH and HisF can cause filamentation in E. coli (47). However, our data show that pseudo-WT cells expressing SynSerB3 do overexpress His biosynthetic enzymes but do not form filaments. These results indicate that the aberrant morphology apparently results from the overall stress associated with growing the deletion strain in minimal medium while relying on a relatively inefficient rescue mechanism.


The number of sequences that can be encoded by an alphabet of 20 amino acids far exceeds the number of atoms in the universe. Even for relatively short sequences of 102 amino acids, there are many more possibilities (5 × 10^132) than could have been sampled by evolution. From this large universe of possibilities, nature selected relatively small collections of sequences (proteomes) to sustain the growth of living organisms. Thus, the E. coli genome encodes only 4,300 proteins (24), and even the human genome contains only ∼20,000 sequences (25, 26).
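The size of this sequence space is easy to verify: with 20 possible amino acids at each of 102 positions, the number of distinct sequences is 20 raised to the 102nd power. The short sketch below computes this exactly with arbitrary-precision integers; the variable names are illustrative.

```python
# Number of distinct sequences of length 102 over a 20-letter alphabet.
n_sequences = 20 ** 102

# Express the result in scientific notation: mantissa × 10^exponent.
exponent = len(str(n_sequences)) - 1
mantissa = n_sequences / 10 ** exponent
```

The result is approximately 5.07 × 10^132, in agreement with the figure quoted above for 102-residue sequences.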
These considerations led us to question whether novel protein sequences, never sampled by nature, might be able to provide essential functions necessary to sustain life. To address this possibility, we imposed life-or-death selections to isolate functional proteins from a library of 1.5 million de novo sequences. Several novel sequences were discovered which rescued a range of different deletions (9). Among these, SynSerB3 was isolated as a rescuer of ΔserB, an auxotroph in which phosphoserine phosphatase, which catalyzes the last step in serine biosynthesis, was deleted.
At the outset, we anticipated that the novel protein would rescue ΔserB by functioning in the same way as the deleted enzyme, i.e., by catalyzing the hydrolysis of phosphoserine. However, extensive studies demonstrated that SynSerB3 is not active as a phosphoserine phosphatase and therefore must exert its life-sustaining phenotype by functioning as a regulator of endogenous genes and/or proteins.
Through a series of experiments including both unbiased searches (RNAseq) and targeted analyses (qPCR and GFP fusions), we showed that SynSerB3 enhances expression of hisB, which encodes histidinol phosphate phosphatase, by deattenuating the his operon. Further experiments showed that (i) histidinol phosphate phosphatase has promiscuous activity that can hydrolyze phosphoserine in vitro, and (ii) this promiscuous activity, when expressed at high levels, is sufficient in vivo to sustain the growth of ΔserB cells on minimal medium. After showing that HisB was sufficient for rescue, we confirmed that it was necessary by demonstrating that the ability of SynSerB3 to rescue ΔserB cells absolutely requires the presence of the hisB gene. Taken together, these findings demonstrate that a novel protein, unrelated to sequences in nature, can enable cell growth by altering the natural regulatory landscape.
Evolution is an opportunistic process. When presented with environmental challenges, organisms adapt by using a range of strategies, including new enzymatic functions and novel regulatory processes. In natural systems, there are numerous precedents showing that selecting for growth under specified conditions can yield mutations that rewire gene regulation (27–31). For example, Taylor et al. (27) showed that immotile mutants of Pseudomonas fluorescens subjected to selections for motility evolve by repurposing a protein that normally functions in nitrogen uptake toward a new function involving flagellar regulation.
Our studies demonstrate that nonnatural, synthetic biological systems can also surmount environmental challenges. Rather than mutating a naturally occurring gene, as in the above examples, we demonstrate that a de novo-designed protein can drive adaptive changes in gene expression. To the best of our knowledge, this is the first example of a de novo protein that provides a life-sustaining regulatory function.

Materials and Methods

For additional information on cell growth and manipulation, protein purification, biochemical assays, differential gene expression analysis, and metabolomics profiling, please see SI Materials and Methods. Datasets associated with gene expression are given in Dataset S1. Primers are listed in Table S1.
Table S1.
Primers used in this study for quick-change PCR, cloning, RT-qPCR, and checking strains
Quick-change primers
Promoter cloning into pUA66
Locus-specific primers


We thank Dr. Betsy Smith for plasmids encoding frameshift and stop codons; the microarray core facility at the Lewis-Sigler Institute for Integrative Genomics at Princeton for RNA sequencing; Lance Parsons for help in analyzing the RNAseq data; and Xin Teng and Prof. Josh Rabinowitz for performing metabolite LC/MS. This research was funded by National Science Foundation (NSF) Grant MCB-1050510. K.M.D. was supported by an NSF Graduate Research Fellowship.



S Kamtekar, JM Schiffer, H Xiong, JM Babik, MH Hecht, Protein design by binary patterning of polar and nonpolar amino acids. Science 262, 1680–1685 (1993).
MH Hecht, A Das, A Go, LH Bradley, Y Wei, De novo proteins from designed combinatorial libraries. Protein Sci 13, 1711–1723 (2004).
LH Bradley, RE Kleiner, AF Wang, MH Hecht, DW Wood, An intein-based genetic selection allows the construction of a high-quality library of binary patterned de novo protein sequences. Protein Eng Des Sel 18, 201–207 (2005).
Y Wei, S Kim, D Fela, J Baum, MH Hecht, Solution structure of a de novo protein from a designed combinatorial library. Proc Natl Acad Sci USA 100, 13270–13273 (2003).
A Go, S Kim, J Baum, MH Hecht, Structure and dynamics of de novo proteins from a designed superfamily of 4-helix bundles. Protein Sci 17, 821–832 (2008).
R Arai, et al., Domain-swapped dimeric structure of a stable and functional de novo four-helix bundle protein, WA20. J Phys Chem B 116, 6789–6797 (2012).
SC Patel, LH Bradley, SP Jinadasa, MH Hecht, Cofactor binding and enzymatic activity in an unevolved superfamily of de novo designed 4-helix bundle proteins. Protein Sci 18, 1388–1400 (2009).
I Cherny, M Korolev, AN Koehler, MH Hecht, Proteins from an unevolved library of de novo designed sequences bind a range of small molecules. ACS Synth Biol 1, 130–138 (2012).
MA Fisher, KL McKinley, LH Bradley, SR Viola, MH Hecht, De novo designed proteins from a library of artificial sequences function in Escherichia coli and enable cell growth. PLoS One 6, e15364 (2011).
T Baba, et al., Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: The Keio collection. Mol Syst Biol 2, 0008 (2006).
TP Geladopoulos, TG Sotiroudis, AE Evangelopoulos, A malachite green colorimetric assay for protein phosphatase activity. Anal Biochem 192, 112–116 (1991).
DL Nelson, MM Cox, Principles of Biochemistry (W.H. Freeman and Company, New York, 5th Ed, 2008).
P Alifano, et al., Histidine biosynthetic pathway and genes: Structure, regulation, and evolution. Microbiol Rev 60, 44–69 (1996).
IM Keseler, et al., EcoCyc: A comprehensive database of Escherichia coli biology. Nucleic Acids Res 39, D583–D590 (2011).
T Durfee, AM Hansen, H Zhi, FR Blattner, DJ Jin, Transcription profiling of the stringent response in Escherichia coli. J Bacteriol 190, 1084–1096 (2008).
MF Traxler, et al., The global, ppGpp-mediated stringent response to amino acid starvation in Escherichia coli. Mol Microbiol 68, 1128–1148 (2008).
WM Patrick, EM Quandt, DB Swartzlander, I Matsumura, Multicopy suppression underpins metabolic evolvability. Mol Biol Evol 24, 2716–2722 (2007).
D Blank, L Wolf, M Ackermann, OK Silander, The predictability of molecular evolution during functional innovation. Proc Natl Acad Sci USA 111, 3044–3049 (2014).
SH-C Yip, I Matsumura, Substrate ambiguous enzymes within the Escherichia coli proteome offer different evolutionary solutions to the same problem. Mol Biol Evol 30, 2001–2012 (2013).
AR Fernández De Henestrosa, et al., Identification of additional genes belonging to the LexA regulon in Escherichia coli. Mol Microbiol 35, 1560–1572 (2000).
JW Little, DW Mount, The SOS regulatory system of Escherichia coli. Cell 29, 11–22 (1982).
CP O’Byrne, N Ní Bhriain, CJ Dorman, The DNA supercoiling-sensitive expression of the Salmonella typhimurium his operon requires the his attenuator and is modulated by anaerobiosis and by osmolarity. Mol Microbiol 6, 2467–2476 (1992).
JM Schoemaker, RC Gayda, A Markovitz, Regulation of cell division in Escherichia coli: SOS induction and cellular location of the sulA protein, a key to lon-associated filamentation and death. J Bacteriol 158, 551–561 (1984).
FR Blattner, et al., The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1462 (1997).
ES Lander, et al.; International Human Genome Sequencing Consortium, Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
JC Venter, et al., The sequence of the human genome. Science 291, 1304–1351 (2001).
TB Taylor, et al., Evolution. Evolutionary resurrection of flagellar motility via rewiring of the nitrogen regulation system. Science 347, 1014–1017 (2015).
D Li, et al., A de novo originated gene depresses budding yeast mating pathway and is repressed by the protein encoded by its antisense strand. Cell Res 20, 408–420 (2010).
D Li, Z Yan, L Lu, H Jiang, W Wang, Pleiotropy of the de novo-originated gene MDF1. Sci Rep 4, 7280 (2014).
ZD Blount, JE Barrick, CJ Davidson, RE Lenski, Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature 489, 513–518 (2012).
A Rodríguez-Verdugo, O Tenaillon, BS Gaut, First-Step Mutations during Adaptation Restore the Expression of Hundreds of Genes. Mol Biol Evol 33, 25–39 (2016).
JH Miller, Experiments in Molecular Genetics (Cold Spring Harbor Lab Press, Cold Spring Harbor, NY, 1972).
KA Datsenko, BL Wanner, One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci USA 97, 6640–6645 (2000).
J Sambrook, DW Russell, Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2001).
D Kim, et al., TopHat2: Accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14, R36 (2013).
S Anders, PT Pyl, W Huber, HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
D Blankenberg, et al., Galaxy: A web-based genome analysis tool for experimentalists. Current Protocols in Molecular Biology, eds Ausubel FM, et al. Chapter 19:Unit 19.10.11–21. (2010).
B Giardine, et al., Galaxy: A platform for interactive large-scale genome analysis. Genome Res 15, 1451–1455 (2005).
J Goecks, A Nekrutenko, J Taylor; Galaxy Team, Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 11, R86 (2010).
MI Love, W Huber, S Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550 (2014).
KJ Livak, TD Schmittgen, Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25, 402–408 (2001).
A Zaslaver, et al., A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat Methods 3, 623–628 (2006).
W Lu, et al., Metabolomic analysis via reversed-phase ion-pairing liquid chromatography coupled to a stand alone orbitrap mass spectrometer. Anal Chem 82, 3212–3221 (2010).
E Melamud, L Vastag, JD Rabinowitz, Metabolomic analysis and visualization engine for LC-MS data. Anal Chem 82, 9818–9826 (2010).
JM Stomel, JW Wilson, MA León, P Stafford, JC Chaput, A man-made ATP-binding protein evolved independent of nature causes abnormal growth in bacterial cells. PLoS One 4, e7385 (2009).
ME Winkler, S Ramos-Montañez, Biosynthesis of histidine. Ecosal Plus 3, 1–34 (2009).
N Frandsen, R D’Ari, Excess histidine enzymes cause AICAR-independent filamentation in Escherichia coli. Mol Gen Genet 240, 348–354 (1993).

Information & Authors


Published in

Proceedings of the National Academy of Sciences
Vol. 113 | No. 9
March 1, 2016
PubMed: 26884172


Submission history

Published online: February 16, 2016
Published in issue: March 1, 2016


  1. de novo protein design
  2. serB
  3. hisB
  4. auxotroph
  5. synthetic biology




This article is a PNAS Direct Submission.



Katherine M. Digianantonio
Department of Chemistry, Princeton University, Princeton, NJ 08540
Michael H. Hecht1 [email protected]
Department of Chemistry, Princeton University, Princeton, NJ 08540


To whom correspondence should be addressed. Email: [email protected].
Author contributions: K.M.D. and M.H.H. designed research; K.M.D. performed research; K.M.D. and M.H.H. analyzed data; and K.M.D. and M.H.H. wrote the paper.

Competing Interests

The authors declare no conflict of interest.
