Transmembrane domain of surface-exposed outer membrane lipoprotein RcsF is threaded through the lumen of β-barrel proteins

Significance
In Escherichia coli, most outer membrane (OM) lipoproteins are thought to be soluble proteins that are simply tethered to the inner leaflet of this membrane by lipid moieties attached to the N terminus. Here we show that lipoprotein RcsF (regulator of capsule synthesis) adopts a transmembrane orientation with the lipidated N terminus on the cell surface and the folded C-terminal domain in the periplasm. The short, unstructured, polar linker domain spans the hydrophobic membrane by passing through the lumen of several different OM β-barrel proteins. This remarkable, interlocked structure is formed by the Bam complex, which folds and inserts all β-barrel proteins in the OM, suggesting that this assembly machine translocates the lipid moieties and then folds the β barrel around the RcsF linker.

RcsF (regulator of capsule synthesis) is an outer membrane (OM) lipoprotein that functions to sense defects such as changes in LPS. However, LPS is found in the outer leaflet, and RcsF was thought to be tethered to the inner leaflet by its lipidated N terminus, raising the question of how it monitors LPS. We show that RcsF has a transmembrane topology with the lipidated N terminus on the cell surface and the C-terminal signaling domain in the periplasm. Strikingly, the short, unstructured, charged transmembrane domain is threaded through the lumen of β-barrel OM proteins where it is protected from the hydrophobic membrane interior. We present evidence that these unusual complexes, which contain one protein inside another, are formed by the Bam complex that assembles all β-barrel proteins in the OM. The ability of the Bam complex to expose lipoproteins at the cell surface underscores the mechanistic versatility of the β-barrel assembly machine.

membrane biogenesis | protein folding | signal transduction | envelope stress response
In addition to the cytoplasmic or inner membrane (IM), Gram-negative bacteria such as Escherichia coli have an outer membrane (OM) (1). The OM is essential for survival; it serves as a selective permeability barrier and protects the cell from toxic hydrophobic substances such as detergents, antibiotics, and other harmful chemicals (2). In E. coli, the OM is an asymmetric bilayer with phospholipids in the inner leaflet and LPS in the outer leaflet. LPS molecules have strong lateral interactions and are responsible for the barrier function of the OM. There are two main classes of proteins in the OM. The first class is integral transmembrane proteins, often referred to simply as outer membrane proteins (OMPs). These proteins assume a conformation known as a hydrophobic β barrel. β barrels are assembled by the essential BamABCDE complex, in which BamA is itself a β barrel (3). In contrast, the second class of proteins is lipoproteins, most of which have a soluble protein domain that is anchored to the OM by N-terminal lipid moieties.
The lipoprotein export pathway to the OM is well characterized (4). Lipoproteins are synthesized in the cytoplasm and are targeted to the inner membrane Sec translocon by a lipoprotein signal sequence (SS), which contains a "lipobox" with a highly conserved cysteine (Cys) residue to which a diglyceride is attached by a thioether bond (5). After lipid modification and SS processing, a third acyl chain is added to the Cys residue at the newly formed N terminus. Lipoproteins are sorted to either the IM or the OM based on a few amino acid residues that directly follow the lipidated Cys, by a process known in E. coli as the "+2 rule" (6). OM lipoproteins are targeted to the Lol machine, which consists of an inner membrane ATP-driven LolCDE complex that extracts lipoproteins from the IM (7); a periplasmic chaperone LolA, which escorts lipoproteins across the aqueous periplasmic space (8); and an OM acceptor lipoprotein LolB, which anchors lipoproteins to the inner leaflet of the OM (9). The assembly pathway for most lipoproteins is assumed to end here, leaving the molecule anchored in the inner leaflet of the OM by lipid moieties with the protein domain facing the periplasm (4).
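The sorting decision described above can be illustrated with a short Python sketch. It is deliberately simplified and rests on one assumption: that the commonly cited form of the +2 rule applies, in which an aspartate directly after the lipidated Cys causes retention in the IM. The function name and the three-residue example fragments are hypothetical, and actual sorting can also depend on the residue at +3 and on other sequence context.

```python
def predict_lipoprotein_destination(mature_sequence: str) -> str:
    """Predict IM or OM localization from a mature lipoprotein sequence.

    Simplified assumption: Asp at position +2 (the residue right after the
    lipidated Cys at +1) acts as a Lol avoidance signal and keeps the
    protein in the inner membrane; any other residue sends it to the OM.
    """
    if len(mature_sequence) < 2 or mature_sequence[0] != "C":
        raise ValueError("mature lipoprotein must start with the lipidated Cys")
    return "IM" if mature_sequence[1] == "D" else "OM"


# Hypothetical N-terminal fragments (Cys at +1, then positions +2 and +3).
om_like_fragment = "CSM"   # Ser at +2
im_like_fragment = "CDQ"   # Asp at +2

print(predict_lipoprotein_destination(om_like_fragment))
print(predict_lipoprotein_destination(im_like_fragment))
```

Under this assumption the first fragment is predicted to reach the OM and the second to stay in the IM.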
There are only a few reports of surface-exposed lipoproteins (10). In most cases, how those lipoproteins are transported across the OM is not understood. Recently we have shown that a fraction of Braun's lipoprotein or Lpp, the most abundant protein in E. coli, is exposed on the cell surface (11). Lpp exists in two forms: a bound form in which Lpp is covalently linked to peptidoglycan (PG) and a free form, the function of which is not understood (12,13). We discovered that the free form of Lpp adopts an unusual conformation with its C terminus exposed on the cell surface (11). We wish to understand the mechanism by which lipoproteins are transported across the OM. However, because the function of the free form of Lpp is unknown, it is not a useful model for this analysis.
RcsF (regulator of capsule synthesis) is an OM lipoprotein that functions as the sensory component of the Rcs stress response (14). Rcs regulates the production of the colanic acid capsule, which stabilizes the OM and helps combat envelope stress (15). Capsule production results in a mucoid colony phenotype, which, together with a few available transcriptional lacZ fusions, serves as a functional read out of the system (16).
Although some envelope-associated signals can be sensed by periplasmically expressed RcsF (17), the strongest inducers of the Rcs system are LPS defects, caused either by mutations in LPS biosynthesis (rfa) (18) or drugs that target LPS directly, such as polymyxin B (19). Sensing these LPS defects by RcsF strongly depends on RcsF localization to the OM (17). Because LPS is present in the outer leaflet of the OM, we therefore hypothesized that at least a portion of RcsF should be present on the cell surface.
RcsF consists of several domains (20,21): a lipidated N-terminal Cys (residue 16), followed by a disordered proline-rich linker, and a C-terminal folded domain (CTD, residues 49-134), the structure of which has been solved. The folded domain consists of a subdomain made up of short β1α1 elements (residues 49-62) and a C-terminal RcsF core domain (residues 63-134) formed by a central three-stranded β sheet, one side of which is covered by a long α2 helix. The CTD is believed to function in signaling because it is sufficient to induce the Rcs system when expressed periplasmically (20). The linker domain therefore might be a sensory and/or a regulatory domain, as some linker mutations that result in moderate signaling phenotypes were reported (22). However, the exact role of the linker and how it is involved in sensing defects is unknown.
Here, we report that RcsF is indeed a surface-exposed lipoprotein, and it adopts a transmembrane conformation with the lipidated N terminus exposed at the cell surface and the core domain in the periplasm. Strikingly, the transmembrane domain crosses the OM through the lumen of β-barrel OMPs, which shield this highly charged sequence from the hydrophobic membrane interior. In addition to its ability to interact with several OMPs, RcsF also interacts with BamA, an essential component of the β-barrel assembly machine. We provide several lines of evidence that formation of RcsF/OMP complexes happens during OMP folding by the Bam complex. This demonstration that lipoprotein export across the OM is tied to β-barrel assembly underscores the striking mechanistic versatility of the β-barrel assembly machine.

Results
RcsF Is a Surface-Exposed Lipoprotein. Several experimental approaches are commonly used to identify proteins exposed on the cell surface. Among them are selective surface biotinylation, surface proteolysis, and antibody-based whole cell assays (23). We used each of these methods to test whether RcsF can be detected on the cell surface (Fig. 1 and SI Appendix, Fig. S1). In a dot blot assay, whole cells or a cell lysate of a WT or rcsF mutant strain were spotted onto a nitrocellulose membrane and probed with anti-RcsF polyclonal antibodies. We also used antibodies available in our laboratory against other lipoproteins (BamCDE and LptE) with known topology (Fig. 1A) as a control for cell envelope integrity. Because BamCDE and LptE are members of multiprotein complexes that might affect their detection, we also generated OM-targeted GFP (OM-GFP) by fusing the Lpp signal sequence to sfGFP as an additional control. Anti-RcsF antibodies bind to the whole cells and cell lysates in an RcsF-dependent manner, whereas other control antibodies bind to the lysate but not whole cells (Fig. 1A). RcsF can also be labeled by an OM-impermeable NHS-LC-LC biotin probe, and it is partially sensitive to an extracellularly added protease (SI Appendix, Fig. S1 A and B). Taken together, these results suggest that at least a part of the RcsF molecule is exposed on the cell surface.
Next, we tested whether surface exposure of RcsF is dependent on an OM targeting signal. For that we generated two mutant RcsF proteins: one in which we introduced a Lol avoidance mutation (SM→DQ substitution at positions +2 and +3) that resulted in retention of RcsF in the inner membrane (RcsF IM ) (17) and one in which we replaced the lipoprotein signal sequence of RcsF with the well-characterized signal sequence of OmpA, which resulted in production of RcsF lacking an N-terminal lipid (RcsF SS-OmpA ). As shown in Fig. 1B, RcsF IM and RcsF SS-OmpA accumulated at WT levels but could not be detected on the cell surface, suggesting that OM targeting of RcsF by the Lol pathway is a prerequisite for surface localization.
To address the OM topology of RcsF, we used an epitope walking approach (23), which was successfully used to determine the topology of several E. coli OMPs (24-26). We introduced a FLAG epitope along the linker, as well as in the surface-exposed loop regions of the core domain. Of seventeen FLAG insertion variants made, nine were functional for signaling based on their ability to complement WT and activate the Rcs system in an rfa mutant background (SI Appendix, Fig. S1C). These functional FLAG insertion mutants were subjected to a dot blot assay with anti-FLAG and anti-RcsF antibodies (Fig. 1C). These FLAG insertions did not reduce RcsF levels or interfere with surface localization as anti-RcsF antibodies detected all mutant RcsFs on the cell surface. In contrast, anti-FLAG antibodies could not detect FLAG insertions in the core domain of RcsF. Interestingly, the first surface-detected FLAG insertion (R21) is only a few amino acids away from the lipidated Cys residue, suggesting that the N-terminal lipid is anchored in the outer leaflet of the OM.

Fig. 1. RcsF is a surface-exposed lipoprotein. (A) RcsF is detected on the cell surface in a dot blot assay. Whole cells or cell lysates of a WT or rcsF strain were spotted on the nitrocellulose membrane and probed with corresponding antibodies. For OM-GFP, the expression plasmid was introduced into the indicated strains. (B) Surface exposure of RcsF depends on OM targeting signal sequence. Whole cells expressing the indicated RcsF variants were probed with anti-RcsF antibodies in a dot blot assay (Upper) and total RcsF protein levels were analyzed by an immunoblot assay with anti-RcsF antibodies. (C) The N terminus of RcsF is surface exposed. RcsF WT and RcsF-FLAG variants were expressed from a plasmid and whole cells or cell lysates were probed with anti-FLAG, anti-RcsF and anti-BamC (as a control of cell envelope integrity) antibodies in a dot blot assay. Total RcsF protein levels were analyzed by an immunoblot assay with anti-RcsF antibodies. RcsF-FLAG variants are indicated by the amino acid number after which the FLAG tag was introduced.

In addition to Cys16, the RcsF sequence encodes four Cys residues, which form two nonconsecutive disulfide bonds to stabilize the structure of the core domain (20,21). We performed cell surface labeling with Cys-specific cell-impermeable Mal-PEG label (20 kDa) to test whether the core domain of RcsF is present on the cell surface. For that we treated whole cells with a reducing agent to reduce disulfide bonds and then labeled cells with Mal-PEG. We observed labeling of BamA as expected (27) but not RcsF, even under conditions of RcsF overexpression (SI Appendix, Fig. S1D). Absence of RcsF Cys labeling further reinforced our epitope walking results, showing that the core domain of RcsF is not exposed on the cell surface.
These results suggest a transmembrane topology for RcsF with the lipidated N terminus and the linker domain surface exposed and the core domain in the periplasm. We refer to this topology as "N out-C in." This result is puzzling because the RcsF sequence does not encode a canonical transmembrane domain and the sequence that appears to span the membrane is highly charged. Therefore, to explain the transmembrane topology of RcsF, it seemed necessary to propose that surface-exposed RcsF must exist in a complex with a transmembrane OMP.
RcsF Interacts with Several OMPs and BamA. To identify the proteins interacting with RcsF, we used an in vivo chemical cross-linking approach in combination with high-resolution MS-based proteomics. For that, we generated a functional RcsF-Strep II tag fusion that allows affinity purification and used ethylene glycol bis(succinimidylsuccinate) (EGS) cross-linking to stabilize protein interactions. We purified RcsF-Strep using Strep-Tactin resin and subjected elution samples to SDS/PAGE, followed by in-gel digestion and high-resolution nano-flow ultra-high-performance LC MS (UPLC-MS). As a negative control for binding specificity, we used RcsF lacking a Strep tag, which was cross-linked and treated identically to the RcsF-Strep sample.
We identified three OMPs, OmpA, OmpC, and OmpF, as well as BamA as RcsF-interacting proteins, based on their prominent spectral count values (Table 1 and SI Appendix, Table S1). These proteins purified with RcsF-Strep but not RcsF (no tag) on the Strep-Tactin column, suggesting specific interaction. Because EGS can cross-link multiple members of protein complexes, we sought to confirm that RcsF interactions with OmpA, C, and F and BamA are direct. To confirm the interactions and to identify specific sites of RcsF interaction with the OMPs, we used site-specific photo-activated cross-linking with a genetically encoded pBPA amino acid (28).
We generated 30 functional RcsF-Strep variants with pBPA sites distributed along the linker as well as throughout the surface of the core domain and tested their ability to cross-link to OMPs by Western blot analysis (Fig. 2 and SI Appendix, Fig. S2). We observed that RcsF cross-links to OMPs at 10 sites of the linker domain and only a few surface sites in the core domain, many of which are in the linker-facing region of the core domain (Fig. 2). Strikingly, the 10 cross-linking sites of the linker and β1α1 subdomain (RcsF 28-60) create a continuous interface for the RcsF/OMP interaction. We also noticed that the most robust cross-linking to BamA occurs along β strands of the core domain, suggesting that RcsF interacts with BamA in a distinct fashion from other OMPs (Fig. 2).
Our site-specific cross-linking data and RcsF topology suggest that the linker/β1α1 subdomain mediates interaction with OMPs, raising the possibility that this region may traverse the barrel lumen of the OMPs to cross the OM. To identify residues of OMPs that interact with RcsF, we used several RcsF-Strep variants with pBPA at the linker sites followed by in vivo UV cross-linking, Strep-Tactin purification, SDS/PAGE with in-gel digestion, and high-resolution nano-flow UPLC-MS to sequence and map cross-linked peptides. We were able to identify four such cross-linked species: two cross-linked peptides of RcsF G60pBPA to OmpC and two to OmpF (Fig. 3 A and B). Both identified sites of cross-linking for OmpC and both for OmpF are derived from unique and independently characterized cross-linked peptides. However, they fall into the same specific region of the OmpC/F barrel based on the crystal structure (Fig. 3C) (29,30). Strikingly, these residues face the lumen of the barrel (Fig. 3C), providing strong evidence that at least part of the RcsF molecule resides inside the OmpC/F barrel. The placement of identified residues is consistent with our epitope walking results. The last epitope insertion that was detected on the cell surface follows amino acid V59. The FLAG epitope is eight amino acids long and would be exposed on top of the barrel based on the site of G60pBPA cross-linking.
While analyzing RcsF G60pBPA cross-linked samples, we also detected by MS a peptide with an N-terminal palmitoyl (C16:0) modification on a Cys residue (SI Appendix, Fig. S3). In addition, lipidated RcsF and nonlipidated RcsF migrate differently in SDS/PAGE, and we are unable to detect a band corresponding to nonlipidated RcsF in the WT strain (SI Appendix, Fig. S3). This result provides evidence that the lipid modifications are still present on RcsF in RcsF/OMP complexes.
We tested the effect of OMP deletions on RcsF surface exposure. Mutants individually deleted for ompA, ompC, or ompF, as well as an ompR mutant defective in transcription of both ompC and ompF genes, did not show decreased levels of surface exposed RcsF (SI Appendix, Fig. S2) nor did they exhibit defects in RcsF signaling except for the ompR mutation, which leads to twofold activation of RcsF. We cannot analyze the triple mutant, ompACF, because its OM is quite unstable due to the loss of all three major OMPs. Because of this OMP redundancy for RcsF localization, RcsF surface exposure is not dependent on any single OMP.
In the first approach, we tested whether mutations in the Bam complex can affect levels of RcsF/OMP complexes in vivo. We focused on the RcsF/OmpA complex because, unlike OmpC and OmpF, OmpA levels are not dramatically reduced in the different bam mutants, simplifying the interpretation of the results (SI Appendix, Fig. S4).
In vivo EGS cross-linking results in the formation of an RcsF/OmpA complex with a molecular weight of ∼50 kDa, which corresponds to a 1:1 ratio of RcsF to OmpA (Fig. 4A). As expected, RcsF/OmpA cross-linking requires the RcsF lipoprotein signal sequence (SI Appendix, Fig. S5). Two other prominent cross-linked bands with molecular weights of ∼35 and 22 kDa were observed. The 35-kDa band is likely to be an RcsF cross-linked dimer. The 22-kDa band is the product of RcsF/Lpp cross-linking (SI Appendix, Fig. S6). We have not explored either of these interactions further. However, we note that Lpp is the most abundant protein in E. coli, and an lpp-null mutation did not affect either RcsF signaling or its surface exposure.
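The reasoning behind these band assignments is simple mass addition, which the following Python sketch makes explicit. The monomer masses are rounded, assumed values chosen for illustration and are not taken from this study, and the helper names are hypothetical. Cross-linked species can also migrate anomalously in SDS/PAGE, so the match is only approximate.

```python
# Approximate monomer masses in kDa. These are assumed, rounded values used
# only for illustration; they are not measurements from this study.
MONOMER_KDA = {"RcsF": 14.0, "OmpA": 35.0, "Lpp": 7.0}


def rcsf_pair_masses(monomers):
    """Predicted masses (kDa) of 1:1 cross-linked pairs that contain RcsF."""
    return {
        f"RcsF/{partner}": monomers["RcsF"] + mass
        for partner, mass in monomers.items()
    }


def closest_complex(observed_kda, predicted):
    """Name of the predicted pair whose mass is nearest to the observed band."""
    return min(predicted, key=lambda name: abs(predicted[name] - observed_kda))


predicted = rcsf_pair_masses(MONOMER_KDA)
for band_kda in (50, 35, 22):
    print(band_kda, closest_complex(band_kda, predicted))
```

With these assumed masses, the ∼50-kDa band is nearest to RcsF plus OmpA, the ∼35-kDa band to an RcsF dimer, and the ∼22-kDa band to RcsF plus Lpp, in line with the assignments in the text.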
To test whether mutations in the Bam complex can affect levels of RcsF/OMP complexes, we used two different mutations in bam that affect Bam function to different extents. bamA101 is a mutation in the promoter region of bamA that causes a reduction in BamA levels (31). Thus, in this mutant, fewer Bam complexes are formed, but those Bam complexes that do form are made of WT proteins. The bamA101 mutation does not confer significant OM assembly or permeability phenotypes. Strains carrying the double mutation bamA616 (32) and bamB exhibit compromised Bam function (SI Appendix, Fig. S4). Importantly, levels of RcsF and OmpA are not affected in bamA101 and bamA616 bamB mutants under the conditions used (Fig. 4B). However, we observed a twofold decrease of RcsF/OmpA cross-linking in bamA101 and a sevenfold decrease in bamA616 bamB. This decrease was specific to RcsF/OmpA because RcsF/Lpp cross-linking did not decrease in these mutants.
Both bamA101 and bamA616 bamB are Rcs-activating mutations judged by rprA-lacZ reporter fusions, and they also confer a mucoid phenotype. Therefore, decreased levels of RcsF/OmpA complexes in these strains could be a result of RcsF activation. To test this, we performed EGS cross-linking in an Rcs-activating rfaP mutant. In the rfaP strain, levels of RcsF/OmpA were similar to that of the WT (Fig. 4B). Therefore, we attribute the decrease in RcsF/OmpA cross-linking in the bam mutants directly to the reduced activity of the Bam machine.
In the second approach, we tested whether RcsF can interact with OmpA in vitro using purified proteins. We took advantage of a simplified system, in which OmpA can be folded in vitro in the presence of detergent. Detergent-folded OmpA has the same properties as OmpA folded by the Bam machine in vivo, e.g., the barrel is heat modifiable and protease resistant (33). We set up our experiment under two different reaction conditions. In the first reaction, we refolded OmpA in detergent and then added RcsF to the folded OmpA (Fig. 5A). In the second reaction, we folded OmpA in the presence of RcsF (Fig. 5B). We used UV photo cross-linking of RcsF-pBPA variants as a measure of the RcsF/OmpA interaction. We purified an RcsF-pBPA variant at the linker site that interacts with OmpA in vivo (RcsF K40pBPA), as well as an RcsF-pBPA variant that does not interact with OmpA in vivo (RcsF D72pBPA) to serve as a negative control. When RcsF is added to previously refolded OmpA, no product corresponding to RcsFxOmpA was formed after UV treatment (Fig. 5A). When OmpA was refolded in the presence of RcsF, we observed UV-induced formation of the RcsFxOmpA complex for RcsF K40pBPA but not for the negative control (Fig. 5B). This result indicated that RcsFxOmpA cross-linking in vitro is site specific and reconstitutes the cross-linking observed in vivo. Moreover, the RcsFxOmpA band was heat modifiable, confirming that OmpA was folded in this complex (Fig. 5B). Taken together, we conclude that the formation of the RcsF/OmpA complex occurs only during OmpA folding.

Discussion
Based on the data presented, we propose a model (Fig. 6) in which the lipidated N terminus and linker domain of RcsF are surface exposed, whereas the core domain remains in the periplasm. The short, but highly charged sequences of RcsF that traverse the OM do so by passing through the lumen of OMPs, and the formation of these RcsF/OMP complexes occurs during OMP folding by the Bam machine. RcsF does not have barrel specificity; RcsF can interact with multiple OMPs. This lack of OMP specificity explains why RcsF surface exposure is not dependent on any one single OMP. The RcsF/OMP complex provides another example of an OM protein complex that forms during assembly from components targeted to the OM by different pathways. In addition, it suggests that the Bam complex can translocate lipid-containing protein sequences from one membrane face to the other.
OM lipoproteins have a variety of important functions: they participate in envelope biogenesis, stress responses, cell adhesion, virulence, etc. (10). Most of what we know about lipoprotein biogenesis comes from the Tokuda laboratory, where the Lol pathway was identified and characterized in great detail (4). For a long time, it has been assumed that anchoring of lipoproteins to the inner leaflet of the OM by LolB is the last step in lipoprotein transport and that lipoproteins remain in this orientation with the protein domain facing the periplasm (4). Only recently have we started to appreciate that lipoproteins can be exposed on the cell surface (10).
The first reported surface-exposed lipoprotein is pullulanase, PulA, in Klebsiella oxytoca (34,35). Identification of PulA led to the discovery of its transport machinery (36), which later turned out to be a type II secretion system (T2SS) like that found in many different bacteria (37). Although Pul is a well-studied prototypical T2SS, lipoproteins are rather unusual substrates because most T2SS substrates are soluble proteins released into the media (38). Even though we know a lot about T2SS structure and function, it is still unclear how exactly PulA is anchored to the OM after it has been exported through the OM secretin PulD (37).
Precise topology is known for only two surface-exposed lipoproteins in E. coli, CsgG and Wza, and they are very different from RcsF. The lipidated N terminus of both is anchored to the inner leaflet of the OM, and both proteins have a transmembrane helix that oligomerizes to form an octameric pore in the OM (39,40). Accordingly, the assembly of these proteins is likely to be Bam independent.
In other bacteria, such as spirochetes, lipoproteins are believed to be anchored in the outer leaflet of the OM and surface exposed by default (41,42). Both model lipoproteins OspA and OspC have a common feature: the presence of the N-terminal disordered linker domain, which serves as a cis-acting determinant for surface exposure (43,44). It is unclear how these lipoproteins are transported across the OM, but the presence of an OMP that serves as a lipoprotein "flippase" has been suggested (43).
We previously reported that a fraction of Lpp, the most abundant lipoprotein in E. coli, is exposed on the cell surface (11), whereas the other fraction remains in the periplasm where it is covalently bound to PG (12,13). It is hard to determine whether RcsF, like Lpp, exists in two different cellular locations. The labeling and cross-linking techniques used in this study are known to be low efficiency and hence cannot be used for direct quantification, and there is no functional assay to distinguish between surface exposed and periplasmic RcsF. However, we raise the possibility that RcsF might exist in different locations to sense diverse envelope defects and that perhaps the RcsF/OMP complex is used to detect certain alterations to the outer leaflet of the OM.
We used site-specific cross-linking to identify residues of RcsF that interact with the OMPs and BamA. We showed that the linker together with the β1α1 subdomain create a continuous interaction interface with the OMPs. We also observed cross-linking along one face of the core domain. Two topology models could explain these cross-linking results. In the first model, the C-terminal end of the RcsF linker and the β1α1 subdomain would lie inside the barrel, whereas the core domain would interact with OMPs on the periplasmic side of the membrane. In an alternative model, the core domain also would reside inside the barrel and one side of its surface would be occluded from OMP interaction by the linker domain. Both of these models suggest that the core domain would not be detected on the cell surface and could explain our epitope walking results. However, we favor the first model, because the barrels of the RcsF-interacting OMPs simply are not big enough to accommodate the folded core domain.
In both reported structures of the soluble domain of RcsF, β1α1 creates a separate subdomain on the side of RcsF core (20,21). However, because this subdomain is somewhat distant from the core and lacks extensive contacts with the core, it might exist in a different orientation in the full length RcsF. Because sites of cross-linking to OMPs follow the linear sequences of the linker and β1α1 subdomain, we suggest this subdomain exists in an extended conformation in the RcsF/OMP complexes.
We mapped OmpC and OmpF peptides that cross-link to RcsF G60pBPA. Two peptides were found for each protein, and both peptides mapped to the same region of the corresponding OMP based on their crystal structure (29,30). OmpC and OmpF are structural homologs, and the fact that the RcsF interaction sites were similar for OmpC and OmpF suggests that this interaction is highly specific. Strikingly, the interaction site mapped to the extracellular loop 3 (L3), which folds back and occludes part of the lumen of both OmpC and OmpF. L3 has been previously implicated in the interaction with colicins, which use the lumen of OmpF as an entryway into the cell. Mutations in L3 are known to confer colicin resistance (45), and OmpF was successfully crystallized with the T83 peptide of colicin E9 and E3 inside the barrel lumen (30,46). The T83 peptide is an intrinsically unstructured part of the N-terminal transport domain of E colicins, and it is the first domain to enter the OmpF lumen (47). Indeed, recent studies show that the unstructured colicin E9 N-terminal domain threads through two adjacent OmpF trimer subunits in opposite orientation (48). Our results suggest that the nature of the interaction of the unstructured RcsF linker with OmpC/F might be similar to that of the unstructured colicin linker.
We do not know how RcsF can be accommodated in the OmpA barrel. The available OmpA structure consists of two domains: an eight-stranded, N-terminal β-barrel domain and a C-terminal periplasmic, peptidoglycan-binding domain (49). This small β-barrel domain of OmpA was subjected to extensive structural studies (50-52). These studies suggest that the OmpA lumen does not form a channel because of the presence of a salt bridge. However, recent biochemical studies have proposed an electro-chemical gate opening model in which the salt bridge can "open up," thus allowing the β barrel to function as a pore (53). Regardless of the pore state of OmpA, it is highly unlikely that the eight-stranded β-barrel domain of OmpA could accommodate the RcsF linker. However, there is also some biochemical, as well as physiological, evidence that OmpA can adopt a so-called large barrel conformation in which the C-terminal domain can join the narrow barrel to form a large pore (54). The presence of the large pore conformation in vivo has remained exceedingly difficult to prove or disprove experimentally, and for this reason, this OmpA structure has been a matter of debate for decades. It is possible to delete the periplasmic domain, leaving the eight-stranded OmpA β barrel intact (55). Attempts to cross-link RcsF to this truncated OmpA have failed, perhaps indicating that RcsF interacts with the larger OmpA β barrel. However, because the structure of the large barrel remains unknown, it is difficult to test whether or not RcsF is inside the OmpA barrel. More work is required to clarify the nature of the RcsF-OmpA interaction.
Site-specific cross-linking demonstrates that the nature of the interaction of RcsF with BamA is distinct from its interaction with other OMPs. We suggest that RcsF interacts with BamA during RcsF/OMP assembly and that RcsF/OMP complexes are the product of this assembly reaction. Our in vitro experiments support this model by showing that RcsF/OmpA complexes must form during OmpA folding. Our in vivo data further support our model because mutations that decrease Bam machine activity lead to a reduction of RcsF/OmpA complexes, and the level of reduction correlates with the severity of the mutations. Whether or not the RcsF/BamA cross-linked complexes represent true assembly intermediates will require additional analysis. It is not unprecedented that the RcsF/OMP complex requires β-barrel folding and the Bam machine to form. The LptD/E complex is the first example of a β barrel/lipoprotein complex that requires Bam for assembly. LptD/E is the OM component of the LPS transport machine (56,57). LptE is an OM lipoprotein that resides inside the lumen of the β-barrel protein LptD (58). LptE has several functions: it is required for LptD assembly (59,60), plugging the LptD barrel (61), and LPS export across the OM (56,62). However, the RcsF/OMP complexes are fundamentally different from LptD/LptE. LptE resides almost completely inside the lumen of the large 26-stranded barrel of LptD (63,64). The interaction interface between the two proteins is extensive, with a surface area >3,000 Å² (63,64). Moreover, folding of the LptD barrel absolutely requires not only Bam but also the presence of LptE; no LptD exists without its lipoprotein partner (59,60). In contrast, RcsF interacts with OMPs via a short sequence that exists in an extended conformation; RcsF does not have strict barrel specificity, and it can interact with multiple OMPs, demonstrating a lack of extensive sequence-specific contacts. Unlike LptD, any of the RcsF-interacting OMPs can fold and function independently of RcsF.
Based on ribosome profiling, RcsF abundance was estimated to be around 3,000 molecules per cell. OmpA, OmpC, and OmpF are the most abundant OMPs, and they exist in at least a hundred-fold excess over RcsF (65). Therefore, RcsF is unlikely to interfere with OMP functions, and RcsF simply uses these OMPs to adopt its OM topology and reach the cell surface.
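The stoichiometric argument can be made explicit with a back-of-the-envelope calculation. The sketch below uses only the two figures given in the text (about 3,000 RcsF molecules per cell and an at least hundred-fold OMP excess). The function name and the assumption that each RcsF molecule occupies exactly one OMP barrel are illustrative simplifications, not measured quantities.

```python
def max_occupied_fraction(rcsf_per_cell: float, omp_fold_excess: float) -> float:
    """Upper bound on the fraction of OMP barrels that contain an RcsF linker.

    Assumes (for illustration only) one RcsF per barrel and that every
    RcsF molecule is engaged with an OMP.
    """
    omp_per_cell = rcsf_per_cell * omp_fold_excess  # lower bound on OMP copies
    return rcsf_per_cell / omp_per_cell


rcsf = 3_000        # molecules per cell, estimate quoted in the text
fold_excess = 100   # "at least hundred-fold" in the text, so this is a minimum

print(f"OMP copies per cell: >= {rcsf * fold_excess:,}")
print(f"Barrels occupied by RcsF: <= {max_occupied_fraction(rcsf, fold_excess):.1%}")
```

Under these assumptions, at most about 1 in 100 of the abundant OMP barrels would carry an RcsF linker, which is consistent with the conclusion that RcsF is unlikely to perturb bulk OMP function.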
There is yet another fundamental difference between LptD/E and RcsF/OMP. Our results indicate that the lipid is still attached to RcsF, and they show that a FLAG tag located just four amino acids from the N-terminal Cys residue is exposed on the cell surface. This result suggests that the N-terminal lipids of RcsF are anchored, not in the inner leaflet as they are in LptE, but rather in the outer leaflet of the OM. How the lipids are flipped across the membrane to assume this outer leaflet localization remains unknown. Recently, it has been suggested that BamA causes a local distortion of the OM bilayer that allows BamA to release folded OMPs into the membrane (66,67). Such a distortion might also create a favorable environment for the RcsF lipids to flip, and thus BamA might serve a second function in RcsF/OMP biogenesis by catalyzing lipid translocation. Whether Bam activity is sufficient for lipid translocation and other mechanistic aspects of RcsF/OMP assembly by Bam will require additional research.
Bam is a β-barrel assembly machine that is also known to assemble very complex substrates such as multimeric OMPs, autotransporters (β barrels with secreted domains), and LptD/E. These substrates are fundamentally different in their structures, and their assembly pathways must require distinct activities of Bam. RcsF/OMP complexes represent yet another class of substrates assembled by this truly remarkable machine that operates in the absence of any obvious energy source.

Experimental Procedures
Dot Blot Assay. Strains were grown to midlog and concentrated to OD600 = 1 in PBS (Na2HPO4, 10 mM; KH2PO4, 1.8 mM; KCl, 2.7 mM; NaCl, 137 mM; pH 7.4). Cell lysates were prepared by sonication of the cell suspension in PBS supplemented with 10 mM EDTA. Two microliters of cell suspension or cell lysate was spotted on a nitrocellulose membrane and air dried. Membranes were blocked with 2% (wt/vol) nonfat dried milk (NFDM) in PBS for 30 min at room temperature and probed with the indicated antibodies (SI Appendix, Table S4) for 1 h at room temperature. Membranes were washed five times for 3 min with PBS and probed with secondary antibodies for 1 h at room temperature.
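For readers preparing the buffer, the PBS composition above can be converted from millimolar concentrations to grams per liter. The following is a minimal sketch: the molar masses are those of the anhydrous salts, which is an assumption, because the procedure does not state which hydrate forms were used, and the function name is hypothetical.

```python
# Concentrations (mM) as listed in the Dot Blot Assay procedure.
PBS_MM = {"Na2HPO4": 10.0, "KH2PO4": 1.8, "KCl": 2.7, "NaCl": 137.0}

# Molar masses (g/mol) of the anhydrous salts; assumed here, since the
# text does not say which hydrate forms were used.
MOLAR_MASS = {"Na2HPO4": 141.96, "KH2PO4": 136.09, "KCl": 74.55, "NaCl": 58.44}


def grams_needed(volume_l: float) -> dict[str, float]:
    """Mass of each salt (g) required for `volume_l` liters of PBS."""
    return {
        salt: mm / 1000.0 * MOLAR_MASS[salt] * volume_l
        for salt, mm in PBS_MM.items()
    }


for salt, grams in grams_needed(1.0).items():
    print(f"{salt:8s} {grams:6.2f} g/L")
```

If a hydrated salt is used instead (for example, a hydrate of Na2HPO4), the corresponding entry in `MOLAR_MASS` must be replaced with the molar mass of that hydrate.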
In Vivo EGS Cross-Linking. Strains were grown to midlog, washed twice in PBS, and concentrated to OD600 = 10 in PBS. Cells were incubated for 10 min at 37°C, EGS (Pierce) was added to a final concentration of 1 mM for 20 min at 37°C, and the reaction was then quenched with Tris·Cl (100 mM final). Cells were pelleted, resuspended in SDS loading buffer, and analyzed by immunoblotting.
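The two volume calculations implied by this procedure (concentrating cells to a target optical density and adding cross-linker to a final concentration) can be sketched as follows. The culture volume, starting optical density, reaction volume, and EGS stock concentration in the example are hypothetical values chosen for illustration; only the targets (OD600 = 10 and 1 mM EGS) come from the text.

```python
def resuspension_volume_ml(culture_od: float, culture_ml: float, target_od: float) -> float:
    """Volume (mL) in which to resuspend a pellet to reach `target_od`.

    Treats OD600 as proportional to cell density, which is only
    approximately true outside the linear range of the spectrophotometer.
    """
    return culture_od * culture_ml / target_od


def stock_volume_ul(final_mm: float, final_volume_ul: float, stock_mm: float) -> float:
    """Volume (uL) of stock needed for `final_mm` in the final mix (C1*V1 = C2*V2).

    `final_volume_ul` is the total volume after the stock has been added.
    """
    return final_mm * final_volume_ul / stock_mm


# Hypothetical example: 10 mL of culture at OD600 = 0.5, concentrated to OD600 = 10.
print(resuspension_volume_ml(0.5, 10.0, 10.0), "mL of PBS")

# EGS to 1 mM final in a 500 uL reaction, from a hypothetical 100 mM stock.
print(stock_volume_ul(1.0, 500.0, 100.0), "uL of EGS stock")
```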
Purification of EGS Cross-Linked RcsF and MS. Strains expressing RcsF-Strep tag or RcsF WT were grown in 1.5 L of Luria-Bertani (LB) medium with 0.02% arabinose until OD600 = 1.8. In vivo EGS cross-linking was performed as described above. After the cross-linking, cells were pelleted and washed twice with TBS (20 mM Tris·Cl and 150 mM NaCl, pH 8.0). RcsF-Strep was subjected to affinity purification on a Strep-Tactin column and UPLC-MS as described in SI Appendix, Experimental Procedures. The strain expressing RcsF WT (no tag as a negative control) was treated the same way and was subjected to purification in parallel to RcsF-Strep. Experiments were performed with three independent biological replicates.
In Vivo UV-Induced Photo Cross-Linking. Strains were cotransformed with plasmids encoding the indicated RcsF-Strep amber mutant and pSup (68). Strains were grown in LB with 0.16 mM pBPA (Bachem) until midlog. Cells were concentrated in ice-cold Tris-buffered saline (TBS) to OD600 = 10. The cell suspension was subjected to UV irradiation at 365 nm for 15 min at 4°C; 2× SDS sample loading buffer was added, and samples were analyzed by immunoblotting.
Identification of RcsF G60pBPA Cross-Linked Peptides. A culture expressing RcsF-Strep G60pBPA was grown in 1.5 L LB with 0.16 mM pBPA and subjected to in vivo UV cross-linking as described above. RcsF-Strep G60pBPA was subjected to purification and UPLC-MS as described in SI Appendix, Experimental Procedures. pBPA cross-linked peptides were mapped with the StavroX 3.2.2 software (69). To validate cross-linked peptide matches made by StavroX, we determined the most abundant G60pBPA-containing peptide matched with high confidence by the SEQUEST HT search engine within Proteome Discoverer from a purified, non-cross-linked RcsF G60pBPA protein sample. We then programmed the mass of this peptide into SEQUEST HT as a variable modification specified at any residue. The RcsF-OmpC and RcsF-OmpF cross-linked peptides found by SEQUEST HT were identical to those mapped by the StavroX software. Details of the StavroX and SEQUEST HT searches are described in SI Appendix, Experimental Procedures.
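The logic of treating the RcsF peptide as a variable modification can be illustrated with a short mass calculation. This is a sketch under stated assumptions, not the search configuration used in the study: it assumes that the benzophenone cross-link joins the two peptides without any net gain or loss of atoms, so the expected precursor mass is the sum of the two peptide masses. The peptide masses and function names below are hypothetical placeholders.

```python
PROTON_MASS = 1.007276  # Da, monoisotopic mass of a proton


def crosslinked_mass(omp_peptide_mass: float, rcsf_peptide_mass: float) -> float:
    """Neutral monoisotopic mass of the cross-linked peptide pair.

    Assumes the benzophenone cross-link adds the two peptides together with
    no net gain or loss of atoms, so the RcsF peptide acts as a
    "modification" whose mass shift equals its own mass.
    """
    return omp_peptide_mass + rcsf_peptide_mass


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical masses (Da); the real values come from the identified peptides.
pair = crosslinked_mass(omp_peptide_mass=1500.0, rcsf_peptide_mass=2000.0)
for z in (2, 3, 4):
    print(f"z = {z}: m/z = {mz(pair, z):.4f}")
```

In this picture, a search engine that allows a modification of mass `rcsf_peptide_mass` at any residue of an OMP peptide is effectively searching for the cross-linked pair.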
In Vitro Detergent Folding of OmpA in the Presence of RcsF. FLAG-OmpA was expressed and purified as previously described (70). RcsF-Strep K40pBPA and RcsF-Strep D72pBPA were purified as described in SI Appendix, Experimental Procedures. In vitro refolding of OmpA was performed under two different conditions. (1) FLAG-OmpA (5 μM) in 8 M urea was 10-fold diluted in TBS with 0.1% Triton X-100 and incubated at room temperature for 2 h. Purified RcsF-Strep K40pBPA or RcsF-Strep D72pBPA in TBS with 0.1% Triton X-100 was then added to a final concentration of 0.5 μM. For the control, an equal volume of buffer was added instead of RcsF. Reactions were incubated for 2 h at room temperature and then were split in half, and one sample was subjected to UV cross-linking at 365 nm for 15 min at 4°C; 4× SDS sample loading buffer was added to each sample, which was split in half again, and one half was boiled for 10 min. Samples were analyzed by immunoblotting. (2) FLAG-OmpA (5 μM) in 8 M urea was 10-fold diluted in TBS with 0.1% Triton X-100 containing 0.5 μM of purified RcsF-Strep K40pBPA or RcsF-Strep D72pBPA and incubated at room temperature for 2 h. Samples were subjected to UV cross-linking and analyzed as described above.
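The final concentrations in the refolding reaction follow directly from the 10-fold dilution. The sketch below works them out from the values in the procedure; it ignores the small additional volume contributed when RcsF is added after refolding in condition 1, which is a simplifying assumption, and the helper name is hypothetical.

```python
def dilute(concentration: float, fold: float) -> float:
    """Concentration after a `fold`-fold dilution (same units as the input)."""
    return concentration / fold


ompa_um = dilute(5.0, 10)   # FLAG-OmpA: 5 uM stock, 10-fold dilution
urea_m = dilute(8.0, 10)    # urea carried over from the 8 M stock
rcsf_um = 0.5               # final RcsF concentration given in the text

print(f"FLAG-OmpA: {ompa_um} uM, residual urea: {urea_m} M")
print(f"RcsF:OmpA molar ratio = {rcsf_um / ompa_um:.1f}")
```

With these numbers, RcsF and OmpA are present at approximately equimolar concentrations in both conditions, so the two conditions differ mainly in whether RcsF is present while OmpA folds.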