Superantigenic character of an insert unique to SARS-CoV-2 spike supported by skewed TCR repertoire in patients with hyperinflammation

Significance
A hyperinflammatory syndrome reminiscent of toxic shock syndrome (TSS) is observed in severe COVID-19 patients, including children with Multisystem Inflammatory Syndrome in Children (MIS-C). TSS is typically caused by pathogenic superantigens stimulating excessive activation of the adaptive immune system. We show that SARS-CoV-2 spike contains sequence and structure motifs highly similar to those of a bacterial superantigen and may directly bind T cell receptors. We further report a skewed T cell receptor repertoire in COVID-19 patients with severe hyperinflammation, in support of such a superantigenic effect. Notably, the superantigen-like motif is not present in other SARS family coronaviruses, which may explain the unique potential for SARS-CoV-2 to cause both MIS-C and the cytokine storm observed in adult COVID-19.


Supplemental Information and Supplemental Figures

Generation of a binary complex between SARS-CoV-2 spike and T cell receptor (TCR)
The SARS-CoV-2 spike model in the prefusion state was generated using SwissModel (1) based on the resolved cryo-EM structure (Protein Data Bank (PDB): 6VSB) (2) of the S glycoprotein, in which one of the receptor binding domains (RBDs) is in the up conformation and the other two are in the down conformation. The structure of the T cell receptor (TCR), containing both α- and β-chains, was taken from the crystal structure (PDB: 2XN9) of the ternary complex resolved for human TCR, staphylococcal enterotoxin H (SEH) and human major histocompatibility complex class II (MHCII) molecule (3). Using the protein-protein docking software ClusPro (4), we constructed in silico a series of binary complexes for SARS-CoV-2 spike and TCR. Clustering the ~1000 models generated by ClusPro yielded 30 clusters of conformations for spike-TCR binary complexes. The clusters were rank-ordered by cluster size, as recommended (4). We analyzed all models and found that in a large fraction the TCR bound to spike via its constant domain. Given that the constant domain is proximal to the cell membrane and that the TCR employs its variable domain for binding superantigens (SAgs) and/or antigen/MHC complexes (3), we then added restraints to our docking simulations to select those conformers in which the variable domain binds to the spike. This led to 27 clusters (based on a set of 666 models) from ClusPro. Interestingly, in 45% of the generated models, the TCR was observed to bind to a spike epitope that contained the "PRRA" insert, and in 46% of models we observed an interaction between the TCR and one or two of the three RBDs.
Thus, we identified two hot spots for TCR binding within the SARS-CoV-2 spike: one overlapping with the "PRRA" insert and the other on the RBD surface. Representative members of the top-ranking clusters are presented in Fig. S1. Panels A and B illustrate two cases where the TCR tightly binds to the PRRA insert region (of monomer 2 (dark red) and monomer 1 (gray), respectively), and panels C and D illustrate two cases where the TCR binds to RBDs.
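The percentages quoted above come from assigning each docked model to the spike site it contacts. A minimal sketch of such a tally is given here, assuming that a list of spike residue numbers in contact with the TCR has already been computed for each model. The RBD boundaries (residues 319-541), the function names and the example contact lists are illustrative assumptions, not part of the original protocol; chain identity is ignored for simplicity.

```python
from collections import Counter

# E661-R685 is the PRRA insert region named in the Fig. S1 caption.
PRRA_REGION = set(range(661, 686))
# Assumed RBD boundaries (residues 319-541); not stated in this text.
RBD_REGION = set(range(319, 542))


def classify_model(contact_residues):
    """Return the set of site labels touched by one docking model."""
    contacts = set(contact_residues)
    labels = set()
    if contacts & PRRA_REGION:
        labels.add("PRRA")
    if contacts & RBD_REGION:
        labels.add("RBD")
    return labels if labels else {"other"}


def site_fractions(models):
    """Fraction of models contacting each site; one model may count for both."""
    counts = Counter()
    for contacts in models:
        counts.update(classify_model(contacts))
    return {label: n / len(models) for label, n in counts.items()}


# Hypothetical contact lists (spike residue numbers close to the TCR).
example_models = [[680, 683, 690], [403, 505], [1100, 1105], [670, 350]]
fractions = site_fractions(example_models)
```

Because a single model can contact both sites, the fractions need not sum to one, which is consistent with reporting the PRRA and RBD percentages separately.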

Generation of a binary complex between SARS-CoV spike and TCR
The SARS-CoV (SARS1) spike model in the prefusion state was generated using SwissModel (1) based on the cryo-EM structure resolved for SARS-CoV spike (PDB: 6ACD) (5), in which one of the RBDs is in the up conformation and the other two are in the down conformation. Following the same approach as for SARS-CoV-2 spike, we constructed in silico a series of binary complexes for SARS-CoV spike and TCR using ClusPro (4). The same filtering procedure led to 30 clusters (based on 686 models), among which 38% showed binding of TCR to multiple RBDs (see Fig. S2A), similar to the behavior observed for SARS-CoV-2 (Fig. S1C-D). In contrast, 48% of the models showed binding of TCR to two S2 subunits near the C-terminal domain of the trimers (see Fig. S2B). No significant binding of TCR near the S1/S2 cleavage site RS668 of the SARS1 spike was observed. Note that the residues S664LLRS668 of SARS1 spike, which are sequentially aligned against the SARS-CoV-2 spike segment T678NSPRRARS686 containing the "PRRA" insert (see Fig. 3 panel A), lack the polybasic character of their SARS-CoV-2 counterpart. The lack of TCR binding to this region is consistent with the absence from SARS-CoV of this motif, which serves as a strong attractor in SARS-CoV-2.

Figure S1: Top-ranking complexes between SARS-CoV-2 spike and T-cell receptor (TCR) predicted by ClusPro. (A-B) Binding of TCR near the "PRRA" insert region of monomer 2 (dark red) and monomer 1 (gray) in panels A and B, respectively, showing that the PRRA insert and its close vicinity present a high-affinity binding site for TCR. (C-D) Binding of TCR near the RBD of a subunit, indicating that the RBD is an alternative high-affinity site. The spike trimer subunits are colored dark red, orange, and gray. The PRRA insert region (E661 to R685) is colored yellow. The TCR α- and β-chains are shown in magenta and cyan, respectively. See more details on the interaction between the PRRA insert region and TCR in Fig. 1.
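The contrast in polybasic character between the aligned SARS1 and SARS-CoV-2 segments can be made concrete by counting basic residues. The sketch below is illustrative only: treating arginine and lysine as the basic residues is an assumption, and the SARS-CoV-2 segment is written out in full for residues 678-686 (TNSPRRARS, including the N679 referred to elsewhere in the text).

```python
# Assumption: "basic" means arginine (R) or lysine (K); histidine is excluded.
BASIC_RESIDUES = set("KR")


def basic_count(segment):
    """Number of basic residues in a one-letter amino acid segment."""
    return sum(1 for aa in segment if aa in BASIC_RESIDUES)


def basic_fraction(segment):
    """Fraction of residues in the segment that are basic."""
    return basic_count(segment) / len(segment)


sars1_segment = "SLLRS"      # SARS1 spike residues 664-668
sars2_segment = "TNSPRRARS"  # SARS-CoV-2 spike residues 678-686

sars1_basic = basic_count(sars1_segment)
sars2_basic = basic_count(sars2_segment)
```

With these inputs the SARS1 segment carries a single arginine, whereas the SARS-CoV-2 segment carries three, all within or next to the PRRA insert.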

Generation of a binary complex between MERS-CoV spike and TCR
The MERS-CoV spike model was generated using SwissModel (1) based on the cryo-EM structure resolved for MERS-CoV spike (6) (PDB: 5X5F), in which one of the RBDs is in the up conformation. ClusPro (4) predicted 30 clusters (based on 588 models). In 56% of the models, TCR bound to the RBDs. Two representative poses from these most populated clusters are shown in Fig. S2C-D; they are comparable to those observed for SARS-CoV-2 spike (Fig. S1C-D) and SARS-CoV spike (Fig. S2A). Simulations also indicated that TCR could bind near the S1/S2 cleavage site region of MERS-CoV spike (segment D726-R751; counterpart of SARS-CoV-2 E661-R685 at the C-terminus of subunit S1). Note that in this region the PRRA insert of SARS-CoV-2 spike is replaced by the MERS-CoV spike sequence PRSV. The region near PRSV shows a tendency to bind TCR, but it is weaker than that of SARS-CoV-2 spike due to the lack of the critical residues (e.g. N679 and R683 in SARS-CoV-2 spike) that are involved in the interface the spike makes with the TCR. The lack of polybasic residues at this sequence motif, as well as of counterparts of N679 and R683 of SARS-CoV-2 spike, renders this structural region less attractive to TCRs, suggesting that MERS-CoV might not harbor a superantigen-like motif near its S1/S2 cleavage site.

(8) by Li et al (2003) to be related to neurotoxins (rows # 3, 5 and 7; colored green), intercellular adhesion molecule (ICAM)-1 like protein (rows # 2 and 4; colored orange), bullous pemphigoid antigen 1-e (row # 9; white) and other bioactive molecules (remaining rows). We display the optimally aligned SARS-CoV-2 sequence under the SARS-CoV sequence in each of the 9 motifs, as well as the corresponding residue ranges. The last column shows the sequence identity between the two sequence fragments. The conserved ICAM-1 like motif (residues 279-301 in SARS-CoV-2 spike) contains residues that interact with TCR Vα (marked in red). See Fig. 5 for more information on the neurotoxin-like motif in row 5.
Note that the neurotoxin-like sequence #5, residues 299-351, contains several fragments (15-mers) that were recently shown (7) to stimulate T cell reactivity (illustrated in Fig. S4).

Figure S4: Position of SARS-CoV-2 S cross-reactive epitopes identified (7) in people who have not been exposed to SARS-CoV-2, which overlap with the neurotoxin-like fragment 299-355 identified here to have a strong affinity to bind TCRs. The positions of eight cross-reactive epitopes (15-mers each, with the starting amino acid shown in each case) that were recognized by CD4+ T cells are indicated by blue and red bars. In each case the corresponding reactivity strength (SFC/10^6 cells) and the number of donors (out of a total of 16) who showed this type of 'memory' response (presumably due to earlier human coronavirus infections) are written. Two of the epitopes were found in our docking simulations to bind TCRs (see Fig. 5). Note that this is one of three neurotoxin-like regions on SARS-CoV-2 spike (highlighted in green in Figs. S3 and 5A). The other two regions also contained epitopes that were cross-reactive, but this one was distinguished by its high frequency (fraction of donors) and high strength (SFC).

Generation of a ternary complex between SARS-CoV-2 spike, TCR, and MHCII
The 3-dimensional structure of the human MHCII was taken from the crystal structure of the ternary complex (3) (PDB: 2XN9) between human TCR, SEH and MHCII. First, we performed docking simulations to generate binary complexes between MHCII and SARS-CoV-2 spike. Six representative MHCII-spike binary complexes were selected to explore further docking of TCR to form a ternary complex. We analyzed all predicted ternary complex models of MHCII-Spike-TCR. Potential ternary MHCII-Spike-TCR complex models were selected following three filtering criteria: (i) TCR binds the "PRRA" insert region or the RBD; (ii) the binding region preferably includes one or more of SARS-CoV-2 spike segments that are sequentially homologous to the superantigen or toxin binding motifs predicted for SARS-CoV (Fig. S3); (iii) MHCII and TCR are in close proximity. These filters led to the MHCII-Spike-TCR complex model illustrated in Fig. S5A.
Interestingly, the SARS-CoV-2 spike binding region harbors three residues that have recently been reported to have mutated in new strains from Europe and the USA (9, 10) (Fig. S5B): D614G, A831V and D839Y/N/E. While we do not exclude the possible occurrence of other potential ternary complexes, especially those involving the RBDs, we focused here on the complex shown in Fig. S5, which uniquely satisfied the three criteria.
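The three selection criteria can be read as two hard filters, (i) and (iii), plus one preference, (ii). The sketch below illustrates that reading under stated assumptions: the record fields, the 10 Å proximity cutoff and the example models are hypothetical, since the text does not give a numerical definition of "close proximity".

```python
from dataclasses import dataclass


@dataclass
class TernaryModel:
    name: str
    tcr_site: str             # "PRRA", "RBD", or "other"
    homologous_segments: int  # interface segments of the kind listed in Fig. S3
    mhc_tcr_distance: float   # minimal MHCII-TCR distance in angstroms


def select_models(models, max_distance=10.0):
    """Apply criteria (i) and (iii) as filters, then rank by criterion (ii).

    max_distance is a hypothetical cutoff chosen for illustration.
    """
    kept = [
        m for m in models
        if m.tcr_site in {"PRRA", "RBD"} and m.mhc_tcr_distance <= max_distance
    ]
    # Criterion (ii) is a preference, so it orders rather than removes models.
    return sorted(kept, key=lambda m: m.homologous_segments, reverse=True)


# Hypothetical candidates for illustration only.
candidates = [
    TernaryModel("m1", "PRRA", 2, 6.5),
    TernaryModel("m2", "RBD", 0, 8.0),
    TernaryModel("m3", "other", 3, 4.0),
    TernaryModel("m4", "PRRA", 1, 25.0),
]
selected = select_models(candidates)
```

Ranking by criterion (ii) instead of filtering on it mirrors the word "preferably" in the description.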

In silico mutagenesis of D839 of SARS-CoV-2 spike
We mutated D839 of the SARS-CoV-2 spike in silico to asparagine, glutamic acid and tyrosine, in line with the aforementioned mutants D839Y/N/E observed in a new strain from Europe. To this aim, we used the PyMOL mutagenesis tool (11) and evaluated the change in local conformation and energetics in the complex formed with TCR. The most probable rotamers were selected and energetically minimized in the presence of the bound TCR (conformation shown in Fig. 1) using OpenMM (12). The resulting conformations are shown in Fig. S6. These were further subjected to short (1 ns) molecular dynamics (MD) simulations for equilibration and energy minimization under the AMBER14 ff14SB force field (13). Five independent runs were carried out for each mutant (Y, N, or E at position 839) as well as for the wild-type (D839) spike, to assess the statistical significance of the results in each case. Binding affinities (ΔG) were evaluated for the final conformations of (i) the full complex (with the intact spike and entire TCR as interactors) or (ii) a single spike subunit and TCR Vβ, at 37 °C using the PRODIGY server (14,15). The results are presented in Table S1.
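For readers who want to relate the reported binding free energies to dissociation constants, the standard thermodynamic relation Kd = exp(ΔG / RT) can be applied at 37 °C. The sketch below shows this conversion together with a mean and standard deviation over five runs. The ΔG values are hypothetical placeholders, not values from Table S1, and the function names are illustrative.

```python
import math
from statistics import mean, stdev

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def dg_to_kd(delta_g, temp_celsius=37.0):
    """Convert a binding free energy (kcal/mol) to a dissociation constant (M)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(delta_g / (R_KCAL * temp_kelvin))


def summarize(dg_values):
    """Mean and sample standard deviation over independent runs."""
    return mean(dg_values), stdev(dg_values)


# Hypothetical binding free energies from five runs (kcal/mol).
example_runs = [-10.2, -9.8, -10.0, -10.4, -9.6]
avg_dg, sd_dg = summarize(example_runs)
kd = dg_to_kd(avg_dg)
```

A more negative ΔG gives a smaller Kd, so differences of roughly 1.4 kcal/mol at this temperature correspond to about a tenfold change in affinity.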

Analysis of NGS immunosequencing data from COVID-19 patients
Blood collection from 38 patients (42 samples) with mild/moderate COVID-19, and 8 patients (24 samples) with severe/hyperinflammatory COVID-19 was performed under institutional review board approval number 2020-039. The patients and controls, and their immune repertoires, were part of a previously published cohort (16). For details of NGS data acquisition, please refer to our earlier work (16). Only productive TRB rearrangements were used and all repertoires were normalized to 20,000 reads. For the analyses, we used R version 3.5.1 for plotting of TRBV and TRBJ gene usage as previously described (17,18).
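The normalization and gene usage steps can be sketched as follows. This is an illustration in Python, not the R workflow used for the analysis, and it assumes that normalization means random subsampling of reads without replacement and that usage is computed per read; the text does not specify either choice. Function names and the example repertoire are hypothetical.

```python
import random
from collections import Counter


def downsample_genes(clonotypes, depth=20000, seed=0):
    """Subsample reads to a fixed depth and count reads per TRBV gene.

    clonotypes: iterable of (trbv_gene, read_count) pairs, one per
    productive TRB rearrangement.
    """
    reads = [gene for gene, count in clonotypes for _ in range(count)]
    if len(reads) < depth:
        raise ValueError("repertoire has fewer reads than the target depth")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return Counter(rng.sample(reads, depth))


def gene_usage(gene_counts):
    """Convert per-gene read counts to fractions of the repertoire."""
    total = sum(gene_counts.values())
    return {gene: n / total for gene, n in gene_counts.items()}


# Hypothetical repertoire with three clonotypes.
example = [("TRBV5-6", 30000), ("TRBV14", 15000), ("TRBV24-1", 5000)]
counts = downsample_genes(example)
usage = gene_usage(counts)
```

Subsampling every repertoire to the same depth keeps gene usage fractions comparable between samples that were sequenced to different depths.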

Generation of complexes between SARS-CoV-2 spike, SAg-specific TCRs and MHC II
Four TCR Vβ genes (TRBV5-6, TRBV14, TRBV13 and TRBV24-1) were found to be overrepresented in severe/hyperinflammatory COVID-19 patients (Fig. 6). We investigated the binding properties of the TCRs encoded by those genes. To this aim, we extracted from UniProtKB (23) the amino acid sequences corresponding to these genes and used them in FASTA format to search for corresponding structures, if any, in the Protein Data Bank (PDB) (19), using SwissModel (1). We found structural data in the PDB for the TCR Vβ chains of three of the genes: TRBV5-6 (UniProt id: A0A599), TRBV14 (A0A5B0), and TRBV24-1 (A0A075B6N3). The respective PDB ids are 6ULR (20), 2ESV (21), and 6EH6 (22). These structures contain both α- and β-chains, and their Vβ domains have 95-100% sequence identity with the Vβ chains encoded by the respective TRBV genes. These PDB structures were used in docking simulations with ClusPro (4) to examine their binding properties with respect to the SARS-CoV-2 spike. For each of the TCRs complexed with the spike, 30 clusters (obtained upon grouping ~700 models) were generated, and in each case there were 3 or more clusters where the TCR was bound to the SAg. Fig. S8 panels A-C display representative conformers from these clusters. Panel D displays the multiple sequence alignment generated for the TCR Vβ chains (with a few residues of the constant domain succeeding the CDR3). The binding paratopes are indicated by color-coded bars above the alignment. Simulations using the same protocol as the one adopted for generating Fig. S5 predicted that ternary complexes with MHCII were also energetically favorable in all three cases. Panels E and F illustrate the ternary complexes with MHCII for the TCRs corresponding to TRBV5-6 and TRBV14.

[Figure legend: severe/hyperinflammatory COVID-19, n=24; age-matched healthy donor, n=23]

(24). The TCR paratopes that bind to the SAg-like site on spike are indicated by the color-coded bars, and the CDRs are highlighted in green. Note that there is an additional segment, highlighted in green, which also includes residues making interfacial contacts with the SAg region of the spike, despite its sequence heterogeneity. (E-F) Ternary complexes with MHCII predicted for the overrepresented TCRs, illustrated for two cases.

[Figure panel labels: in silico models for TCR-Spike binary complexes; TCR-MHCII-S ternary complex; MHCII]