Rational design of ASCT2 inhibitors using an integrated experimental-computational approach

Significance The glutamine transporter ASCT2 is an emerging therapeutic target for various cancer types. Here, we use an integrated computational and experimental approach to develop unique ASCT2 inhibitors targeting a conformational state useful for rational drug design. We apply computational chemistry tools such as molecular docking and molecular dynamics simulations, in combination with structure determination with cryo-electron microscopy and synthetic chemistry, to design multiple ASCT2 inhibitors. Our results reveal a unique mechanism of stereospecific inhibition of ASCT2 and highlight the utility of combining state-of-the-art computational and experimental approaches in characterizing challenging human membrane protein targets.

loop was contoured at 2 σ. To highlight the poorly resolved density for HP2, which is indicative of higher flexibility, the same region is contoured at 4 σ and marked by a black rectangle.
Superposition of the homology model (green cartoon with purple Lc-BPE) and initial cryo-EM structures (tan cartoon with dark gray Lc-BPE) with (A) "ligand up" and (B) "ligand down" conformations, showing the steric clash between the S354 sidechain and Lc-BPE. (C) Superposition of refined "ligand up" and "ligand down" structures with the homology model. (D) Enrichment plots of the ASCT2 homology model (green) and the initial "ligand down" structure (tan) and after sidechain refinement (gray). The plot for a random selection of ligands is represented by the blue dashed line.
"Ligand up" clusters in red (69%), salmon (7%) and orange (6%). (C) "Ligand down" clusters in dark blue and cyan represent 13% of all clusters. The purple cluster represents an outlier cluster that is outside of the binding site and is unlikely to be physiological. (D) Stereochemical quality of the cryo-EM structure and metainference ensemble. We assessed the stereochemical quality of the cryo-EM structure (cyan, model 0) as well as of the six representative clusters from the metainference ensemble (red, models 1-6) using the MolProbity score 3 . This score provides a global measure of the quality of the models in terms of the number of steric clashes, rotamer outliers, and percentage of backbone Ramachandran conformations outside favored regions. The lower the score, the better the quality of the models.

Fig. S11. Superposition of ASCT2 "ligand down" and EAAT1 (PDB ID: 5MJU) structures. ASCT2 is shown in gray cartoon with Lc-BPE represented by pink sticks. EAAT1 is shown in green cartoon with the EAAT1 inhibitor TFB-TBOA represented by purple sticks. Key residues of the binding site are highlighted with sticks and labeled with the residue numbers of ASCT2 followed by EAAT1.
Name marks the compound name.
Structure corresponds to the 2D structure of the compound drawn with PerkinElmer ChemDraw.
Binding affinity marks the experimentally determined Ki values.
Docking score marks the docking score of the top-scoring pose using Glide.
MM-GBSA corresponds to the predicted free energy of binding, calculated with the docking parameters specified in the docking score column.
Docking pose indicates whether the ligand occupies a pose similar to the inhibitor in the "ligand up" or "ligand down" structures. Poses labeled "other" mean that the ligand pose was poor and did not reflect either the "ligand up" or the "ligand down" conformation.
* IC50 (µM) measured in transport assay in proteoliposomes in the presence of 5 µM glutamine.
† Molecular docking to the "ligand down" structure without constraints.
‡ Molecular docking to the "ligand up" structure without constraints.
"No current" indicates that no outward current (i.e., inhibition of anion leak current) was observed in these experiments. This means that the compound either does not bind or binds but is unable to block the leak anion current.
Superscripts §, ||, ¶ and # indicate the highest inhibitor concentration tested in µM, where 20 ≤ § ≤ 50, 100 ≤ || ≤ 400, 500 ≤ ¶ ≤ 1000 and 1500 ≤ # ≤ 2500.
A double asterisk (**) indicates that additional experiment(s) were performed in the presence of substrate (alanine or glutamate).

Supplementary Materials and Methods
We have previously developed homology models of human ASCT2 (hASCT2) based on the outward-open structures of the human EAAT1 transporter (EAAT1; 46% sequence identity) 4,5 . Here, we iteratively refined this model based on its ability to discriminate ligands from decoy compounds in ligand enrichment calculations. The sidechains of the binding site residues S354, D464, and C467 were remodeled on a fixed backbone using PyMOL 6 and SCWRL4 7 , guided by recently determined structures of hASCT2 in multiple conformations. PyMOL mutagenesis was performed with backbone-dependent rotamers, and SCWRL4 was run with default parameters. Enrichment was calculated using a library of 29 known ASCT2 ligands, including substrates and inhibitors collected from the literature 1,8-12 and ChEMBL 13 , and 1,434 decoys generated with the DUD-E server 14 . Docking was performed with OpenEye FRED 15 as described in our previous work 5 . Hydrogen-bond constraints were applied to S351 and S353 (HP1) for docking against the model, and to D464 and N471 (TM8) for docking against the refined structure.
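The enrichment metrics used to compare models can be illustrated with a short sketch. It ranks ligands and decoys by docking score (lower taken as better), builds the ROC curve, and reports the AUC and a logAUC in percent. The logAUC definition used here (area under the curve on a log10 decoy-fraction axis from 0.1% to 100%, normalized by the axis length) is one common convention and is an assumption; the function names are hypothetical and this is not the authors' script.

```python
import numpy as np

def roc_points(ligand_scores, decoy_scores):
    """Rank all compounds by score (lower = better) and return (fpr, tpr)."""
    scores = np.concatenate([ligand_scores, decoy_scores])
    is_ligand = np.concatenate([
        np.ones(len(ligand_scores), dtype=bool),
        np.zeros(len(decoy_scores), dtype=bool),
    ])
    # Stable sort: on tied scores ligands come first (an optimistic choice).
    order = np.argsort(scores, kind="stable")
    hits = is_ligand[order]
    tpr = np.concatenate([[0.0], np.cumsum(hits) / hits.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~hits) / (~hits).sum()])
    return fpr, tpr

def _trapezoid(x, y):
    """Trapezoidal integration of y over x."""
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

def auc_percent(fpr, tpr):
    """Area under the ROC curve, in percent (50 is random)."""
    return 100.0 * _trapezoid(fpr, tpr)

def log_auc_percent(fpr, tpr, lowest=1e-3, n_grid=2000):
    """Area under the ROC curve on a log10 x-axis from `lowest` to 1, in percent."""
    grid = np.logspace(np.log10(lowest), 0.0, n_grid)
    # Step-function lookup: ligand fraction found once at most x of decoys are found.
    idx = np.searchsorted(fpr, grid, side="right") - 1
    return 100.0 * _trapezoid(np.log10(grid), tpr[idx]) / -np.log10(lowest)

if __name__ == "__main__":
    # Toy scores only; not data from this study.
    fpr, tpr = roc_points([-9.1, -8.4, -5.0], [-7.9, -6.2, -6.0, -4.1, -3.3])
    print(auc_percent(fpr, tpr), log_auc_percent(fpr, tpr))
```

Because the logAUC weights the earliest part of the ranked list most heavily, it rewards models that place known ligands at the very top of the list, which is the regime that matters in virtual screening.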

Molecular docking with Schrödinger
All docking calculations were performed with the Schrödinger suite using Glide v19-3 16 .
In brief, for the initial compound series (  16 . Compounds for the second compound series (Table 3) were docked in the unrefined and refined "ligand up" and "ligand down" cryo-EM structures, with and without constraints, using Glide v19-3. Both protein and ligand preparation were carried out as described above, and grid generation was performed with the Maestro Receptor Grid Generation panel using the coordinates of the reference ligand (Lc-BPE) from the cryo-EM structure. In general, the "best" overall docking scores and poses came from the refined structures without constraints (Table 3).

Relative binding affinity prediction
We estimated the relative binding affinity between the compounds and the ASCT2 model binding site with the molecular mechanics generalized Born surface area (MM-GBSA) method. These calculations were relative, with a more negative value indicating a better binding affinity. Here we used MM-GBSA with Prime from the Schrödinger suite (v18-4) 16 . The model was prepared as described above, except that the reference ligands were removed beforehand, and we used the docked ligands from above as input. Standard parameters were used, with the distance from the ligand set to 5 Å for all calculations. For the second compound series, the best docking poses and scores for the "ligand up" and "ligand down" structures were calculated using docking to the refined structures without constraints.
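For orientation, the generic end-point form of an MM-GBSA estimate can be written as below. This is the textbook decomposition and is given here as an assumption about the general method; the exact energy terms and minimization protocol used by Prime are not specified in this section.

```latex
\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - \left( G_{\mathrm{receptor}} + G_{\mathrm{ligand}} \right),
\qquad
G_{X} = E_{\mathrm{MM}}(X) + G_{\mathrm{GB}}(X) + G_{\mathrm{SA}}(X)
```

Here $E_{\mathrm{MM}}$ is the molecular mechanics energy, $G_{\mathrm{GB}}$ the generalized Born polar solvation term, and $G_{\mathrm{SA}}$ the nonpolar surface-area term. Because entropic contributions are typically omitted, the values are best read as a relative ranking, consistent with the statement that a more negative value indicates better binding.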

MD simulations
The atoms of chain A and of the associated ligand were extracted from the "ligand up" cryo-EM structure. CHIMERA 17 was used to select the cryo-EM density within 5 Å of the model to be used as input for the metainference simulation of the ASCT2 monomer. The starting model was prepared using CHARMM-GUI 18 . A total of 113 POPC lipids were added to the system along with 11554 water molecules in a triclinic periodic box of volume equal to 616 nm³. To ensure charge neutrality and a salt concentration of 0.15 M, 31 potassium and 30 chloride ions were added. The CHARMM36 force field 19 was used for the protein, lipids, and ions; the CHARMM General Force Field and the TIP3P model were used for the ligand and water molecules, respectively. A 30-ns-long equilibration was performed following the standard CHARMM-GUI protocol, consisting of multiple consecutive simulations in the NVT and NPT ensembles. During these equilibration steps, harmonic restraints on the positions of the lipid, ligand, and protein atoms were gradually switched off.
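As a rough consistency check on the ion counts, the number of salt pairs expected at 0.15 M can be estimated from the number of water molecules. The sketch below assumes a bulk water molarity of about 55.5 mol/L and uses hypothetical function names; it is not the procedure CHARMM-GUI applies internally.

```python
WATER_MOLARITY = 55.5  # mol/L, approximate value for liquid water (assumption)

def estimate_ion_pairs(n_water, salt_molar):
    """Estimate salt pairs needed to reach `salt_molar` in `n_water` waters."""
    return round(n_water * salt_molar / WATER_MOLARITY)

def implied_solute_charge(n_cations, n_anions):
    """Net charge the monovalent ions must cancel for a neutral system."""
    return -(n_cations - n_anions)

if __name__ == "__main__":
    # Values taken from the system description above.
    print(estimate_ion_pairs(11554, 0.15))   # roughly the number of K+/Cl- pairs
    print(implied_solute_charge(31, 30))     # net charge of protein plus ligand
```

The estimate of about 31 pairs is in line with the reported 31 potassium and 30 chloride ions, and the one-ion excess of potassium implies a net solute charge of -1.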
In the metainference simulation, a Gaussian noise model with one error parameter for each voxel of the cryo-EM map was used. These variables were marginalized to avoid their explicit sampling, as done in previous applications 20,21 . Sixteen replicas of the system were used, and their initial configurations were randomly selected from the last 10-ns-long step of the equilibration protocol. The metainference run was conducted for a total aggregated time of 1.8 μs. All simulations were carried out using GROMACS 2019.6 22 and the ISDB module 23 of the open-source, community-developed PLUMED 24 library (GitHub ISDB branch; https://github.com/plumed/plumed2/tree/isdb). For the analysis, the initial frames of the trajectory of each replica, corresponding to 20% of the total simulation time, were considered as additional equilibration steps under the cryo-EM restraint and discarded. The remaining conformations from all replicas were merged and clustered using: i) the root mean square deviation (RMSD) calculated on all the heavy atoms of the ligand and of the protein residues within 5 Å of the ligand in at least one member of the ensemble; ii) the gromos algorithm 25 with a cutoff equal to 2.5 Å.
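The analysis steps described above (discarding the first 20% of each replica, then gromos clustering on a pairwise RMSD matrix) can be sketched as follows. This is a minimal illustration that assumes the RMSD matrix has already been computed; the function names are hypothetical, and the authors' actual scripts are the ones deposited on PLUMED-NEST.

```python
import numpy as np

def discard_equilibration(frames, fraction=0.2):
    """Drop the leading `fraction` of a replica's frames."""
    return frames[int(len(frames) * fraction):]

def gromos_cluster(rmsd, cutoff):
    """Gromos-style clustering on a symmetric (n, n) RMSD matrix.

    Repeatedly pick the structure with the most neighbors within `cutoff`
    among the structures still unassigned, make it a cluster center together
    with those neighbors, and remove them from the pool.
    Returns a list of (center_index, member_indices), largest cluster first.
    """
    rmsd = np.asarray(rmsd)
    neighbors = rmsd <= cutoff            # each structure neighbors itself
    active = np.ones(rmsd.shape[0], dtype=bool)
    clusters = []
    while active.any():
        counts = (neighbors & active[None, :]).sum(axis=1)
        counts[~active] = -1              # assigned structures cannot be centers
        center = int(np.argmax(counts))
        members = np.flatnonzero(neighbors[center] & active)
        clusters.append((center, members.tolist()))
        active[members] = False
    return clusters

if __name__ == "__main__":
    # Toy 1D "conformations"; the absolute difference stands in for RMSD.
    x = np.array([0.0, 0.1, 0.2, 5.0, 5.1])
    toy_rmsd = np.abs(x[:, None] - x[None, :])
    for center, members in gromos_cluster(toy_rmsd, cutoff=0.25):
        print(center, members)
```

Cluster populations, such as the percentages quoted in the figure legend, follow directly from the member counts divided by the total number of retained frames.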
PDB of the starting model, topology files, GROMACS inputs for equilibration and production runs, PLUMED input files for the metainference simulation, and analysis scripts are available on PLUMED-NEST 24 (www.plumed-nest.org) under accession code plumID:20.015.

Electrophysiological techniques
Electrophysiological experiments were performed as described previously 1,2 . Stock solutions of inhibitor were prepared in dimethyl sulfoxide (DMSO) up to 100 mM. Dilutions to working concentrations were made using external buffer. The highest DMSO concentration used (2%) did not affect electrophysiological results in control cells. For rASCT2, hASCT2 and hASCT1, the external buffer contained 140 mM NaCl, 2 mM MgCl2, 2 mM CaCl2, and 10 mM HEPES, pH 7.40, while the internal pipette solution consisted of 130 mM NaSCN, 2 mM MgCl2, 10 mM EGTA, 10 mM HEPES and 10 mM alanine, pH 7.40. For specificity experiments with EAAT1, EAAT2, EAAC1 and EAAT5, the internal solution contained 10 mM glutamate instead of alanine. Compounds were applied to HEK293 cells expressing the DNA of interest, suspended from a current-recording electrode in the whole-cell configuration 44 , through a rapid solution exchange device described previously 45 . Cells were immersed in a bath of the external buffer used to dissolve the inhibitors. The open pipette resistance was between 3 and 6 MΩ. Series resistance was not compensated in these experiments because of the relatively small currents. Current traces were recorded using an Adams and List EPC7 amplifier and digitized using a Molecular Devices Digidata A/D converter.

Data analysis
Data analysis was performed as described previously 46 .

ASCT2 expression and purification
Human ASCT2 (hASCT2) was produced in the Pichia pastoris X-33 strain (Invitrogen) in a fermentor 26 and purified using DDM and CHS (Anatrace) and 1 mM L-glutamine (Merck) to maintain protein stability 27 . Peak elution fractions were immediately used in further procedures.

Reconstitution into proteoliposomes and transport assays
Freshly purified ASCT2 was reconstituted into liposomes composed of Escherichia coli polar lipids and egg phosphatidylcholine at a 3:1 ratio (w/w) and supplemented with 10% (w/w) cholesterol (Avanti Polar Lipids) 27 . For transport assays, proteoliposomes were loaded with 50 mM NaCl and 10 mM or 5 mM glutamine using three freeze-thaw cycles, then extruded 11 times through a 400-nm-diameter polycarbonate filter (Avestin), diluted in buffer D (20 mM Tris pH 7.0) and collected by ultracentrifugation (45 min, 442,907 × g, 4 °C). Proteoliposomes were resuspended in buffer D (~1 μg protein per 1.5 μl) and used in the transport assays, which were carried out in a water bath at 25 °C with constant stirring. Transport was initiated by diluting 1.5 μl of proteoliposomes in 80 μl of external buffer (50 mM NaCl and 50 μM or 5 μM [3H]glutamine (PerkinElmer) in 20 mM Tris pH 7.0). Inhibitors or equivalent amounts of DMSO were added to the external buffer mixture. At the indicated time points, the reaction was stopped by diluting the mixture in 2 ml of cold buffer D, filtered over a 0.45-μm pore-size filter (Protran BA-85, Whatman), washed with 2 ml of cold buffer D and filtered again. The level of radioactivity accumulated inside the proteoliposomes as a consequence of amino-acid exchange was counted using a PerkinElmer Tri-Carb 2800RT liquid scintillation counter after dissolving the filter in 2 ml of scintillation liquid (Emulsifier Scintillator Plus, PerkinElmer).

Reconstitution of ASCT2 in nanodiscs
An aliquot of mixture of E.coli polar lipids and egg phosphatidylcholine

Cryo-EM sample preparation and data collection
Freshly purified hASCT2 nanodiscs were concentrated to ~1 mg ml−1 using an Amicon Ultra-0.5 mL concentrating device (Merck) with a 100 kDa filter cut-off; 100 μM inhibitor was then added and the sample was incubated for 1 h on ice. A 2.8 μl aliquot of the sample was applied onto holey-carbon cryo-EM grids (Au R1.2/1.3, 300 mesh, Quantifoil) that had been glow discharged at 5 mA for 30 s. The grids were blotted for 3-4 s in a Vitrobot Mark IV (Thermo Fisher) at 15 °C and 100% humidity, plunge frozen into a liquid ethane/propane mixture and stored in liquid nitrogen until further use. Grid areas with the best ice properties were identified with the help of a self-written script that calculates the ice thickness 29 . Cryo-EM data in selected grid regions were collected in-house on a 200-keV Talos Arctica microscope (Thermo Fisher) with a post-column energy filter (Gatan) in zero-loss mode, with a 20-eV slit and a 100-μm objective aperture. Images were acquired automatically with EPU (Thermo Fisher) and SerialEM on a K2 Summit detector (Gatan) in counting mode at ×49,407 magnification (1.012 Å pixel size) and a defocus range from −0.9 to −1.9 μm. During an exposure time of 9 s, 60 frames were recorded with a total dose of about 53 electrons/Å². On-the-fly data quality was monitored using FOCUS software 30 .
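The exposure settings above imply a per-frame dose and a dose rate that are easy to derive. The short sketch below does this arithmetic from the stated values; the function name and the returned keys are hypothetical.

```python
def dose_summary(total_dose, n_frames, exposure_s, pixel_size_A):
    """Derive per-frame dose and dose rates from the acquisition settings.

    total_dose is in electrons per square angstrom; pixel_size_A in angstrom.
    """
    rate_per_area = total_dose / exposure_s            # e-/A^2/s
    return {
        "dose_per_frame": total_dose / n_frames,       # e-/A^2/frame
        "rate_per_area": rate_per_area,
        "rate_per_pixel": rate_per_area * pixel_size_A ** 2,  # e-/pixel/s
    }

if __name__ == "__main__":
    # 53 e-/A^2 over 60 frames in 9 s at 1.012 A per pixel, as stated above.
    for key, value in dose_summary(53.0, 60, 9.0, 1.012).items():
        print(f"{key}: {value:.2f}")
```

With these settings the dose is a little under 0.9 electrons/Å² per frame and about 6 electrons per pixel per second.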

Image processing
For the ASCT2 nanodiscs dataset in the presence of inhibitor Lc-BPE, 6,233 micrographs were recorded. Beam-induced motion was corrected with MotionCor2_1.2.1 31 and the CTF parameters were estimated with ctffind4.1.13 32 . Recorded micrographs were manually checked in FOCUS (1.1.0), and micrographs that were outside the defocus range (<0.4 and >2 μm), contaminated with ice or aggregates, or had a low-resolution estimate of the CTF fit (>4 Å) were discarded. The remaining 5,991 micrographs were imported into cryoSPARC v2.14.2 33 . Around 1,000 particles were manually picked to create templates for particle autopicking. In total, 3,666,842 particles were autopicked and extracted with a box size of 200 pixels. After 2D classification, 1,801,520 particles were left, and the majority of these (1,427,784) were used for several rounds of ab initio volume generation with C3 symmetry applied. The 404,284 particles of the best classes and the 373,736 particles remaining after 2D classification (778,020 particles in total) were exported from cryoSPARC, imported into RELION-3.0.8 34 and used in 3D classification with C3 symmetry applied, which resulted in a single best class with 300,899 particles (38.7%). These particles were used in a refinement job with C3 symmetry applied, in which the hASCT2 map generated in cryoSPARC, low-pass filtered to 15 Å, was used as a reference. In the last refinement iteration, a mask excluding the nanodisc was used and the refinement was continued until convergence (focus refinement), followed by a postprocessing job, which resulted in a map at 4 Å. Four rounds of per-particle CTF refinement and beam tilt refinement in RELION-3 8 improved the resolution to 3.43 Å.
A similar approach was used for the image processing of the ASCT2 nanodiscs dataset in the presence of ERA-21. In short, 9,788 micrographs were recorded, and 7,934 were used for image processing after selection. A total of 4,438,395 particles were autopicked and subjected to 2D classification in cryoSPARC. The 1,308,359 selected particles were imported into RELION 3.1.0, refined and subjected to several rounds of per-particle CTF refinement. These particles with corrected defocus values were used for subsequent 3D classification with C3 symmetry applied, after which 363,193 particles (28%) were refined, subjected to 2 rounds of per-particle CTF refinement and used in 3D classification. The best class included 224,884 particles (62%) and, after refinement and postprocessing, resulted in a map at 3.61 Å. Three rounds of per-particle CTF refinement in RELION-3 8 improved the resolution to 3.37 Å.
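The particle counts and retained fractions quoted for the two datasets can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the helper name is hypothetical.

```python
def percent(part, whole):
    """Fraction of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole

if __name__ == "__main__":
    # Lc-BPE dataset: particles not used for ab initio plus best-class particles.
    remaining_2d = 1_801_520 - 1_427_784
    exported = 404_284 + remaining_2d
    print(remaining_2d, exported, round(percent(300_899, exported), 1))

    # ERA-21 dataset: fractions kept after the two 3D classification steps.
    print(round(percent(363_193, 1_308_359)), round(percent(224_884, 363_193)))
```

The sums reproduce the 373,736 and 778,020 particles reported for the Lc-BPE dataset, and the ratios reproduce the quoted 38.7%, 28% and 62%.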
To check for conformational heterogeneity, we performed 3D classifications without imposed C3 symmetry at different stages of image processing, and we did not find other conformations. We also performed 3D classifications of individual protomers after symmetry expansion and signal subtraction to check for conformational heterogeneity within the trimer. All particles clustered in one class, indicating the presence of only one protein conformation within the trimer. Bayesian polishing in RELION3 35 did not lead to further improvement in map resolution. The resolution was estimated using the 0.143 cut-off criterion 36 with gold-standard Fourier shell correlation (FSC) between two independently refined half-maps 37 . During post-processing, high-resolution noise substitution was used to correct for convolution effects of real-space masking on the FSC curve 38 . The directional resolution anisotropy of the density map was quantitatively evaluated using the 3DFSC web interface (https://3dfsc.salk.edu) 39 .
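The resolution criterion described above can be illustrated with a bare-bones FSC calculation between two cubic half-maps. This sketch omits masking and the high-resolution noise substitution correction, takes the first shell below the threshold without interpolation, and uses hypothetical function names; it is not the RELION implementation.

```python
import numpy as np

def fsc_curve(half1, half2):
    """Fourier shell correlation between two cubic maps of equal size."""
    n = half1.shape[0]
    f1 = np.fft.fftn(half1)
    f2 = np.fft.fftn(half2)
    freq = np.fft.fftfreq(n) * n                     # integer frequency indices
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int).ravel()
    # Sum the cross term and the two power terms within each spherical shell.
    cross = np.bincount(shell, weights=(f1 * np.conj(f2)).real.ravel())
    power1 = np.bincount(shell, weights=(np.abs(f1) ** 2).ravel())
    power2 = np.bincount(shell, weights=(np.abs(f2) ** 2).ravel())
    denom = np.sqrt(power1 * power2)
    fsc = np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
    return fsc[: n // 2]                             # keep shells up to Nyquist

def resolution_at_threshold(fsc, box_size, pixel_size, threshold=0.143):
    """Resolution (angstrom) of the first shell where the FSC drops below threshold."""
    below = np.flatnonzero(np.asarray(fsc)[1:] < threshold)
    if below.size == 0:
        return 2.0 * pixel_size                      # never crosses: report Nyquist
    first_shell = int(below[0]) + 1
    return box_size * pixel_size / first_shell

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.normal(size=(32, 32, 32))           # synthetic stand-in for a map
    half_a = signal + rng.normal(scale=2.0, size=signal.shape)
    half_b = signal + rng.normal(scale=2.0, size=signal.shape)
    curve = fsc_curve(half_a, half_b)
    print(resolution_at_threshold(curve, box_size=32, pixel_size=1.012))
```

Shell index s corresponds to a spatial frequency of s / (box size × pixel size), so the reported resolution is the reciprocal of that frequency.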

Cryo-EM model building and validation
Models were built in COOT 40 using the previously determined ASCT2 structure 41 in detergent as reference. The maps were of sufficient quality to unambiguously assign the protein sequence and model most of the residues (47-489). Blurring of the final maps to B-factors of −100 Å² and −50 Å² helped guide loop fitting. We note that the tip of HP2 (G434-A426) was poorly resolved.
The empirically determined "ligand up" orientation differed from the docking pose of Lc-BPE in the ASCT2 model ("ligand down"; Fig. S8A). Guided by the homology model of ASCT2, alternative orientations of the S354, D464, and C467 sidechains allowed the "ligand down" conformation of Lc-BPE to be fitted into the observed cryo-EM density. Indeed, the refined cryo-EM structure for the "ligand down" binding conformation yielded an improved enrichment score (AUC of 84.6 and logAUC of 44.9, compared with AUC of 69.5 and logAUC of 32.4), supporting the remodeling (Fig. S8D).
Real-space refinements were performed in Phenix 42 with the NCS restraints option. The quality of the fit was validated by a Fourier shell cross correlation (FSCsum) between the refined model and the final map. To monitor the effects of potential overfitting, random shifts (up to 0.5 Å) were introduced into the coordinates of the final model, followed by refinement against the first unfiltered half map. The FSC between this shaken-refined model and the first half map used during validation refinement is termed FSCwork, and the FSC against the second half map, which was not used at any point during refinement, is termed FSCfree. A marginal gap between the curves describing FSCwork and FSCfree indicates no overfitting of the model. The SBGrid software package tool was used 43 . Images were prepared with PyMOL 6 , ChimeraX 43 , and Chimera 17 .

Synthesis
Chemicals were purchased from VWR or Sigma-Aldrich. Except for one isomer, all compounds were synthesized following the same general procedure, as shown in the reaction scheme.

General synthesis procedures: Scheme 1 and Scheme 2
Step (a) Scheme 1: (tert-butoxycarbonyl)-4-hydroxypyrrolidine-2-carboxylic acid (a), (b) or (c) (500 mg, 2.16 mmol) and imidazole (744 mg, 10.80 mmol) were weighed into an oven-dried round-bottomed flask (RBF) and dissolved in dry DMF (6 mL). The reaction mixture was cooled to 0 °C and TBSCl (652 mg, 4.32 mmol) in dry DMF was added dropwise under N2 gas. The reaction mixture was then left to warm to room temperature and stirred for 24 hours. After removal of excess DMF using N2 gas at 50 °C, the residue was suspended in ethyl acetate and washed twice with water, three times with chilled 1 M HCl and once with brine. The organic layer was dried over anhydrous Na2SO4 and filtered. The filtrate was concentrated under reduced pressure to a colorless oil. The oil was dissolved in methanol (3 mL) and THF (4 mL) and the solution was cooled to 0 °C. LiOH·H2O (227 mg, 5.40 mmol) in water (3 mL) was added dropwise and the mixture was allowed to warm to room temperature and stirred for 2 hours. The pH of the solution was adjusted to 2-3 using chilled 1 M HCl and the product was collected as a pure, white precipitate after suction filtration.
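The reagent loadings in step (a) are easier to compare as molar equivalents relative to the starting acid. The sketch below derives them from the millimole amounts stated in the procedure; the dictionary keys and the function name are illustrative labels, not names from the source.

```python
def equivalents(amounts_mmol, limiting):
    """Express each amount as molar equivalents of the limiting reagent."""
    reference = amounts_mmol[limiting]
    return {name: round(mmol / reference, 2) for name, mmol in amounts_mmol.items()}

if __name__ == "__main__":
    # Millimole amounts as given in step (a) of Scheme 1.
    step_a = {
        "starting acid": 2.16,
        "imidazole": 10.80,
        "TBSCl": 4.32,
        "LiOH.H2O": 5.40,
    }
    for name, eq in equivalents(step_a, "starting acid").items():
        print(f"{name}: {eq} equiv")
```

On this basis the procedure uses 5.0 equivalents of imidazole, 2.0 equivalents of TBSCl and 2.5 equivalents of LiOH·H2O.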
Step ( dropwise under N2 gas and the reaction mixture was allowed to warm to room temperature and was stirred for 48 hours. The contents were filtered off and the filtrate was concentrated in vacuo. The residue was suspended in 50% ethyl acetate in hexanes and washed with NaHCO3 (3x), chilled 0.5 M HCl (3x), brine (3x) and H2O (1x). The organic phase was collected, dried over Na2SO4 and filtered. The filtrate was concentrated in vacuo and the product was purified using flash silica gel chromatography (10-50% ethyl acetate in hexanes) to obtain a pure white solid.
Step ( -2-carboxylic acid 1 (a). Synthesized according to the general procedure for step (a) in Scheme 1. Yield 70%, white solid. 1
(1 (b). Prepared following the general procedure for step (a) in Scheme 1. Yield 60%, white solid. 1
acid 1 (c). Synthesized following the general procedure for step (a) in Scheme 1. Yield 99%, white solid. 1
(2S,4S)-di-tert-butyl 4-hydroxypyrrolidine-1,2-dicarboxylate 3 (a). Synthesized according to the general procedure for step (c) in Scheme 1. Yield 99%, white solid formed by trituration of the colorless oil residue in DCM and hexanes. 1
atmosphere. The reaction was left to warm to room temperature and stirred for 24 hours, monitored by TLC. The reaction mixture was transferred into a separating funnel, diluted with EtOAc and washed with 0.5 M HCl (10 mL x3), NaHCO3 (10 mL x3) and brine (5 mL x3). The organic layer was dried over Na2SO4 and concentrated in vacuo. The residue was purified using silica gel flash chromatography with 0-25% EtOAc in hexanes to obtain a pure colorless oil (37 mg, 23%) and an impure fraction (120 mg, not used). The pure fraction was used in the next step. 1

di-tert-butyl (2S,4S)-4-((4-(hydroxy(phenyl)methyl)benzoyl)oxy)pyrrolidine-1,2-dicarboxylate 4 (i).
Synthesized according to the general procedure for step (d) in Scheme 1. Compound 4 (h) (47 mg, 0.09 mmol) was dissolved in a DCM (0.5 mL)/methanol (1.0 mL) mixture and cooled to 0 °C. NaBH4 (7.1 mg, 0.19 mmol) was added in one portion. The reaction was left to warm to room temperature and stirred for 2 hours until complete conversion, as monitored by TLC. The reaction was quenched by addition of acetone and transferred into a separating funnel; 5.0 mL of water was added and the product was extracted with DCM (10 mL x3). The organic layer was dried over Na2SO4 and filtered. The filtrate was concentrated in vacuo and the residue was purified using silica gel flash chromatography to yield a white solid (35 mg, 75%). 1