Modeling for COVID-19 college reopening decisions: Cornell, a case study

Significance Decisions surrounding how to safely reopen universities directly impact 7% of the US population (students, staff) and indirectly impact tens of millions more (families, communities). After witnessing large COVID-19 outbreaks among students from August 2020 to the present, universities want to provide safety while minimizing social and financial costs, despite uncertainty about vaccine hesitancy, vaccine efficacy, more transmissible variants with the potential for immune escape, and community prevalence. When the Delta variant is dominant, we find substantial risk reduction in moving student populations from mostly (75%) to fully (100%) vaccinated, in testing vaccinated students once per week even when all students are vaccinated, and in more frequent testing targeted to the most social groups of students.

At the end of each simulated period, we allocate the symptomatic individuals to severity levels 2-4 with certain proportions. 139 These proportions are estimated from data as explained below. Once an individual is assigned to a severity level they remain 140 there; further transitions between severity levels are not modeled. 141 Let P (sev i) be the probability that, as a result of a single contact with an infected person, an individual becomes infected and falls within severity level i. Thus the sum of these probabilities over i = 1, 2, 3, 4 is the probability of infection as a result of a single contact. Then, the probabilities that as a result of a single contact an individual becomes infected and asymptomatic, respectively infected and symptomatic, are P (asymptomatic) = P (sev 1), and P (symptomatic) = P (sev 2) + P (sev 3) + P (sev 4).
We want to find P (sev i) for the population while considering age-based factors. Specifically, we model how the severity of the disease varies with age, and that older age groups are more likely to become infected after an interaction with an infectious person. To that end, P (sev i) = age P (sev i|infected, age)P (infected|age)P (age), where P (infected|age) = P (infected|contact, age)P (contact|age) ∝ P (infected|contact, age).
The proportionality in the second equation comes from the assumption of a homogeneous well-mixed population within each 142 group. Therefore, the distribution of the age of contacts is the distribution of the age of the population in the group.

Severity Calculation Part 1: Severity and Infection given Age
We obtain values for the probability of infection as a function of age 144 from (13), which reports the probability of infection through a close contact for different age groups among 4941 close contacts 145 traced from early cases in Guangzhou, China. These estimates are given in the first row of Table S2. 146 Later, we will estimate the age distribution (P (age)) for Cornell's fall semester.  The severity level distribution for each age stratum is estimated from a combination of data sources.

149
We first estimate P (sev 1|infected, age), the asymptomatic rate for each age group, as follows. 150 1. Fix the asymptomatic rate for the 75+ age group, P (sev 1|infected, age grp 5) to 13%. The 13% figure comes from (18), 151 where a nursing home in Seattle had 3 asymptomatic cases out of 23 confirmed cases. 152 2. To estimate the asymptomatic rate of the remaining four age groups, we attempt to match the following data points by 153 minimizing the sum of squared errors, subject to the (assumed) constraint that the asymptomatic rates decrease over age 154 groups 2 through 5.

155
(a) The CDC estimated that the population asymptomatic rate in the USA was 35% (Source: (11)). Weighting our 156 age-stratified asymptomatic rates by the age distribution for the US population we should obtain a value close to 2. Given that a patient is symptomatic, the probability they will be hospitalized is P (sev 3, 4|infected,age)/P (symptomatic|infected,age).
3. Given that a patient is hospitalized, the probability that they will be admitted to the ICU is P (sev 4|infected,age)/P (sev 3, 4|infected,age).
The CDC (11) estimates the symptomatic case hospitalization ratio to be 1.7% for age 0-49, 4.5% for age 50-64, and 7.5% for 171 ages 65+. The percent admitted to ICU among those hospitalized is 21.9% for age 0-49, 29.2% for age 50-64, and 29.8% for 172 ages 65+. We recognize that the age cutoffs are slightly different to ours. We match the CDC's estimates for age 0-49 to our 173 first two age groups, those for age 50-64 to our second age group, and those for 65+ to our fourth and fifth age groups. The 174 probabilities of severity levels 2, 3, 4 are calculated accordingly to fit these estimates.

Severity Calculation Part 2: Age Distribution
To complete our severity calculation, we first identify different groups on Cornell's 176 campus and estimate their distribution over the five age groups. The parameter values are given in Table S3.
177 Table S3. Information for different population groups on Cornell's campus. The size of each group as well as the faculty age distribution are provided by (24); the age distribution for academic professionals, staff, and students are assumed. For the Fall reopen, each of the 7 Cornell groups has an age distribution based on the table above. This age distribution 178 dictates the severity distribution for each group. We assume that the remaining group (Greater Ithaca) has the same age 179 distribution as the US population. An additional source of outside infections comes from students returning at the start of the fall semester, which we model 221 next.

E. Students Returning and Initial Prevalence.
In advance of the fall 2020 semester, New York state required all travellers from 223 high-prevalence states to self-quarantine for two weeks upon arrival. The list of high-prevalence states changed throughout 224 August 2020, in advance of the Fall Semester. Our analysis is based on New York State's list of High Prevalence states 225 on August 7, 2020. We model the return of students to campus in two phases: (1) a 14-day period when students from 226 high-prevalence states arrive and self-quarantine, followed by (2) move-in weekend when other students arrive.

227
The modeled student arrival process is summarized below.

228
• Some students get tested remotely and are isolated if positive. Others come without being tested. Students coming from 229 high-prevalence states are less likely to have test access at home.

230
• Students traveling to campus risk additional infection after being tested at home prior to departure (if they are tested) 231 and during travel.

232
• Students are required to be tested upon arrival as a condition for enrollment. Students are strongly encouraged to use the 233 first available testing date, though some will instead choose to be tested later. Positives are isolated, including some false 234 positives. If a student comes from a high-prevalence state, then the student is required to self-quarantine for 14 days.

235
• Some positive cases already exist on campus due to infections from the greater Ithaca area.

236
• Some positive cases among incoming students are missed because of false negatives and because some students are early 237 enough in their infection to not be PCR-detectable.

238
• These two sources of cases (existing and new) combine to create an on-campus prevalence.

239
• This on-campus prevalence creates additional cases on campus. Some additional cases are also created on campus due to 240 outside infections from the greater Ithaca area.

241
• During the two-week period before the move-in weekend, regular surveillance testing had not begun, but contact tracing 242 was underway. 243 E.1. 14-day self quarantine. Here we discuss the model for the arrival of students from high-prevalence states for which New York

244
State requires a mandatory 14-day self-quarantine. The students among these that have access to housing in which they can 245 self-quarantine are modeled as arriving in Ithaca two weeks before classes start. Other students in this group without such 246 housing are modeled as either choosing to start classes virtually or, in a few cases, coming to Ithaca without complying with 247 the required quarantine period in violation of state law. 248 Incoming Student Population Sizes: Student data suggested that roughly 33% of the undergraduate students and 23% of the 249 graduate / professional students have homes in states designated by New York State as "high prevalence" requiring mandatory 250 quarantine. 251 We assume that many such students with off-campus housing will spend the mandatory quarantine period in Ithaca in 252 that housing. For students that originally planned to be in on-campus housing, we assume that the majority will not come to Ithaca at the start of the semester but rather will begin the semester online; a small fraction will quarantine somewhere outside Ithaca and return during the move-in weekend; while another small fraction will fail to comply with the law, either using 255 non-compliant quarantine in shared housing in Ithaca, or by arriving during move-in weekend without having quarantined.

256
Assuming that 10% of continuing undergraduates and 75% of continuing graduate / professional students have stayed in Ithaca, 257 the total number of students arriving 2 weeks in advance from high prevalence states is 3750, including 2500 undergraduate 258 students and 1250 graduate / professional students. Assuming 31 confirmed cases, which is what was observed over the first 21 days of July 2020, that cases last 20 days, and 281 2x-4x underreporting in Tompkins County (less than elsewhere due to excellent testing access), gives 60 -120 active cases, or 282 0.075% -0.15% prevalence.

283
Interactions: During the two-week period before classes start, we assume no interaction between students and employees. 284 We use a multi-group simulation consisting of four groups -self-quarantined students, unquarantined students, employees, 285 and the greater Ithaca community -to model different behaviors (reflected by daily transmission rate) within and across the 286 groups. As noted elsewhere, we assume 40% compliance with quarantine requirements amongst self-quarantining students. The 287 transmission matrix for the self-quarantine period is summarized in Table S5.
288 Table S5. Inter-and intra-group transmissions per day during the self-quarantine period, based on the multi-group simulation, which use contacts from the literature, choose an infectivity calibrated to an estimate of R0, and then multiply to get transmission. Each entry gives the expected number of transmissions per day from one infected member of the row group to each of the column groups. under-reporting factor of 10 and an average active period of 20 days, this daily new positive case threshold translates to a 293 prevalence of 10 / 100,000 * 10 * 20 = 2%. Hence, the overall prevalence in student origins that are not designated as "high 294 prevalence" is at most 2%. This prevalence is prior to any testing at the origin prior to departure for Cornell.

295
Incoming Student Population Sizes: As discussed previously, in addition to students from low prevalence states we assume 296 that a small fraction of the students (300) from high-prevalence states that plan to live on-campus will return during the 297 move-in weekend. Although these students will have presumably self-quarantined for 14 days elsewhere, we pessimistically 298 assume non-compliance and consider their prevalence upon entering Ithaca to be 4%. Given it is a small population compared 299 to students from low-prevalence states (with prevalence < 2%), and the assumed under-reporting factor of 10 is large given the 300 access to testing in low-prevalence states at the time, we assume that the overall prevalence among students returning during 301 the move-in weekend is exactly 2%. We estimate the total number of students returning during the move-in weekend to be access. We assume that 2 3 of the students from low-prevalence states were tested at home, using nasopharyngeal (NP) sampling 305 (90% sensitivity). Hence, the fraction of returning students that are infectious is estimated to be 2% * (1 − 2 3 * 90%) = 0.8%.

306
In addition, we also assume a small per-day infection probability during travel. The travel duration and the likelihood that 307 students use public transportation (with an associated elevated daily infection probability) depends on the geographic origin of 308 students. Weighting these probabilities by geographic origin of students, we estimate that an additional 0.1% of the returning 309 students are infected during travel to campus. Among them, 45% are estimated to be in the Infectious state upon arrival 310 (which can be detected with probability 90%), and 55% are estimated to be in the Exposed state upon arrival (which cannot 311 be detected by arrival testing). Assuming arrival testing with NP sampling and 100% compliance, the fraction of returning 312 students that are infected and not identified by arrival testing is (0.8% + 45% * 0.1%) * 10% + 55% * 0.1% = 0.14%.

313
The initial prevalence estimates for the student groups combine the initial prevalence estimates from the 14-day simulation 314 (local students and self-quarantine of high-prevalence states) and move-in weekend (low prevalence state students) to reflect 315 the composition of each group. The initial prevalence of all the groups after arrival and immediately prior to the semester is 316 summarized in Table S6. H. Transmission. Transmission within and between each group is governed by the "transmission rate matrix." This is estimated first by estimating a rate of contacts within and between each group, and then calibrating the transmission probability per 347 contact to a value of R0. There is a transmission rate matrix for summer of 2020 to model the pre-semester period and a 348 transmission matrix for fall of 2020.

349
The term "contact" is used consistently with the literature, where a contact is defined as a two-way conversation or a 350 physical interaction (e.g., a kiss or handshake) (30). Thus, it includes those contacts that are more brief than the CDC's 351 definition of a close contact (6 feet or less and 15 minutes or more). 352 We now describe our estimation methodology for both the summer 2020 and fall 2020 matrices in an algorithmic manner.  pre-social-distancing calculation (which is calibrated to R0 = 2.5).

379
• Literature also suggests that younger people are less likely to abide by social distancing regulations (31). Therefore 380 we will assume that the impact (multiplier) of social distancing is 50% less effective for students during the summer.

382
• Using this estimate and the following additional assumptions, we can create an estimate of the summer transmission 397 * Graduate students will have 75% of contacts, and thus transmissions, be Cornell-related. About 25% of these Cornell contacts are with faculty/staff and all others are with grad students. The majority of the 399 contact with faculty/staff is with people that will be on campus and student-facing in the fall.

400
-Symmetry condition for daily transmissions: The expected daily transmissions between group 1 and group 2 is 401 the expected daily transmissions between group 2 and group 1. Therefore, selecting the daily transmission rate 402 per person in group 1 with group 2 determines the daily transmission rate of someone in group 2 with group 1.

403
6. This results in the summer transmission rate matrix in of July when prevalence nationwide rose, but gradually declined after. If R0 were bigger than 1 in Tompkins County, then we 435 would expect that new cases would grow exponentially. The fact that this did not happen suggests that R0 < 1.

436
Second, the TCHD reports that 16 out of 31 cases between July 1 and July 21 had relevant travel to a high-prevalence 437 region. Let us make the following assumptions:

438
• Assume reporting bias is the same for both individuals infected locally and infected due to travel.

439
• Assume that all of these cases between July 1 and July 21 resulted from clusters initiated by external travel that happened 440 in July, predominantly July 4, and not from clusters that were present in Tompkins County before July. This is based on 441 the observation that prevalence in June in Tompkins County was very low. Also, if one assumes that some local July 442 cases began due to pre-existing clusters then this will cause our R0 estimate to decrease further.

443
• Let us momentarily assume that all clusters initiated by July travel concluded by July 21. This assumption is too 444 optimistic, and will create an R0 estimate that is too low -we will adjust for this in a moment.

445
In general, in a large fully susceptible population with R0 < 1, each new case creates a cluster that infects 1 + R0 Cornell re-open scenario: the group sizes (Table S7) and the transmission rate matrix (Table S9).  Table S10 gives the population size for each group for virtual instruction. We assume that the last three columns -  To estimate the population sizes for the student groups, we used results from a survey sent out on May 29, 2020 to all 465 students enrolled at the time, while attending to two concerns:

466
(a) Not all of the students who received the survey responded. For the first concern, since 71% of the undergraduates and 48% of the graduates responded, we assume these percentages 470 generalize to the whole population. For the second concern, we will explain group by group how we handle it. non-first-year students who return to Ithaca from each of the GS class and GS research groups.

490
-For the first-year graduate students in each group, we assume that the 53% likely-to-return proportion is reduced 491 by a further 90% in the case of class-based students, and 70% in the case of research-based grad students. This The transmission rates for virtual instruction are based on the transmission rates for residential instruction with some 500 adjustments. As a reminder, transmission rate = contacts / day * 1.8% infectivity rate, and we assume that the interaction 501 between faculty/staff within themselves and with the Greater Ithaca community does not depend on scenarios. The 502 main idea for estimating transmission rates for virtual instruction is that class-based students would interact less with 503 faculty and staff, but more with the Greater Ithaca community. Student interactions among themselves depend on their 504 compliance with the behavioral compact (e.g., mask-wearing and social distancing) and housing density in Collegetown. 505 We explain each of the transmission rates we have re-calculated below.

506
• UG high 507 -Since we assume no one in "UG high" will return, there is no transmission from this group to others.

508
• UG low / GS class within-group 509 -The virtual scenario has two competing effects: reduced density of transmissions due to fewer people on campus, 510 and potential increase in transmissions due to Cornell's reduced ability to enforce mask wearing, social gathering 511 restrictions, and abundant asymptomatic testing.

512
-First, we discuss the effect of social gathering and mask wearing. In the residential instruction scenario, 513 we assumed that Cornell's ability to legally mandate mask wearing and social gathering restrictions with a 514 behavioral compact resulted in a 30% reduction in transmission between pre-social-distancing periods and 515 a residential fall semester. Under virtual instruction, since Cornell will not be able to enforce a behavioral 516 compact, we assume that this reduction in transmission (between the summer and a virtual fall semester) will 517 be less than between the summer and residential instruction. While one might imagine that there would be no 518 reduction in transmission between the summer and a virtual instruction fall given Cornell's reduced ability to 519 enforce a behavioral compact, we optimistically assume a 15% reduction. This has the effect of increasing the 520 within-group transmission rates of "UG low" and "GS class" by a factor of (1-15%)/(1-30%) from the residential 521 setting.

522
-Second, Section 3.1 and Figure 4 of (33) suggest that the mortality rate of infectious diseases rises with 523 population density up until population density reaches 200 people per square mile and then levels off. Below, 524 we estimate that virtual instruction reduces the population density to roughly 2000 / square mile from roughly 525 6000 / square mile under residential instruction. Although the literature thus suggests that there will be no 526 reduction in transmissions due to virtual instruction relative to residential instruction, we conservatively assume 527 that virtual instruction will result in a reduction of transmissions by 20%.

534
-Combining the two effects described above, we multiply the residential within-group transmission rate for "UG 535 low" and "GS class" by a factor of (1-15%)/(1-30%) * (1-20%) = 0.9712 • UG low / GS class with faculty, staff and graduate students: 537 -"UG low" and "GS class" will have much less interaction with "FS student" and "FS not student" since they do 538 not need to see any professors in person. Thus, we assume the transmissions from any "UG low" and "GS class" 539 person to on-campus faculty will drop to minimal to be the same as transmissions to any off-campus faculty.

540
• UG low / GS class with Greater Ithaca:

541
-A virtual semester that shuts down the campus including the dining halls would increase undergraduate 542 interaction with Greater Ithaca for reasons like groceries and other necessary activities. However, "UG low" 543 and "GS class" are unlikely to leave the Collegetown area very frequently. Therefore, a number larger than the 544 transmission rate for UG low / GS class with Greater Ithaca in the residential instruction scenario but less than 545 that of GS research would be a reasonable estimate. Thus, we set the transmission rate for UG low / GS class 546 with Greater Ithaca to be a little over half of that of GS research with Greater Ithaca, the figures of which do 547 not change from scenario to scenario.

548
In summary, Table S11 gives the virtual instruction transmission matrix. • Data in a publicly available report pursuant to the urgent public health need presented by the pandemic (4).

591
The data sources for all parameters are summarized in Table S12.
592 Table S12. Data sources of parameter estimates/calibration for the fall 2020 and spring 2021 semesters. "V" indicates that the data is obtained from the HIPAA-compliant database; "P" indicates that data is obtained from the publicly available report. • Group 3: graduate or professional students. 597 We employ this population breakdown because we observe substantial differences in infections and contacts for these 598 specific subgroups. We set August 16, 2020 -November 24, 2020 to be the time period for our calibration, as the majority of 599 undergraduates left the greater Ithaca area at the time of the Thanksgiving holiday. We divide the time horizon into two 600 non-overlapping periods: the pre-semester period (8/16/2020 -9/2/2020) and the in-semester period (9/3/2020 -11/24/2020).

601
Here we describe the parameters estimated directly from fall 2020 data.

Population Size
We use students' degree program information, Greek-life affiliation and varsity athlete rosters, and daily 603 check-in data to divide students residing in Ithaca into 3 subgroups, obtaining the population sizes given by Table S13. 604   Table S13. Sizes of the three student groups used in fall 2020 student calibration and projection.

Arrival Schedule
Arriving schedules for groups 1, 2 and 3 are determined based on the arrival dates indicated by students in 605 their Fall semester checklist, and the move-in schedule for students living on campus.

Testing Frequency
The model does not track individuals and their test schedules. Rather, each member of a population is 607 assumed to test on a given day with a given probability.

608
• Pre-semester period: The testing frequency during the in-semester period is consistent with the testing frequency for students assumed in the 618 main simulation model.

Test Compliance
We estimate student test compliance to be 97.4%. This value is calculated based on the fraction of scheduled 620 student surveillance tests completed over the course of the fall semester (including both on-time tests and those tests that were 621 delayed but completed).

Outside Infection Rate
We consider a positive student case to be an outside infection if they satisfied both of the following 623 conditions:

624
• they did not test positive in an adaptive test, nor were they in the contact trace of other positive cases;

625
• they had recent travel history;

626
• they are not classified as an "arrival positive" case.

627
Table S14 summarizes the number of outside infections in each group during the semester and the corresponding outside 628 infection rate, which is the number of outside infections divided by (population size of the group × time horizon in days). Note that the period considered does not include the post-Thanksgiving period. During the post-Thanksgiving period, 630 graduate students tested positive at a higher rate due to travel.

Contact Matrix
We define the daily transmission matrix T such that the value T (i, j) gives, for each infectious non-isolated  To estimate the contact matrix, we make the following additional assumptions:

641
• Each case identified through adaptive testing was generated by a source case in the same group.

642
• All positives in the student population created by an infectious student are identified as a close contact of that student 643 (even if they were originally identified and tested because of surveillance testing, symptoms, or adaptive testing). 644 We first classify the student positives in the in-person semester (184 cases between 8/16/2020 and 11/24/2020) into source 645 cases and secondary cases. Here, "secondary cases" include those identified via contact tracing or adaptive testing. The 646 remaining cases, identified through surveillance testing, symptomatic self-reporting, arrival testing, or testing positive after 647 returning from travel, are classified as source cases.
• We begin by identifying all positive cases in each group i. Let this be N (i).
• For each group j, we count the number of positives in group j that were identified as a close contact of a person in group 651 i. Let this be L(i, j). A positive who is a close contact of people in multiple groups is counted proportionally to the 652 groups of the people that identified them as contacts. For example, a positive person in group 2 who is identified as a 653 close contact of one person in group 1 and two people in group 2 would contribute 1 3 to L(1, 2) and 2 3 to L(2, 2).

654
• For each group j, we additionally count the number of positive people that were identified through adaptive testing but 655 were not identified as a close contact. In an abuse of notation, let this be L(j).

656
• The value of M (i, j) is then (L(i, j) + 1{j = i}L(j))/N (i). 657 We use identified contacts in producing these estimates. When contacts are not identified, this could distort the estimates.

658
Assuming that contact tracing is equally effective for all source groups and "destination" groups, thus resulting in a constant 659 fraction of contacts missed, the fact that we only use the matrix up to a multiplicative proportionality constant should ensure 660 that the resulting error is controlled. The resulting contact matrix M is shown in Table S15. post-exposure pre-convalescence infectious period and the probability that an infected person is in the exposed state and thus not identifiable by a PCR test (estimated to be 0.2 based on state occupancy times in our model). Hence, for any positive case arriving in Ithaca, the probability that it is not identified by the arrival test is P (exposed state) + P (not in exposed state) ·

695
(1 − sensitivity) = 0.2 + 0.8 × 0. Third, we estimate the number of secondary cases resulting from the arrival positives, due to the fact that students did not 703 take their arrival test right upon arrival and hence could infect other students during the testing delay. This is obtained based 704 on the contact matrix M (as described above), assuming that each arrival positive in group j infects M (i, j) individuals in 705 group i. 706 We summarize the number of secondary cases in each group below: 707 • Group 1: 5 × 92/125 + 6 × 1/44 = 3.82;

709
In summary, the average number of initial cases in groups 1 and 2 are given below: For group 3, since we did not observe its first positive case after 8/16/2020 until 9/12/2020, we set the initial prevalence to 713 be zero.

Calibration Results
We calibrate our model's projected infections to the actual trajectory within 3 subgroups from 8/16/2020 -715 11/24/2020, as shown below. The total number of positive cases observed within the time period is described below and the 716 trajectories are described in Figure S2. Here we tune the parameter α in the simulation, i.e., the proportionality constant described in the contact matrix section 721 above. For each value of α, we compute the mean squared error of the simulated results described as follows:

722
• Let sim(t, i, j) denote the number of infections on day t in replication i for group j according to the simulation.

723
• Let actual(t, j) denote the number of infections observed on day t for group j.

724
• Then, the error score associated with α is given by where N is the number of simulation replications and T is the simulation horizon.
725 Figure S3 shows the log root mean-squared error of our model predictions versus α. We see that when α = 0.525, the lowest 726 error score is obtained.  August 16, 2020 and January 10, 2021.

732
We have access to less detailed data about employees compared with students. In particular, we do not have access to 733 contact tracing data for the fall semester. Understanding the difficulties of estimating inter-group transmission rates given 734 a lack of contact tracing data, we elect not to partition the employee group (partitions considered included those based on 735 county of residence or job type).

736
Observing rising infection counts among faculty and staff after Thanksgiving, we decide to include December and early 737 January in the period of interest. We divide the time horizon into two non-overlapping periods: the pre-semester period 738 (8/16/2020 -9/2/2020) and the period after (9/3/2020 -1/10/2021).

739
In place of contact tracing data, we leverage "cluster_ids" that were generated from manual review of employee cases.

740
An employee case is assigned a cluster_id if that case is believed to be linked to at least one other case at Cornell, with all 741 linked cases being assigned the same cluster_id. The use of the term "cluster" is perhaps misleading, since even pairs of

777
We choose to use 0.34 × 0.75 = 0.255, assuming that 75% of the people found in clusters among Cornell employees were 778 found via contact tracing or adaptive testing, with another 25% found via symptomatic self-reporting or surveillance testing.

779
This is based, in part, on the observation that a large fraction of Cornell employee clusters are among family members and 780 these would almost always be found via contact tracing. We assume that it is rare for positives in the Cornell community to be 781 found via contact tracing of people who are not part of the Cornell community.

783
Outcomes are insensitive to the parameter cases_quarantined_per_cluster, which determines the number of negative 784 individuals quarantined, because its only effect on infections is to reduce the number of susceptible people that can be infected.

785
Given that the fraction of the population quarantined is small, it has little effect on outcomes over several orders of magnitude.

786
Because information about employee quarantines was unavailable, we set it to 2.5, a value close to half of the value observed 787 for students, since employees were observed to have fewer contacts than students.

Calibration Results
We calibrate our model's projected infections to the actual trajectory from 8/16/2020 -1/10/2021, which 789 is shown in Figure S5. We then plot the log root mean-squared error (RMSE) between the observed trajectory and the average output of the simulation, versus the parameter we wish to calibrate, which is the daily transmission rate (# of other Cornell employees infected per day by a positive Cornell employee). Here, analogous to the error function used in the calibration for the student groups, the mean squared error is given by where sim(t, i) is the number of infections on day t in replication i according to the simulation, actual(t) is the number of 791 infections observed on day t, N is the number of simulation replications, and T is the simulation horizon. Note that many of 792 these infections occurred between family members who are both Cornell employees but infected each other at home. 793 Figure S6 shows the log RMSE versus employee transmission rate. We see in this figure that when the daily transmission 794 rate is 0.11, the lowest log error is obtained. Thus, according to our calibrated model, each infectious employee infects 0.11 795 other employees on each day they are infectious.    • Group 4: graduate or professional students, non MBA.
We employ this population breakdown because we observe substantial differences in infections and contacts for these specific 807 subgroups. In particular, the population breakdown is different from that we used in the fall 2020 semester because we observe 808 higher transmission rate and lower test compliance rate in the MBA student group. We set January 21, 2021 -May 25, 2021 to 809 be the time period for our calibration. We divide the time horizon into two non-overlapping periods: the pre-semester arrival 810 period (1/21/2021 -2/7/2021) and the in-semester period (2/8/2021 -5/25/2021).

811
Here we describe the parameters estimated directly from spring 2021 data. We estimate many of these parameters using the 812 same methodology described in Section A above, for which we directly report the results.

Population Size
We use the same methodology as in the fall 2020 calibration to obtain the population sizes, given by Table S16. 814   Table S16. Sizes of the three student groups used in the spring 2021 student calibration and projection.

Testing Frequency
As is in the fall 2020 calibration, the model does not track individuals and their test schedules. Rather, 815 each member of a population is assumed to test on a given day with a given probability.

Outside Infection Rate
We use the same methodology as in the fall 2020 calibration to obtain the outside infection rate, given 820 by Table S18.

Contact Matrix
Recall that the (i, j) entry of a contact matrix is the average number of positive cases in group j that an individual circulates for is inversely proportional to his/her test frequency. That is, the more frequently an individual is tested, 827 the less time he/she has to generate secondary infections. This assumption is necessitated due to the heterogeneity in testing 828 frequencies across different groups in spring, and the fact that unlike in fall 2020, a significant fraction of the cases occurred 829 among the graduate and professional students (Group 3 and Group 4) in spring 2021. 830 We outline the steps of adjusting for the different testing frequencies in different groups when computing the contact matrix 831 for spring 2021 calibration. Such adjustment would have minimal effect on the fall 2020 contact matrix, because infections 832 were concentrated in the undergraduate population (Group 1 and Group 2) who were tested 2x/week.

833
• Under the assumption above, those tested 3x/week had 1/3 less circulation time on average than those tested 2x/week, 834 while those tested 1x/week had twice circulation time on average as those tested 2x/week.

835
• Using 2x/week testing as a baseline, we adjust the number of positive contacts of cases in groups tested 3x or 1x/week so 836 that they reflect the number of positive contacts over the same circulation time as those tested 2x/week.

837
In particular, we multiply the number of positive contacts by 1.5 for source cases in Group 1 (tested 3x/week throughout the semester) and by 0.5 for source cases in Group 4 (tested 1x/week throughout the semester). Students in Group 3 were 839 tested 1x/week on or before 3/26/2021 and 2x/week after 3/26/2021, so the number of their positive contacts is scaled by 0.5 840 in the first period and unscaled in the second period. The resulting adjusted contact matrix M for the spring 2021 semester is 841 shown in Table S19.

Contact Tracing Effectiveness Parameters
We use the same methodology as in the fall 2020 calibration to estimate the contact 843 tracing effectiveness parameters, given below:

Initial Prevalence
We first summarize the estimated average number of initial cases in each student group: These initial cases are assumed to spread uniformly over the pre-semester arrival period in our simulation.

852
Below we describe in detail how we derive the average number of initial cases in each group, with results summarized in 853   Table S20. We use a slightly different methodology than that in the fall 2020 calibration because unlike fall 2020, arrival testing 854 was carefully planned at the beginning of the spring 2021 semester as part of the arrival process. In addition, a significant 855 fraction of the student population stayed in Ithaca during the winter break and took regular surveillance testing.

856
To model different behavior among students in taking their arrival tests, we partition students in each group into two 857 categories: (1) those arriving from outside Ithaca and getting tested promptly upon arrival; (2) those arriving from outside 858 Ithaca but not getting tested promptly, or those staying in the Ithaca over the winter break, taking regular surveillance testing 859 but exempt from arrival tests. (See Table S20, col.a and col.d for the sizes of each category in each student group.) 860 We consider the initial free and infectious cases at the beginning of the simulation to be the union of 861 • those imported positive cases that received their first test promptly upon arrival but were missed by the arrival test 862 (these cases belong to the first category);

863
• those cases imported to the Ithaca community but not tested promptly upon arrival, and those local infections during 864 the arrival period of the simulation (these two kinds of cases belong to the second category).

865
First, we estimate the number of initial cases (Table S20, col.c) that belong to the first category. We infer the number of 866 cases promptly tested upon arrival but missed by arrival tests from the observed arrival positive cases (Table S20,  Then, based on the same methodology as in the fall 2020 calibration but assuming instead that the sensitivity of the arrival 874 testing is 60% for AN sampling, we estimate that for every arrival positive case, there are 1.08 free and infected cases acting as 875 the initial cases in the simulation. We assume that the arrival positive cases do not result in any secondary cases because they 876 completed their first test upon arrival promptly. Second, we estimate the number of initial cases that fall into the second category (Table S20, col.f). To do that, we calculate the prevalence level (including positive cases that were captured OR missed by arrival testing) among students that belong 879 to the first category in each student group (Table S20, col.e). Then, assuming no heterogeneity in prevalence across the two 880 categories of students, we estimate the number of positive cases among students in the second category by taking the product 881 of the same prevalence estimate and the number of students in the second category. All of these positive cases in the second 882 category are assumed to be part of the initial cases in the simulation.

883
The average number of initial cases in each student group (

Calibration Results
We calibrate our model's projected infections to the actual trajectory within 4 subgroups from 1/21/2021 -886 5/25/2021, as shown below. The total number of positive cases observed within the time period is described below and the 887 trajectories are described in Figure S8. As in the fall 2020 calibration, we tune the proportionality constant α so that it minimizes the log root mean-squared error 893 of our model predictions.
894 Figure S9 shows the log root mean-squared error of our model predictions versus α. We see that when α = 0.8, the lowest 895 score is obtained.

902
Here we describe the parameters estimated directly from spring 2021 data in the model calibration for employee group.

Outside Infection Rate
We use the same methodology as in the fall 2020 calibration to estimate outside infection rate. We

Contact Tracing Effectiveness Parameters
We use the same methodology as in the fall 2020 calibration and obtain the estimate 910 of the contact tracing effectiveness parameter cases_isolated_per_cluster = 0.035.

Initial Prevalence
We use an initial prevalence of zero among employees at the beginning of the spring 2021 semester. Although 912 this likely underestimates the actual initial prevalence, we argue that the actual prevalence is low because employees took regular surveillance tests (on average, 1x/week) even during the winter break. In addition, our outside infection rate estimates help to capture the cases imported to the Cornell community.

Calibration Results
We calibrate our model's projected infections to the actual trajectory from 1/21/2021 -5/25/2021, which 916 is shown in Figure S11. As in the fall 2020 calibration, we tune the daily transmission rate so that it minimizes the log root mean-squared error of 918 our model predictions.
919 Figure S12 shows the log RMSE versus employee transmission rate. We see in this figure that when the daily transmission 920 rate is zero, the lowest error is obtained. This is expected because most of the employee cases in the spring 2021 semester are

928
This section presents our methodology for quantifying the effects of uncertainty in model parameters and additional results 929 from applying this methodology not presented in the main paper. 930 We are specifically interested in the effect of parameter uncertainty on two outcomes: the safety of a residential semester as 931 measured by the number of cases; and the relative safety of a residential semester compared to a virtual one, as measured by 932 the difference in infections between these two instruction modes (residential infections -virtual infections). For both outcomes, 933 a larger value is worse.

934
To quantify these effects, we perform the following steps:  Table S21.   Then, for each q, select as representative the configuration in this set with the largest density under the prior.

945
A. Parameter Scenarios. We adapt ideas from robust optimization (34) to address parameter uncertainty, with the goal of 946 identifying and understanding the worst possible outcome over the parameter configurations.

947
We begin by constructing an uncertainty set derived from reasonable ranges for each parameter (see the "lower bound" and 948 "upper bound" columns in Table S21). These ranges induce a natural central point in the space of parameter configurations, 949 where each parameter takes the value at the midpoint of its range. We place a joint multivariate normal prior with independent 950 marginals on the parameters with mean at the central point. We assume the range for each parameter given in Table S21 is a 951 symmetric 95% credible interval, i.e., the true parameter value lies in this interval with 95% probability.

952
The multivariate normal prior is used primarily for simplicity. We require a unimodal distribution with elliptical contours, 953 the latter property of which permits straightforward calculation of pessimistic scenarios below. One could use other distributions 954 with such contours, e.g., a multivariate t distribution. We chose the normal for simplicity and because the "core" of the prior 955 distribution where most of the probability is concentrated, drives much of the analysis, which we believe is well captured 956 through the normal prior. With a multivariate t with similar mean and spread we believe the outcomes would not have been 957 substantially different. With regard to the issue about parameters potentially falling outside of meaningful ranges, such as the 958 issue of non-negativity, we use rejection sampling to ensure that all sampled parameters fall within their meaningful range.

959
Hence, the actual prior is a truncated multivariate normal distribution. Still, the exact form of the prior is arguable. To be 960 more precise in what follows, we define the following notation:

961
• x ∈ R n : vector of parameters; n = 12 for the residential case and n = 16 for the residential-virtual case.

962
• LBi, U Bi: lower and upper bound of parameter i, as specified in Table S21. By assumption, (LBi, U Bi) is a symmetric 963 95% credible interval for parameter i and parameters are mutually independent under the prior. Based on survey results of students. Lower bound is from proportions that replied "very or somewhat likely to return", upper bound students who answered "it depends / undecided". Next we consider the development of the linear approximations. As described above, we are interested in two outcomes.

970
The first outcome is the number of cases in a residential semester. The second outcome is the number of residential infections 971 minus the number of virtual instructions. In both settings, the outcome is worse at larger values.

972
To estimate the outcome over the parameter space, we sample 2000 parameter vectors using Latin hypercube sampling over  Tables S22 and S23.

977
To summarize uncertainty, we develop a one-dimensional family of parameter configurations associated with increasingly 978 pessimistic outcomes. For each y ∈ R, we consider the set A(y) of parameter configurations whose expected outcome under 979 the fitted linear model is equal to y. By construction, such sets {A(y), y ∈ R} are hyperplanes normal to c and partition the 980 parameter space into two half-spaces. We find y * such that the associated half-space, over which the expected outcome under 981 the linear model is less than or equal to y * , contains a prior probability mass of 0.99. We then determine the pessimistic 982 configuration by selecting the representative point in A(y * ) with the highest prior density. Figure S14 provides a visualization 983 of this setup that may prove helpful in interpreting the following more precise explanation of how we summarize uncertainty.

984
For any y ∈ R, the set of parameter configurations with expected outcome equal to y under the linear model is the hyperplane Consider the half-space defined by this hyperplane over which the expected outcome under the linear model is less than or equal to y, {x : c0 + c T x ≤ y}. Let q(y) = P (c0 + c T X ≤ y) be the prior probability mass in this half space, where X ∼ N (µ, Σ). Let y * be such that q(y * ) = 0.99, so that y * is such that the median outcome is less than y * with prior probability 0.99. To find y * , recall that X ∼ N (µ, Σ), so c0 + c T X ∼ N (c0 + c T µ, c T Σc), and then Let x(y * ) be the point with the largest prior density in the hyperplane A(y * ). We claim that x(y * ) is the unique point in A(y * ) lying on the line through µ in the direction Σc, that is x(y * ) ∈ {µ + λΣc : λ ∈ R}. Why? Maximizing the density 986 over A(y * ) is equivalent to minimizing the quantity (x − µ) T Σ −1 (x − µ) over all x ∈ A(y * ), i.e., over all points x satisfying 987 c0 + c T x = y * . To find the optimum, define the Lagrangian L( is characterized by the equation ∇xL(x; η) = 0, for some Lagrange multiplier η ∈ R. The gradient of the Lagrangian is 989 ∇xL(x; η) = 2Σ −1 (x − µ) − ηc, so the optimal point is given by x(y * ) = µ + η 2 Σc, which is on the line through µ in the direction 990 Σc as originally claimed. 991 We can thus find x(y * ) as the unique point in the intersection of the hyperplane A(y * ) and the ray {µ + λΣc : λ ∈ R}. We 992 find that λ = Φ −1 (0.99)/ √ c T Σc, and so the pessimistic point is given by [2] 994 We follow this same approach to create a range of parameter scenarios with varying levels of pessimism q ∈ (0, 1), by substituting q for 0.99 in the derivation above. We refer to the resulting parameter scenario as the q-quantile pessimistic point, and (in an abuse of notation) denote it as x(q). The expression for this parameter scenario is obtained by substituting q for 0.99 in Equation 2: This parameter scenario is such that the prior probability is q of seeing a parameter configuration with fewer infections than 995 x(q), assuming that the simulator's response is given by the fitted linear model. In this section, we support the claim that the true pessimism level of a q-pessimistic scenario actually is typically close to q, 1005 justifying their use. 1006 We focus on the residential outcome, where there are 12 parameters. For each parameter configuration x(q) with q ranging 1007 over q ∈ {0.01, 0.05, 0.1, · · · , 0.9, 0.95, 0.99}, we run 20 simulation replications and record the median number of infections in a 1008 residential semester, which we denote asŷ * (q).

1009
Then, to estimate true pessimism levels, we sample N = 1000 parameter configurations from the 12-dimensional multivariate Next, for each q, we use Y to estimate the true pessimism level of x(q), denoted by r(q). This is the fraction of outcomes 1015 (median infections) in Y that are smaller than the outcome at x(q). Mathematically, r(q) = |{yi ∈ Y : yi ≤ŷ * (q)}|/N , where 1016 | · | denotes the cardinality of a set. If r(q) is close to q, then the pessimism level q claimed relying on the linear model is close 1017 to the actual pessimism level r(q) of the resulting scenario.
1018 Figure S16 shows the estimated values of r(q) vs. q. For all values of q evaluated, the deviation of r(q) from q is within a 1019 small range. In particular, r(q) and q match each other well for q close to 1. This demonstrates that the true pessimism level  For each pessimism level q, a model-free simulation-based estimate of the probability r(q) under the prior of having a parameter configuration whose median number of infections is worse than the median number of infections under the pessimistic scenario x(q) with pessimism level q. The scenario x(q) assumes that the simulator's response is linear in the parameters and so r(q) may differ from q. The data pictured here suggests that r(q) is close to q despite non-linearities in the simulator's response to parameters. Estimates of r(q) were calculated at q = 0.01, 0.05, 0.1, · · · , 0.9, 0.95, 0.99. D. Comparison of Prior to Calibrated Outcomes. Table S25 summarizes key parameter differences between fall 2020 nominal, fall 2020 pessimistic, summer 2020 nominal and calibrated fall 2020 scenarios. The calibrated fall 2020 scenario includes

F. Correlation of Infection and Hospitalization metrics.
In this section, we present graphs that demonstrate that the simulated 1061 number of Cornell infections is positively correlated with the number of Ithaca infections and Cornell and Ithaca hospitalizations.

1062
Due to this correlation, we use the number of Cornell infections as our primary metric.

1063
In Figure S20, each point corresponds to a parameter vector sampled from the prior described earlier in this section.

Bayesian Analysis for Fall 2021 Projections
This section describes how we use our model to explore the interventions needed in the fall 2021 semester. We leverage 1066 information gathered from fall 2020 to the present and adjust for changes such as the Delta variant and vaccination level.
Then, we perform a Bayesian analysis on the key uncertain parameters. To obtain the prior, we place ranges on each of these Assuming that reduction in transmission above results from social distancing interventions, we can estimate that loosening 1118 social distancing interventions leads to an increase in contacts by a factor of 2.5 / 0.9 = 2.7. This estimate is considered to be 1119 the upper bound for the close contact multiplier because it ignores the effects of other interventions such as contact tracing in 1120 reducing transmission and therefore overestimates the elevation in contacts due to loosening social distancing interventions.
Then, we estimate the log-likelihood of the observed trajectory under parameter configuration θ using (θ) = 5 t=1 log (p(y(t); m(t, θ), s(t, θ))) , where p is the density of a log-normal random variable with the given parameters.

Approximating the Posterior
The log posterior density of parameter configuration θ is given by 1171 log(π(θ|y)) = (θ) + log(π(θ)) − log(Z), [3] 1172 where y = {y(t)} is the full observed trajectory, π(θ) is the prior for θ and Z is the normalization constant for the posterior where g is the gradient of the log posterior at θ * (the gradient is nonzero because θ * is not the true maximizer) and H is the negative Hessian of the log posterior at θ * .
For the regression, we select a subset of the sampled configurations that are close to θ * , for which the second-order Taylor 1230 approximation at θ * (Equation 5) holds reasonably well. We outline the steps of performing the regression:

1233
• Next, we develop a distance metric to select points close to θ * . We notice that the components of di have vastly different 1234 scales. As a result, the L2 norm di 2 is dominated by a few components, which may bias the selection. Thus, we 1235 standardize each parameter component to a standard deviation of 1.

1236
• Based on the norms of the standardized distance vectors { d i 2}, we select a fraction q of S that are closest to θ * to be 1237 included in the regression, denoted by J .  Givenĝ andĤ, the posterior distribution of the parameters is approximately multivariate normal with meanθ * = θ * + (Ĥ) −1ĝ 1242 and covarianceΣ * =Ĥ −1 .

1243
Among the 11,301 sampled parameter configurations, we find θ * and run the quadratic regression on q = 1% points with the 1244 smallest standardized distance to θ * . where parameters are in the order of outside infection rate multiplier, contact tracing effectiveness multiplier, initial prevalence, To understand the interaction between different parameter components, we compute the correlation matrix. Let diag(Σ * ) denote the diagonal matrix with ith diagonal element equal to the (i, i) entry ofΣ * . The correlation matrix is given by We now compare the outcomes from the two approaches for approximating the posterior. We observe that the posterior 1249 marginal distributions from the empirical approximation (as presented in the main text) are consistent with the credible 1250 intervals from the regression-based analysis. The former observed at most weak correlation between the parameters (Fig S21), 1251 apart from the negative correlation between the constituents of the combined spread multiplier as expected. However, the latter 1252 estimated the correlations to be nontrivial between most pairs. We acknowledge an important limitation of the regression-based 1253 analysis: the correlation estimates are sensitive to the statistical fit, yet our ability to precisely estimate the derivatives of the 1254 log-likelihood is limited. It is plausible that the correlations between parameters are lower in reality.   33. H Hu, K Nigmatulina, P Eckhoff, The scaling of contact rates with population density for the infectious disease models.