Real-time pandemic surveillance using hospital admissions and mobility data

Significance Forecasting COVID-19 healthcare demand has been hindered by poor data throughout the pandemic. We introduce a robust model for predicting COVID-19 transmission and hospitalizations based on COVID-19 hospital admissions and cell phone mobility data. This approach was developed by a municipal COVID-19 task force in Austin, TX, which includes civic leaders, public health officials, healthcare executives, and scientists. The model was incorporated into a dashboard providing daily healthcare forecasts that have raised public awareness, guided the city’s staged alert system to prevent unmanageable ICU surges, and triggered the launch of an alternative care site to accommodate hospital overflow.

have shifted. Accounting for biases in our observational processes is critical to providing reliable situational awareness, investigating pandemic drivers and risks, and accurate forecasting. Case counts and test positivity can indicate changing risks, but are often biased by geographic and temporal variation in testing effort and priorities (27)(28)(29)(30). For example, when COVID-19 antigen tests were initially distributed for proactive screening in schools and long-term care facilities, some states reported the combined antigen and PCR test results, while others did not (31). While COVID-19 mortality counts are likely underreported (32), they are a high priority outcome of interest for national forecasting efforts (33) and may provide the most accurate but substantially delayed signal of past transmission (34,35). Often, case and mortality counts are analyzed jointly to reduce both delays and biases (4,34,36). COVID-19 healthcare data including hospital admissions, census, and ICU usage offer the fidelity of mortality data with a shorter lag, while also providing an immediate indication of healthcare resource needs. For example, COVID-19 hospitalizations have been used to estimate the impact of nonpharmaceutical interventions (37), provide healthcare demand forecasts (38)(39)(40)(41), and guide mitigation policies (42,43). However, such data can be biased by shifting demographics of COVID-19 patients, changes in admission criteria during surges, and the availability of post-acute care facilities (44,45). The municipal COVID-19 task force in the City of Austin, TX, developed a COVID-19 healthcare forecasting model that has guided regional pandemic responses since April 2020. The model is designed to provide robust, accessible, and holistic information about the changing pandemic situation. Using comprehensive COVID-19 hospital admissions and discharge data as well as cell phone GPS traces, the model estimates the impact of past policies and community behavior, real-time prevalence and transmission risks, and future COVID-19 hospitalizations and ICU needs. Here, we motivate our use of hospital admissions data by comparing the timeliness and fidelity of alternative indicators and then apply the model to characterize the first year of the COVID-19 pandemic in terms of the daily SARS-CoV-2 prevalence, transmission rate, case detection rate, and correlation between mobility and transmission. We then examine the impact of key policy and behavior shifts on these trends and retrospectively assess the performance of our 3-wk-ahead COVID-19 healthcare forecasts. These analyses led to two public dashboards, one tracking daily COVID-19 admissions from all area hospitals (46) and another providing COVID-19 healthcare forecasts (47). Both have been maintained since the spring of 2020 and continue to guide risk awareness, mitigation policies, and healthcare resource allocations in the fastest-growing large city in the United States, with a metropolitan area population approaching 2.3 million.

Results
A visual comparison of COVID-19 case counts, hospital admissions, hospital census, ICU census, and death counts in the Austin-Round Rock metropolitan statistical area (MSA) from March 13, 2020 through February 28, 2021 reveals persistent lags and different degrees of variability (Fig. 1A). Deaths tend to lag the other variables by several weeks; the three healthcare variables--hospital admissions, hospital census (which includes general and ICU patients), and ICU census--are smoother than case counts, with multiweek hospital stays causing the hospital census and ICU census to decline more slowly following peaks. Assuming that the goal of surveillance is to anticipate COVID-19 healthcare demand, we evaluate all variables in terms of the timing and strength of their correlation with COVID-19 healthcare usage indicators such as hospital and ICU census (Fig. 1B) Given the advantages of COVID-19 hospital admission data over the alternative indicators, we propose a forecasting model that uses admissions counts in combination with cell phone GPS data to estimate local transmission rates and project imminent healthcare surges (SI Appendix, Fig. S1). Specifically, we use particle filtering to fit an age-and risk-structured susceptibleexposed-infected-recovered (SEIR) model to daily reported COVID- 19  in exposure rates stemming from changes in policy and behavior, we assume that transmission rates depend on population mobility and simultaneously estimate time-dependent regression coefficients governing that relationship (Fig. 2). The model yields COVID-19 hospital admissions estimates that mirror the observed data in the Austin MSA from March 13, 2020 through February 28, 2021 ( Fig. 2A). We observe similar fidelity with respect to COVID-19 hospital census, ICU usage, discharge, and in-hospital mortality during the same time period (SI Appendix, Fig. S3 Fig. S4). Following the citywide closure of schools on March 13 and Stay Home-Work Safe order on March 24, 2020 (52, 53), the estimated reproduction number dropped to a temporary low of 0.91 (95% CrI: 0.65 to 1.3) on April 6 ( Fig. 2C). Although the reproduction number remained relatively flat through late April, the upper bound of the 95% CrI never fell below one. Following the White House's Opening Up America Again guidelines, Texas reopened in phases starting May 1, 2020 (54)(55)(56). Within weeks, the estimated SARS-CoV-2 transmission began to increase, reaching a peak of 1.7 (95% CrI: 1.3 to 2.0) on June 6. To curb rising hospitalizations, the City of Austin enacted a mask order and limited gathering sizes on June 15 (57). Statewide, Texas closed bars on June 26 and enacted mask orders and gathering limits on July 3 (58,59). The pandemic then slowed rapidly to the minimum detected Rt of 0.65 (95% CrI: 0.52 to 0.77) on July 19. Between mid-August and mid-October, the University of Texas opened, with an estimated 30,000 students in Austin participating in hybrid instruction (60); Austin Independent School District, with an enrollment of over 80,000 students, returned to optional in-person instruction (61); and bars were reopened statewide (62). During this period, the reproduction number steadily increased to a high of 1.3 (95% CrI: 1.0 to 1.5) on October 31 and likely remained at or above 1.0 until January 18, 2021, producing an alarming winter surge.
Since May 2020, the city has maintained a public-facing dashboard (46) that tracks the 7-d moving average of COVID-19 hospital admissions and provides clear threshold values for activating different alert levels, ranging from stage 1 (open) to stage 5 (lockdown) (63). According to these triggers, the city enacted stage 5 between June 26, 2020 and July 26, 2020 to mitigate the summer surge, and between December 23, 2020 and February 9, 2021 to mitigate the winter surge, with the COVID-19 ICU census peaking on January 12, 2021 at 190, just short of the estimated local capacity of 200 patients. Austin opened an alternative care site in a large convention center on January 9 and triggered the state's GA-32 order which restricted restaurant capacity and elective surgeries on January 10, after COVID-19 patients exceeded 15% of all hospitalized patients in the region for seven consecutive days (64)(65)(66). The estimated reproduction number declined throughout the stage 5 period, reaching a minimum of 0.65 (95% CrI: 0.5 to 0.9) on February 2.
Population mobility, as measured by the proportion of the day spent at home and numbers of visits to public points of interest, declined sharply during the spring 2020 shelter-in-place order, and then exhibited fluctuations that tracked local COVID-19 policies and epidemiological trends (Fig. 2B). After reducing the dimensionality of eight mobility variables via a principal components analysis, we find that the first principal component clearly reflects known holidays and other anomalous periods, including Thanksgiving, Christmas, and the catastrophic Texas winter storm of February 2021 which forced many residents to shelter in place (67). The academic calendars of the local K-12 school districts and the University of Texas at Austin are reflected in the changing frequency of visits to campuses but have little impact on the overall mobility trends reflected in the principal components analysis (SI Appendix, Fig. S6). Fluctuations in bar and restaurant visits likewise mirror changing COVID-19 restrictions.
When a community adopts precautionary measures that reduce transmission risks in public venues--like face masking, keeping physical distance, and proactive testing--the relationship between mobility and transmission may weaken; the same level of mobility may correspond to a lower level of transmission. When communities loosen such measures, the reverse may occur. We indirectly estimate changes in such precautionary behavior by simulating a counterfactual scenario in which the relationship between mobility and transmission is fixed at the level estimated from the 4 wk beginning on March 13, 2020, the day of the first reported hospital admission. By comparing the resulting hypothetical transmission rates to those originally observed, we estimate the changing relationship between mobility and transmission ( Fig. 2D). We estimate that, on February 14, 2021, mobilityassociated transmission was reduced by 62% (95% CrI: 52 to 68%) relative to early 2020.
We estimate that 15.9% (95% CrI: 15.6 to 16.4%) of the population had been infected by the end of February 2021, and validate these results with CDC seroprevalence estimates ( Fig. 2E) (48). The estimated prevalence of SARS-CoV-2, including asymptomatic infections, peaked at 0.8% (0.7 to 0.9%) in early January 2021 (SI Appendix, Fig. S9). We estimate the timevarying case detection rate by comparing predicted infections to observed case counts. The rate ranged from just under 25% in March 2020 to a peak of 70% in December 2020 (8) (Fig. 2F). On February 1, 2021, the city reported almost 6,000 previously unreported cases dating back several months; 2 wk later, reporting was largely suspended as a historic freeze brought the city to a halt (50,51).
Since May 29, 2020, we have used this model on a daily basis to provide 3-wk-ahead projections of COVID-19 healthcare demand on a dashboard that is widely used by local policy makers, healthcare systems, press, and the public (47). In retrospective validation, we find that 92.9%, 89.5%, and 87.9% of reported daily COVID-19 hospital census values fall within the 95% prediction intervals of our 1-wk-, 2-wk-, and 3-wk-out projections, respectively ( Fig. 3). For COVID-19 ICU data, the corresponding performance metrics are 89.7%, 88.1%, and 87.0%. Our models tend to overproject COVID-19 healthcare demand, particularly at pandemic peaks (Fig. 3, black tick marks). During the summer and winter peaks, the forecasts indicated that the city might exhaust local ICU capacity but not hospital general bed capacity.
We compare the forecasting performance of our model to three alternative models-a simple random walk (68), an automated autoregressive integrated moving average (ARIMA) model (68), and a simple version of our model which omits the mobility covariate (Fig. 4). The proportion of observed data points that fall within the 95% prediction intervals is highest for the nonmobility version of our model, across the 1-wk, 2-wk, and 3-wk forecasting horizons (SI Appendix, Fig. S10). Our full model performs on par with the ensemble model from the CDC's national COVID-19 healthcare forecasting hub (69) and outperforms the simpler random walk and ARIMA models (SI Appendix, Fig. S10A). The four models achieve comparable levels of error in their (median) point estimates (SI Appendix, Fig. S10B). However, these summary statistics do not reflect time-dependent performance differences among the models. Our full model offers the highest precision and accuracy during pandemic surges ( Fig. 4 and SI Appendix, Figs. S11 and S12). Although the two simple statistical models offer highly accurate (and precise) forecasts during periods of relative stability, they fail to predict exponential growth and rapid decline. Our model outperforms the nonmobility version in reducing uncertainty-providing narrower prediction intervals-particularly at critical epidemic change points.

Discussion
Through a unique collaboration between policy makers, public health officials, healthcare systems, and scientists in the Austin-Round Rock metropolitan area, we developed a flexible model for pandemic surveillance and healthcare forecasting that has guided local COVID-19 responses for over a year. Daily projections have contributed to key pandemic decisions, including enacting the initial Stay Home-Work Safe order (52), face mask mandates (57), and the launch of an alternate care facility to accommodate healthcare overflow (66). Throughout the pandemic, city leadership and local news organizations have regularly cited our model outputs to communicate risks and explain policy changes to the public (70)(71)(72).
Although early COVID-19 risk assessments and forecasts relied almost exclusively on COVID-19 case and mortality data, we find that COVID-19 hospital admissions provide a more accurate and timely indication of recent transmission and imminent healthcare usage. Given the average 5.2 d between infection and symptom onset and average 5.9 d from symptom onset to hospital admission, we expect hospital admissions data to lag infection by roughly 11 d to 12 d, although there is significant individual variation in the time course of infection (73,74). Case counts could provide a more immediate signal of incidence, if cases seek testing and receive rapid results immediately after or even before symptom onset. However, testing in the United States has been plagued by biases and delays throughout the pandemic, including restricted access (75,76), public health guidance to wait until after symptom onset (77), and chronic lags in laboratory processing and reporting (77,78). We expect case data to exhibit 11-to 12-d lags similar to hospital admissions data, given the sequence of delays from infection to symptom onset to test seeking to receipt of test results. A national survey in September 2020 suggested that cases seek tests an average of 2.5 d after first symptoms and wait an average of 3.7 d to receive results (78). Moreover, case count data have persistently exhibited racial, ethnic, and geographic biases due to differential testing access and availability (27). Thus, hospital admissions provide an equally lagged but potentially less biased signal of recent transmission than case data. Despite the utility of COVID-19 hospital admission counts, such data were not widely available in the United States until 9 mo into the pandemic (26). Part of the challenge is that COVID-19 status is not always known at the time of admission, particularly early in the pandemic, when diagnostic resources were limited (75). In Austin, hospitals occasionally updated admissions counts retroactively when SARS-CoV-2 confirmations were delayed.
We estimate that, early in the pandemic, the SARS-CoV-2 reproduction number (Rt ) reached 5.8 (95% CrI: 3.6 to 7.9). Although high, it is consistent with previously published estimates (79). Similar estimates in other cities have been attributed to superspreading events, which we do not explicitly model (80). We note that our estimate is sensitive to the timing of COVID-19 emergence in Austin. If we assume that the initial case arrived on January 20, 2020 rather than February 19, 2020 (which is based on the timing of the first COVID-19 hospital admission), then we estimate a maximum R t of 4.5 (95% CrI: 3.0 to 6.4). However, the estimates quickly converge after March 13, 2020, when COVID-19 healthcare data become available (SI Appendix, Fig. S5).
We estimate that the case detection rate has been highly variable, ranging from less than 20% of cases reported at the outset to well over half reported since early 2021. This variation likely reflects evolving testing priorities, technologies, and access, as well as changes in test seeking behavior driven by fear and effective public health communications (81). However, these citywide averages do not capture demographic and geographic heterogeneity in testing behavior (27,81). For example, children are much less likely to develop symptoms and seek testing than adults, although some private schools have mandated weekly or more frequent testing of all students and staff. The University of Texas at Austin population is similarly overrepresented in the citywide testing data, with their proactive testing program screening an average of 340 students and faculty per day during the 2020-2021 academic year (82).
Our retrospective estimates of COVID-19 infections in Austin are consistent with seroprevalence data (48). Just prior to the summer 2021 emergence of the Delta variant in Austin, we estimated that just under 20% of the Austin-area population had been infected and 58% of adults over age 16 y had received at least one dose of a SARS-CoV-2 vaccine (83,84). As vaccine uptake counterbalances increased transmissibility of COVID-19 variants, our model can be used to continually monitor local transmission dynamics. Going forward, forecasting models like ours must integrate the dynamics of infection-acquired and vaccine-acquired immunity against wild-type and variant SARS-CoV-2 viruses.
Our forecasting model performs well in comparison to simpler mechanistic and nonmechanistic statistical models. Although the four models considered achieve comparable coarse-grained performance statistics, our mobility-driven mechanistic model provides the best combination of accuracy and precision surrounding pandemic surges, when reliable forecasts are particularly important for effective healthcare provisioning, public health responses, and general risk awareness. Removing the mobility covariate from our model significantly increases forecasting uncertainty. Although this increases coveragethe proportion of observed values falling within prediction intervals-it significantly reduces the informativeness and public health utility of the forecasts. Since May 2020, our model projections have informed numerous time-sensitive policy decisions and response actions, including resource planning by local hospitals, urgent requests to state and federal agencies for additional surge resources, the launch and dismantling of alternative care sites to provide additional healthcare capacity, and numerous changes in the Austin-area COVID-19 alert stage to communicate and manage rising and declining risks (43).
In March 2020, we faced an unexpected technical challenge. Prior to the COVID-19 pandemic, most models of respiratory virus transmission assumed that daily contact patterns would be fairly stable. The simplest models assumed that populations are entirely homogeneous and well mixed, others incorporated age-specific contact patterns from diary-based surveys (85) or inferred from epidemiological data (86), and still others assumed complex networks of interactions based on sociological data sources (87,88). The nationwide shelter-in-place orders broke these assumptions. The cell phone mobility data provided by SafeGraph and other technology companies provided an immediate and valuable window into changing behavioral patterns (19). Early in the pandemic, cell phone GPS data reflected COVID-19 policies and correlated with transmission rates (18,89). Our model comparison-with and without mobility datafurther suggests that mobility data can provide an immediate and reliable indication of changing risk behavior. However, the relationship between mobility and transmission can evolve as communities adopt and relax precautionary behavior. To capture this, we estimated a coefficient that relates daily mobility to daily transmission rates in Austin. The data suggest that mobilityassociated risks of transmission initially declined in the spring of 2020, then spiked following the White House's Opening Up America Again campaign, and slowly increased between August and the end of 2020. As novel sources of behavioral information become available, such as more granular mobility trends (90), Bluetooth-enabled contact tracing records (91), or self-reported face covering usage (22), we should carefully consider and (if possible) explicitly model the observational processes used to collect the data and the behavior dynamics that shape them.
Our retrospective analysis of the Austin experience provides anecdotes regarding the impact of COVID-19 policies on risks. Notably, the statewide reopening in May 2020 appeared to fuel the major summer wave. The constellation of policy relaxation, behavioral fatigue, return to school, and winter holidays preceded the winter surge. Recent studies have quantified the impact of restaurant and bar restrictions, school closures, and mask mandates on local SARS-CoV-2 transmission (92-94). Our study of the COVID-19 pandemic in Austin does not disentangle the relative impacts of such measures but provides an intuitive case study for the dynamic interplay between public policy, human behavior, and viral transmission.
Throughout the pandemic, we have applied this model to provide estimates of key COVID-19 indicators and month-ahead hospitalization forecasts. In April 2020, we started by providing model-based projections at the city task force meetings multiple times per week. By June 2020, we had automated the data processing and statistical fitting procedures and launched a public-facing dashboard (47). The choice of indicators and plotting formats were honed through months of engagements with city leadership and local media. The Austin-Round Rock MSA dashboard provides the daily reproduction number with 95% CrIs, the probability that the pandemic is in a growth phase (that is, the probability that the reproduction number is above one), and the 14-d change in incidence as a percent (SI Appendix, Figs. S14 and S15). It also includes time-series graphs for COVID-19 hospital admissions, hospital census, and ICU census, each of which displays data from the beginning of the pandemic and spaghetti plot forecasts, which convey uncertainty by depicting 100 distinct stochastic projections. This visually communicates that qualitatively different futures may be equally likely, and emphasizes the considerable uncertainty we have faced throughout the pandemic stemming from data quality issues and our inability to anticipate changes in behavior and government policies. Our retrospective performance evaluation revealed that the 95% prediction intervals do not capture 95% of the future data. Specifically, the model failed to predict the rapid deceleration of transmission leading to the peaks observed in July and January. One possible explanation is unmodeled feedback from the system (Austin) to the model, as suggested in prior COVID-19 forecasting studies (95). As COVID-19 hospitalizations climbed, city leadership enacted stricter policies and aggressively communicated the pessimistic forecasts to the public to encourage precautionary behavior and curb transmission. Indeed, our largest prediction errors are clustered around the two pandemic peaks, shortly after Austin transitioned to its most restrictive COVID-19 alert stage. The model does not directly or immediately capture such policy and behavioral changes but rather estimates their effects, with delay, from mobility and hospitalization data. Our COVID-19 forecasting successes and failures will likely inspire a new generation of epidemiological models that include mechanistic behavioral dynamics, organizational decision-making, and feedback between sociological and epidemiological dynamics. Through discussions with the city's COVID-19 task force, media outlets in central Texas, local school districts and universities, major hospital systems, and community organizations, we believe that the dashboard has served as a trusted, daily touchstone for the leadership and residents of Austin, TX. For example, the modeling informed decisions to enact the city's Stay Home-Work Safe order in March 2020, the design of the staged alert system that has guided policy since May 2020 (43), the provisioning of hotel rooms as isolation facilities for populations experiencing homelessness and university students living in congregant settings (96), and the launch of an alternative care site at the convention center to accommodate healthcare overflow, as well as reopening policies by universities and schools throughout the city (97). Arguably, the primary value of this effort has been providing a common, predictive understanding of the changing risks, even when the forecasts have been imperfect.
We note three key limitations of our model. First, we do not consider superspreading events, which could lead our model to underestimate future risks, particularly if a superspreading event occurs in a long-term care facility (98). Our model likely captures the potential for sudden transmission rate changes from superspreading events; however, mechanistically incorporating such dynamics could increase the precision of our projections. Second, we assume that Austin is a well-mixed population, and thus ignore important heterogeneities such as long-term care facilities (99) and the extreme east-west segregation of the city, with majority-Latino communities experiencing much higher rates of infection and severe outcomes than the majority-White communities (100-104). Incorporating such heterogeneity for Austin and carefully adapting such assumptions to other cities could substantially improve projections and inform more strategically targeted mitigation efforts. Finally, our estimates for SARS-CoV-2 incidence are sensitive to the assumed infection hospitalization rates, which vary across age and health subgroups and remain uncertain (37,105). Incorporating uncertainty in these parameters would yield wider and, arguably, more reasonable credibility intervals around our estimates for SARS-CoV-2 incidence and case reporting rates. As better data become available, through serological surveys and prospective studies, these parameters can be readily updated.
Immediate, reliable, and comprehensive access to SARS-CoV-2 hospitalization, vaccination, and molecular surveillance data--all of which are collected in electronic databases throughout the United States--is critical for real-time risk assessments, reliable forecasting, and, most important, effective decision-making by individuals, organizations, and government agencies. Translating such data into interpretable indicators and accessible graphs can improve coordination among stakeholders and encourage public buy-in. Our model is designed to provide such retrospective insight and actionable guidance for the public and policy makers in communities throughout the United States.

Materials and Methods
Epidemiological Model. We use an age-and risk-structured SEIR model that incorporates asymptomatic and symptomatic transmission, hospitalization, and mortality. The demographic and risk structure are based on estimates for the Austin-Round Rock MSA (SI Appendix, Fig. S2 and Tables S4-S6), and the natural history of SARS-CoV-2 follows published estimates (SI Appendix, Tables S1-S3). Transmission rates are driven by regional mobility, and the governing relationship between mobility and transmission is allowed to change daily to reflect the dynamic impacts of policy and behavior. The hospital stay duration is also allowed to vary as standards of care and healthcare strain impact the COVID-19 hospital experience (106,107).
The model structure is diagrammed in SI Appendix, Fig. S1, and we present the stochastic formulation in the equations below. For each age and risk group, we build a separate set of compartments to model the transitions between the states: susceptible (S), exposed (E), presymptomatic infectious (P Y ), preasymptomatic infectious (P A ), symptomatic infectious (I Y ), asymptomatic infectious (I A ), symptomatic infectious that are hospitalized (I H ), recovered (R), and deceased (D). The symbols S, E, P Y , P A , I Y , I A , I H , R, and D denote the number of people in that state in the given age/risk group, and the total size of the age/risk group is Transitions between compartments are governed using the tau-leap method (108,109) with key parameters given in SI Appendix, Tables S1-S3. The stochastic model for individuals in age group and risk group is given by

Fox et al.
Real-time pandemic surveillance using hospital admissions and mobility data PNAS 7 of 11 https://doi.org/10.1073/pnas.2111870119 where B(n, p) denotes a binomial distribution with n trials each with probability of success p; γ A , γ Y , and γ H (t) are the recovery rates for the I A , I Y , and I H compartments, respectively; σ is the exposed rate; ρ A and ρ Y are the pre(a)symptomatic rates; τ is the symptomatic ratio; π is the proportion of symptomatic individuals requiring hospitalization; η is the rate at which hospitalized cases enter the hospital following symptom onset; ν is mortality rate for hospitalized cases; and μ(t) is the daily instantaneous rate at which terminal patients die. Fa,r denotes the force of infection for individuals in age group a and risk group r and is given by where A and K describe the age and risk groups, respectively; ω A , ω Y , ω PA , and ω PY are the relative infectiousness of the I A , I Y , I PA , and I PY compartments, respectively; and φ a,i is the mixing rate between age group a and age groups i ∈ A. We define the time-dependent transmission rate β(t) as a function of mobility as and PC2 describe the first and second principal components from our mobility data as described below, ψ = 0.97, and N(μ N , σ N ) denotes a normal distribution with mean of μ N and SD of σ N . Finally, we allow the duration in the hospital for individuals who survive, γ H (t), and those who pass away, μ(t), to vary in time as where Zμ(0) = 0, Zγ(0) = 0, and ψγ = 0.99. To run the SEIR model without mobility, we set PC1(t) = 0 and PC2(t) = 0 for all t, so β(t) = β(0) · e Z(t) .
Mobility Trends. We used mobility trends data from the Austin MSA to inform the transmission rate in our model. Specifically, we ran a principal component analysis (PCA) on eight independent mobility variables provided by SafeGraph (19), including 1) home dwell time and visits to 2) universities, 3) bars, 4) grocery stores, 5) museums and parks, 6) medical facilities, 7) schools, and 8) restaurants. All metrics are provided at the census block group (CBG) and aggregated to the five county metropolitan regions (Bastrop, Caldwell, Hays, Travis, and Williamson Counties). For each CBG, SafeGraph provides the daily average home dwell time and number of reporting devices. We estimate average home dwell time in the MSA by averaging across CBGs weighted by the number of reporting devices. For all other visitation metrics, we sum the total visits for the specific indicator across all CBGs within the MSA. We baseline each metric according to prepandemic mobility by calculating the average value for the metric in the MSA from January and February of 2020 and dividing all subsequent values of that metric by the prepandemic baseline. We carry out a PCA on the eight baselined metrics using all data up to the day the projections are made, which captures almost as much variation in mobility as a more granular sliding window PCA (SI Appendix, Fig. S7). We use the first two principal components as covariates for a regression as described in the modeling equations for β(t). Daily 7-d averages for the raw mobility data can be seen in SI Appendix, Fig. S6.
Model Fitting. We obtained daily hospital admit, discharge, census, and death data for the Austin MSA from Austin Public Health. We assumed all sources of data were negative binomially distributed around their predicted values from the SEIR stochastic model with dispersion parameter k. We chose informative but relatively dispersed priors for certain parameters for stability in parameter estimation and to prevent the model from overfitting data through large perturbations to time-dependent variables. A full explanation of the likelihood for the model can be found in SI Appendix. We estimated ψμ, σμ, and σγ and fixed the remaining parameters as described in SI Appendix, Tables S1-S3. Fitting was carried out using the iterated filtering algorithm made available through the mif2 function in the pomp package in R (110)(111)(112). This algorithm is a stochastic optimization procedure; it performs maximum likelihood estimation using a particle filter to provide a noisy estimate of the likelihood for a given combination of the parameters. For each parameter combination, we ran 300 iterations of iterated filtering with a cooling fraction of 50% every 60 steps, each with 3,500 particles. This iterated filtering was run 50 times, and the maximum likelihood estimate (MLE) among these 50 was selected. We calculated smoothed posterior estimates for all of the states within the model through time (including β(t) and other time-dependent parameters which are technically state variables in our model formulation). We estimated these smoothed posteriors as follows: 1) We ran 1,000 independent particle filters at the MLE, each with 2,500 particles. For each run, l, of particle filtering, we kept track of the complete trajectory of each particle, as well as the filtered estimate of the likelihood, L l . 2) For each of the 1,000 particle filtering runs, we randomly sampled a single complete particle trajectory, giving us 1,000 separate trajectories for all state variables. 3) We resampled 1,000 trajectories from these 1,000 trajectories with probabilities proportional to L l to give a distribution of state trajectories.
The result can be thought of as an empirical Bayes posterior distribution; that is, a set of 1,000 smoothed posterior draws from all state variables, conditional on the MLEs for the model's free parameters. This smoothed posterior distribution is how we calculate summary statistics for our timevarying state variables. Our estimates for β(t) are converted to R(t) estimates as described below, and model estimates for the instantaneous discharge rates for surviving (γ H (t)) and dying (μ(t)) patients can be found in SI Appendix, Fig. S13.
Making Projections. Our model fitting procedure provides MLE for all of the key parameters (e.g., the SD governing the random walk of the transmission rate) in the model alongside smoothed posteriors for the state variables (e.g., the number of individuals in each compartment of the model or the daily transmission rate). We sample from the smoothed posterior distribution to obtain a distribution of initial state conditions for the projections. We initialize 1,000 projections with those initial state conditions and run the stochastic model forward according to the MLE of the fixed parameters. In this way, we capture two sources of uncertainty in our parameter estimates: 1) uncertainty in the underlying state of the community at the time the projections are made and 2) uncertainty in how behavior might change in the future as captured by the random walk function in our transmission rate.
Projection Model Comparison. We compare projections from the SEIR epidemiological model with projections from statistical null models provided by the forecast package in R (68). For the random walk model, we use an ARIMA model of order (p = 0,d = 1,q = 0) (68), and we use the Hyndman-Khandakar algorithm for automatically determining the order of an ARIMA model for the Auto ARIMA model (68). We fit the models to all available data up to the date the projection is made, and project forward with the fitted model.

Time-varying reproduction number (Rt).
To estimate the time-varying reproduction number (Rt), we apply the next-generation method to our daily estimated smoothed posterior distributions for β(t) with the MLE values of the estimated parameters and the fixed parameters listed in SI Appendix, Tables S1-S3 (113). Reporting rates. We estimate the reporting rates by comparing our estimates for daily incidence with daily reported cases counts for the Austin MSA (Bastrop, Caldwell, Hays, Travis, and Williamson Counties) as provided by The New York Times (8). To roughly estimate changing reporting rates, we lag the case data by 11 d to account for the lag between infection and case reporting (73,78). In estimating the maximum and minimum reporting rates, we exclude case data for February 2021, because reporting was impacted by and mobility data a severe weeklong winter freeze and the reporting of a large number of backlogged cases (49)(50)(51). Estimating Austin COVID-19 seroprevalence. COVID-19 seroprevalence estimates are not available for the Austin metropolitan region, but the CDC has conducted biweekly Texas seroprevalence estimates since the summer of 2020 (48). We adjust the Texas seroprevalence estimates to account for the heterogeneous burden of the pandemic across the state. Specifically, we assume that Austin seroprevalence can be estimated as where Itexas is the seroprevalence estimate provided by the CDC for the state of Texas, and D indicates the per capita mortality rate for Austin or the state of Texas as provided by The New York Times (8). As carried out in ref. 48, we shift all time-dependent estimates to their corresponding date of infection, so seroprevalence estimates are shifted to 7 d before the first sampling day to account for the time it takes to become seropositive following infection, and mortality date are shifted 20 d to account for the delay between infection and mortality (114). We then compare the corrected estimate for I austin (t) with the daily cumulative estimated infections from the model.

Estimating the time-varying relationship between mobility and transmission.
Our model estimates the time-varying transmission rate as with b 1 (t) and b 2 (t) governing the relationship between the mobility data and the transmission rate. Since transmission is governed by a combination of b 1 (t), b 2 (t), and Z(t), an increase in one may be compensated by a decrease in another without significantly changing the overall transmission rate. Thus, we cannot easily estimate the contribution of each in isolation. Instead, we estimate the time-varying relationship between mobility and transmission through a comparison between our fitted model and a counterfactual scenario where b 1 (t), b 2 (t), and Z(t) are fixed at their average initial estimated values (b 1 ,b 2 , andZ). Specifically, we estimateb 1 ,b 2 , andZ as the average for the respective parameters over the first 4 wk of hospitalization data from the fitted model (from March 13, 2020 to April 10, 2020), and calculate the expected transmission rate based on this initial relationship and subsequent mobility data as β (t) = β(0) · eb 1 ·PC1(t)+b 2 ·PC2(t)+Z .
β (t) can be thought of as the counterfactual transmission rate if the initial relationship between mobility and transmission remained constant over the course of pandemic. We estimate the reduction in mobility-driven transmission that is unexplained by mobility levels as We provide a point estimate for the overall reduction in mobilitytransmission risk on February 14, 2021 relative to early in the pandemic, and provide a sensitivity analysis with respect to the start and duration of the baseline period (SI Appendix, Fig. S8).
Data Availability. All code and healthcare time-series data used in this study are publicly available and have been deposited in GitHub (https://github.com/UT-Covid/SEIR-Austin).