Seasonality and uncertainty in COVID-19 growth rates

The virus causing COVID-19 has spread rapidly worldwide and threatens millions of lives. It remains unknown if summer weather will reduce its continued spread, thereby alleviating strains on hospitals and providing time for vaccine development. Early insights from laboratory studies of related coronaviruses predicted that COVID-19 would decline at higher temperatures, humidity, and ultraviolet light. Using current, fine-scaled weather data and global reports of infection, we developed a model that explained 36% of variation in early growth rates before intervention, with 17% based on weather or demography and 19% based on country-specific effects. We found that ultraviolet light was most strongly associated with lower COVID-19 growth rates. Projections suggest that, in the absence of intervention, COVID-19 will decrease temporarily during summer, rebound by autumn, and peak next winter. However, uncertainty remains high, and the probability of a weekly doubling rate remained >20% throughout the summer in the absence of control. Consequently, aggressive policy interventions will likely be needed in spite of seasonal trends.

medRxiv preprint, this version posted April 22, 2020. doi: https://doi.org/10.1101/2020.04.19.20071951. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

However, these analyses relied on the early stages of viral spread before the epidemic had reached warmer regions and thus potentially conflated weather with initial emergence and global transport.

We estimate how weather affects COVID-19 growth rate using data through April 13th, 2020 by applying methods that improve model predictive accuracy, incorporate uncertainty, and reduce biases. Based on emerging evidence, we developed several a priori predictions about how weather, either directly or indirectly via modified human behaviors (e.g., aggregating indoors), affects COVID-19 growth rate. Preliminary research on SARS-CoV-2 (9, 10, 12) and related viruses (8, 13) predicted that COVID-19 growth would peak at low or intermediate temperatures. However, other coronaviruses demonstrate weak temperature dependence, instead depending on social or travel dynamics (7). High humidity also might decrease viral survival, limit transmission of expelled viral particles, or decrease host resistance (13-15). Ultraviolet light effectively kills viruses such as SARS-CoV-1 (16), and thus sunny days might decrease outdoor transmission or promote immune resistance via vitamin D production (17). We also evaluate demographic variables, assuming greater transmission in denser and older (>60) populations.

We modeled maximum growth rate of COVID-19 cases to estimate contributions from underlying climate and population dependencies without healthcare interventions (e.g., social distancing). Hence, we restrict analyses to the early growth phase before interventions reduced transmission, but after community transmission began, when the vast majority of the population was still susceptible to this novel virus. We estimated the average maximum growth rate (λ) as the exponential increase of cases, λ = (ln(Nt) - ln(N0))/t, where Nt = cases at time t and N0 = initial cases, for the three worst weeks in each political unit (country or state/province, depending on available data (3)), where t = 7 days (see Supplementary materials for additional periods).
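The growth-rate definition above can be illustrated with a minimal sketch. The function name `growth_rate` and the case counts are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
import math

def growth_rate(n_start: float, n_end: float, days: int = 7) -> float:
    """Average exponential growth rate per day: (ln(N_t) - ln(N_0)) / t."""
    if n_start <= 0 or n_end <= 0:
        raise ValueError("case counts must be positive")
    return (math.log(n_end) - math.log(n_start)) / days

# Hypothetical counts: 100 cumulative cases growing to 200 over one week.
lam = growth_rate(100, 200)
print(f"lambda = {lam:.3f} per day")
```

In this made-up example cases double in one week, so λ equals ln(2)/7.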
Testing and reporting of COVID-19 likely vary across political units. However, estimated growth rates should remain robust to these biases assuming detection probabilities remain constant during the short, one-week estimation period. We restricted analyses to locations with >40 cases to eliminate periods before local community transmission. Applying these criteria, we used data from 128 countries and 98 states or provinces.

We applied Bayesian Markov Chain Monte Carlo methods with uninformative priors to estimate parameters. We obtained daily infection data from (3) and 3-hour weather data from the ERA5 reanalysis for the 14 days preceding case counts, consistent with the 1-14 day infective period (18). We used fine-scaled weather data rather than long-term climatic monthly means to model observed weather-outbreak dynamics. Weather data were weighted by population size in each 0.25° grid cell within each political unit to capture the weather most closely associated with outbreaks in population centers. We used leave-one-out cross-validation to choose the best models, which ranks models by their predictive accuracy on excluded data. We included a random country effect to account for differences in national control response times, health care capacity, testing rates, and other characteristics intrinsic to country of origin.
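The population weighting of gridded weather described above might look like the following sketch. It assumes each political unit is represented by parallel arrays of grid-cell weather values and grid-cell population counts; the function name and the numbers are hypothetical, and the study's actual processing pipeline is not shown in the text.

```python
import numpy as np

def population_weighted_mean(values, population):
    """Average a weather variable over grid cells, weighting by population."""
    values = np.asarray(values, dtype=float)
    population = np.asarray(population, dtype=float)
    if values.shape != population.shape:
        raise ValueError("values and population must have the same shape")
    total = population.sum()
    if total <= 0:
        raise ValueError("total population must be positive")
    return float((values * population).sum() / total)

# Hypothetical polity with three 0.25-degree cells (temperatures in deg C).
temps = [10.0, 20.0, 30.0]
pops = [0, 1000, 3000]
weighted = population_weighted_mean(temps, pops)
print(weighted)
```

An unpopulated cell contributes nothing, so the result leans toward conditions in the population centers rather than the simple spatial mean of 20.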

The best model for predicting maximum COVID-19 growth rate explained 36% of the variation in COVID-19 growth rates (Fig. 1), and 17% excluding country effects. This model included maximum daily ultraviolet light, mean daily temperature, proportion elderly, and mean daily relative humidity (Fig. 2A [...] were not an artifact of this correlation; see Supplementary Materials). As expected, relative [...] survival outside humans or reducing airborne transmission (βhumid = -0.05, 95% CIs: -0.11, 0.00).

Absolute humidity was strongly correlated with temperature (r = 0.88) and thus could be exchanged with temperature with little difference in model performance. Contrary to predictions, the proportion of elderly decreased COVID-19 growth rate (βpopsize = -0.07, 95% CIs: -0.14, -0.00), most likely due to outbreaks in developed countries with older populations. Population [...]

[...] pathogens before they reach an equilibrium distribution with climate. Initial climate associations with viral outbreaks will first correlate with the narrower range of climatic variation found at the emergence site and then in global transportation hubs, rather than reflecting ultimate biological limits on growth and survival. We recognize that future data could alter our predictions further, especially as COVID-19 becomes endemic (15). However, less variable model predictions and exposure to the most common global climates by April (Fig. 3) suggest that model predictions [...]

Using our model, we predicted potential COVID-19 growth rates in the upcoming months relative to a weekly doubling rate (λ=0.1; Fig. 4). Due mostly to variation in UV and temperature, our model predicts that COVID-19 risk will decline across the northern hemisphere this summer, remain active in the tropics, and increase in the southern hemisphere as [...]

The overall conclusion is that although COVID-19 might decrease temporarily during summer, there is still a moderate probability that it is weakly affected by summer weather, and that it could return in autumn and pose increasing risks by winter.
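The weekly doubling threshold of λ = 0.1 follows from the exponential growth definition, since cases double when λt = ln(2). The short sketch below shows that arithmetic; the helper name `doubling_time_days` is an assumption introduced for illustration.

```python
import math

def doubling_time_days(lam: float) -> float:
    """Days needed for cases to double at a per-day exponential rate lam."""
    if lam <= 0:
        raise ValueError("doubling time is only defined for positive growth")
    return math.log(2) / lam

# Per-day growth rate at which cases double every 7 days.
weekly_doubling_lambda = math.log(2) / 7
print(round(weekly_doubling_lambda, 3))
print(doubling_time_days(weekly_doubling_lambda))
```

The exact value is ln(2)/7, which rounds to the 0.1 used as the reference rate in the text.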

Our predictions were robust to the manifold decisions made regarding data and model structure. We explored the consequences of using different parameter comparisons, shorter (7-day) time windows for aggregating weather data, different cut-offs for the minimum number of cases, varying numbers of weeks analyzed per political unit, the first versus the worst weeks following the infection threshold, and weather maxima and minima instead of means. We found no qualitative changes to results, with the exception that maximum daily UV during a 14-day interval substantially outperformed the mean (Supplementary materials). We also explored the effect of excluding data from China, which lacked data prior to control measures in many cases, and found similar results. [...]

We demonstrate that COVID-19 growth rate increases with reduced ultraviolet light, higher temperatures, and lower relative humidity. We predict that COVID-19 will oscillate between the northern and southern hemisphere, based largely on seasonal variation in UV radiation and temperature without continuing interventions like social distancing. Despite a possible, but highly uncertain, temporary summer reprieve in the north, COVID-19 is more likely to return by autumn and threaten further outbreaks. The north should take this time to build resilience against future outbreaks, while assisting countries in the tropics and southern hemisphere. Uncertainty remains high, however, so we urge caution when making decisions such as removing societal interventions before more permanent pharmaceutical solutions can be implemented. Overcoming this pandemic will take extensive global collaborative scientific efforts to unravel its biology as well as the continuing resolve of people worldwide adhering to social restrictions.

[...] University for providing COVID-19 data.

Overview

We examined the weekly rate of increase in the number of COVID-19 infections as a function of [...]

To capture periods when the spread rate was most severe, we focused our model on the worst three (also two, four; see sensitivity analysis) weeks in each political unit based on the magnitude of lambda. We were primarily concerned about high rates of spread, and their possible drivers, so this decision controls for differences among polities in the onset of severe spread and differences in the timing of control measures that may reduce growth. Hence, maximum growth rates provide the best estimate of COVID-19 growth in the absence of control measures and are the rates most likely to be influenced by weather. In sensitivity analyses, we also considered using the first 2, 3, or 4 weeks following t0, and found similar, but noisier, results, owing to the likely variation among countries in the early rates of spread (e.g., in Thailand, growth was initially low before increasing rapidly).
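One way to pick the worst weeks is sketched below. It assumes a daily series of cumulative cases, non-overlapping 7-day windows starting at t0 (the first day with more than 40 cases), and ranking by lambda; the windowing scheme, the function name `worst_weeks`, and the case counts are assumptions made for illustration, not the study's code or data.

```python
import math

def worst_weeks(cumulative_cases, k=3, threshold=40, days=7):
    """Return the k largest weekly growth rates after the case threshold is passed."""
    # t0: first day on which cumulative cases exceed the threshold.
    t0 = next((i for i, n in enumerate(cumulative_cases) if n > threshold), None)
    if t0 is None:
        return []
    rates = []
    # Non-overlapping windows of `days` days, starting at t0.
    for start in range(t0, len(cumulative_cases) - days, days):
        n0 = cumulative_cases[start]
        nt = cumulative_cases[start + days]
        rates.append((math.log(nt) - math.log(n0)) / days)
    return sorted(rates, reverse=True)[:k]

# Hypothetical cumulative counts at days 0, 7, 14, 21, 28 (held flat in between).
weekly_totals = [50, 100, 400, 600, 660]
series = []
for total in weekly_totals[:-1]:
    series += [total] * 7
series.append(weekly_totals[-1])

top = worst_weeks(series)
print([round(r, 3) for r in top])
```

In this made-up series the slowest of the four weeks is discarded, which mimics how a focus on the worst weeks drops periods when growth had already slowed.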

Weather data

Weather data were aggregated from 3-hourly data downloaded from the ERA5 model by [...]

Based on existing insights about SARS-CoV-2 and the onset of COVID-19, we considered the following weather variables: temperature 2 m above the land surface, relative humidity, absolute humidity, and total incoming UV radiation at the land surface. To align the weather data with infection data for a given political unit, we determined the first day (t0) when more than 40 [...]

Finally, note that we also explored the use of minimum and maximum values of weather variables to account for the possibility that transmission was more likely driven by extreme weather rather than average weather. We also considered using weekly rather than biweekly intervals to reflect the possibility of shorter incubation periods. Outcomes were robust to these decisions.

Previous studies have noted that the coarse spatial grain of infection data (country or state level) [...]

Population data

We obtained human population data from Worldpop.org, focusing on total human population (density) and proportion of the population over age 60. Population density was hypothesized to control for the number of interactions individuals in a location were likely to experience, whereas the proportion of people over 60 in a polity was hypothesized to control for reporting rate, given that older people are more adversely affected by the disease and thus more likely to be tested.

Data were obtained at 1 km resolution and summed to the quarter-degree grid imposed by the weather data. Polity information was obtained based on global standards (GADM.com). Each quarter-degree grid cell was assigned to a polity and cells were averaged over the polity.

We focus on the growth rate of COVID-19 cases, λ, rather than estimating a climate niche for the virus based on its presence or absence or total number of cases, as explored in preliminary studies (11), to avoid issues with disequilibrium in the virus' distribution. We focused on estimating the rate of increase (λ) of infected individuals, rather than directly modeling the number of infected individuals, in order to minimize the influence of different reporting biases in different polities. We calculated λ as λ = (ln(N(t)) - ln(N(t0)))/t, where t was taken to be 7 days and t0 defined the start date for counting infections. This formulation is independent of reporting bias under the assumption that the reporting bias is constant over the 7-day interval. To see this, consider that the true number of infected individuals N* is related to N via the proportion of cases reported, p, such that N = pN*. Substituting this expression for N into the expression for λ, it is apparent that p cancels out. Hence, so long as p is approximately constant across a 7-day interval, it does not affect the estimate of growth rate.
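The cancellation of the reporting proportion can be written out step by step. The derivation below uses only the symbols defined in the preceding paragraph and assumes, as stated there, that p takes the same value at t0 and at the end of the 7-day interval.

```latex
\begin{aligned}
\lambda &= \frac{\ln N(t) - \ln N(t_0)}{t}
         = \frac{\ln\!\big(p\,N^{*}(t)\big) - \ln\!\big(p\,N^{*}(t_0)\big)}{t} \\
        &= \frac{\ln p + \ln N^{*}(t) - \ln p - \ln N^{*}(t_0)}{t}
         = \frac{\ln N^{*}(t) - \ln N^{*}(t_0)}{t}.
\end{aligned}
```

If p instead changed during the interval, a residual term (ln p(t) - ln p(t0))/t would remain, which is why the constancy assumption matters.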

We modeled λ with a hierarchical Bayesian Gaussian regression with a log link on the weekly transmission rate. The full model included mean 14-day lagged temperature, mean 14-day lagged relative humidity, mean 14-day lagged absolute humidity, mean 14-day lagged UV, human population density, and proportion of the population over 60. We used linear terms for all variables but also considered a quadratic term for temperature based on suggestions of modality in previous studies (11-13). Based on sensitivity analyses discussed below, we found that maximum daily UV was a considerably better predictor than the mean (delta LOOIC = 4.x), so we used the maximum in our best model. Country-level random effects were used to capture differences in policies, health care, or other locally specific behaviors. We also explored state/province-level random effects (where applicable), but country-level effects performed considerably better in all models explored based on model selection criteria.
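One plausible way to write the regression described above is sketched here. The notation is an assumption introduced for illustration: i indexes weekly observations, j indexes countries, x is the vector of predictors, and α, β, σ, τ, and u are symbols not used in the original text. The priors are not specified in this section beyond being uninformative, so none are written.

```latex
\begin{aligned}
\lambda_{ij} &\sim \mathrm{Normal}\!\left(\mu_{ij},\ \sigma^{2}\right), \\
\log \mu_{ij} &= \alpha + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_{j}, \\
u_{j} &\sim \mathrm{Normal}\!\left(0,\ \tau^{2}\right).
\end{aligned}
```

Under this reading, each coefficient in β acts multiplicatively on the expected growth rate, and the country effect u shifts that expectation up or down for all observations from the same country.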

Model selection

We were interested in developing models with high predictive ability. Thus, we performed [...] and is especially appropriate when the objective is prediction (14).

Model selection was performed by starting with the full model and using forward and backward stepwise selection. The full model regressed the growth rate over a one-week window against linear terms for mean temperature, mean UV, mean relative humidity, mean absolute humidity, population density, and proportion of the population over 60. We included a quadratic term for temperature based on earlier studies suggesting a decline in growth rate with temperature. We also included an interaction term between temperature and UV to account for their correlation.

All these variables were calculated in the 7-day windows preceding the interval used to calculate growth rate. During stepwise selection, we note that there were no cases of parameters trading off with one another and that coefficients for each predictor always retained the same sign and approximate magnitude regardless of which other predictors were in the model. The only exception to this was when UV was excluded from a model that included temperature; the temperature effect dropped from positive to near zero. Hence, it is important to interpret the positive effect of temperature in our best model as the effect of temperature only after UV has been included in the model.
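The idea of stepwise selection scored by leave-one-out prediction can be sketched in a simplified setting. Note the substitutions: this example uses ordinary least squares with its exact leave-one-out error and backward elimination only, not the hierarchical Bayesian model, LOOIC, or combined forward and backward search described above. The predictor names and simulated data are hypothetical.

```python
import numpy as np

def loo_mse(X, y):
    """Exact leave-one-out mean squared error for OLS with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    H = A @ np.linalg.pinv(A)                 # hat (projection) matrix
    resid = y - H @ y
    return float(np.mean((resid / (1.0 - np.diag(H))) ** 2))

def backward_select(X, y, names):
    """Drop predictors one at a time while leave-one-out error improves."""
    keep = list(range(X.shape[1]))
    best = loo_mse(X[:, keep], y)
    while len(keep) > 1:
        scores = [(loo_mse(X[:, [j for j in keep if j != d]], y), d) for d in keep]
        score, drop = min(scores)
        if score >= best:                     # no removal helps prediction
            break
        best = score
        keep.remove(drop)
    return [names[j] for j in keep], best

# Simulated data: only the first predictor truly affects the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 0.5 - 0.8 * X[:, 0] + rng.normal(scale=0.1, size=200)

selected, score = backward_select(X, y, ["uv", "temperature", "humidity"])
print(selected, round(score, 4))
```

Because the score is computed on held-out observations, a predictor is retained only if it improves prediction of data the fit has not seen, which is the rationale the text gives for cross-validation when the objective is prediction.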

Once we found the best suite of predictors (excluding the quadratic temperature term, the UV-temperature interaction, absolute humidity, and population density), we explored whether using [...] for UV was negative. In all cases, the 95% intervals for relative humidity and population density overlapped zero, but the medians were always negative and positive, respectively. The quadratic temperature term never improved the model, indicating that there was no support for a unimodal response to temperature.

Sensitivity to a number of data preparation steps was assessed. During data preparation, we considered the 2, 3, and 4 worst weeks (highest lambda) following t0, as well as the first 2, 3, and [...] (rather than the worst) 3 weeks following t0 to accumulate data as early as possible and thus reflect decisions made in earlier studies, and (2) used polity (rather than country) effects because the data in early February were predominantly from China and thus country effects could not be [...]

[...] 14, we expect that infections reported in the next two weeks were initiated before the policy began. Hence, we predict the underlying contribution of weather to future COVID-19 growth. [...]

Currently, governments are deciding when and how to relax control measures, often under the assumption that weather will lessen the potential for spread in the upcoming months. Thus, although we do not presume to predict the actual future growth rate of COVID-19, we do hope to capture the potential maximum growth rate in order to inform the relative risks of alternative control strategies.

To make future projections, we obtained monthly mean temperature and relative humidity weather data from 2015-2019 from the same data source as above, under the assumption that these recent years are representative of what to expect in the coming months. Notably hotter or cloudier (lower UV) days in the coming months would suggest higher growth rates than we predict. UV data were not available in a monthly aggregation, so we obtained the 3-hourly data and aggregated them to monthly values. Human population was assumed to remain constant. We projected the models without random effects (or equivalently at the mean value of 0) as we were reluctant to assume that country-level policies, reporting, or health care potential will remain the same in the future. We expect that different country-level effects will dominate in the future, but predicting these offsets is beyond the scope of this study.
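A projection at a random effect of zero, summarized as the probability of exceeding the weekly doubling rate, might be computed along the following lines. This is a sketch under stated assumptions: a log link, Gaussian observation noise, and posterior draws supplied as arrays. All parameter values below are invented for illustration and are not the study's estimates.

```python
import numpy as np

def exceedance_probability(alpha, beta, x, sigma, threshold, rng):
    """Share of posterior predictive draws above a growth-rate threshold.

    alpha: (S,) intercept draws; beta: (S, P) coefficient draws;
    x: (P,) standardized predictors for one place and month; sigma: (S,) noise draws.
    """
    mu = np.exp(alpha + beta @ x)        # log link, country effect fixed at 0
    draws = rng.normal(mu, sigma)        # one predictive draw per posterior sample
    return float(np.mean(draws > threshold))

rng = np.random.default_rng(1)
S = 1000
alpha = np.full(S, np.log(0.1))          # hypothetical baseline growth of 0.1 per day
beta = np.zeros((S, 2))
beta[:, 0] = -0.5                        # hypothetical negative UV effect
sigma = np.full(S, 0.01)
threshold = np.log(2) / 7                # weekly doubling rate

p_average_uv = exceedance_probability(alpha, beta, np.array([0.0, 0.0]), sigma, threshold, rng)
p_high_uv = exceedance_probability(alpha, beta, np.array([1.0, 0.0]), sigma, threshold, rng)
print(p_average_uv, p_high_uv)
```

Working with the full set of draws rather than a single point estimate is what allows a projection to be reported as a probability of exceeding a threshold rather than as one growth rate.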

As with any predictive study, we seek to use the best available data and understanding of mechanisms to develop possible projections that make clear underlying decisions and uncertainty. Ultimately, such predictions must be treated with appropriate caution given the limited understanding of the SARS-CoV-2 virus, human resistance, and its transmission dynamics at this time. Thus, while we seek to inform decisions, those decisions must also recognize the inherent uncertainty in any predictive model, but especially in the context of limited information.

Future data will ultimately be the arbiter of these predictions, and thus good predictive modeling will require repeated bouts of model validation, revision, and re-projection as we learn more about this virus.

In particular, we await mechanistic information on viral physiology and human resistance to move beyond the correlative approach taken here by necessity. Mechanistic models apply insights about an organism's intrinsic biology using parameters often collected from careful experimental manipulations. However, in the absence of this information, correlative models can predict near-term dynamics with accuracy (15). Bayesian approaches like ours can integrate both mechanistic and correlative knowledge as these pieces of information become available.

Our model does not account for human behavior and control measures. By modeling maximum growth rate and using a threshold number of cases, we restrict our analyses to the period during which the disease expanded quickly, following the beginning of community transmission but before major control measures were implemented. For instance, most countries began implementing national control measures in mid-March, which would influence infections recorded into early April, based on a 14-day window for symptoms to emerge. Hence, we chose to limit our data set to records before April 7. However, we note that following early April, growth rates are expected to be much lower due to control measures, and these measures will continue to be important for reducing growth rates below the potential values we predict here, which do not account for control.

We used available insights about SARS-CoV-2, related viruses, and observations of COVID-19 dynamics to select a list of factors that likely influence it. Although we purposefully limited these variables to reflect our best knowledge and to avoid overfitting, other climate and epidemiological factors are likely missing from the model. Future studies should consider embedding these climate insights into epidemiological models that include human demography, immunity, movement, behaviors, medical capacity, and control efforts (4).

Figure S1. Posterior predicted probabilities of growth rate reflect weak trends with environment and high uncertainty in predictions.