Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures

Significance The ongoing pandemic of COVID-19 challenges globalized societies. Scientific and technological cross-fertilization yields broad availability of georeferenced epidemiological data and of modeling tools that aid decisions on emergency management. To this end, spatially explicit models of the COVID-19 epidemic that include e.g. regional individual mobilities, the progression of social distancing, and local capacity of medical infrastructure provide significant information. Data-tailored spatial resolutions that model the disease spread geography can include details of interventions at the proper geographical scale. Based on them, it is possible to quantify the effect of local containment measures (like diachronic spatial maps of averted hospitalizations) and the assessment of the spatial and temporal planning of the needs of emergency measures and medical infrastructure as a major contingency planning aid.

S ince December 2019, a cluster of pneumonia cases in the city of Wuhan, China (1-7), has developed into a pandemic wave currently ravaging several countries (8)(9)(10)(11)(12). The pathogen causing the acute pneumonia among affected individuals is the new coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (8,9,13,14). As of March 25, 2020, a total of 467, 593 cases of coronavirus disease 2019 (COVID-19) have been confirmed worldwide in 181 countries (15). In Italy, a hotspot of the pandemic, the count, as of March 25, 2020, refers to 74, 386 total confirmed cases and 7, 503 deaths (15)(16)(17)(18) (Figs. 1 and 2). The well-monitored progress of the wave of infections highlighted in Fig. 1 (for complete documentation, see SI Appendix and Movies S1 and S2) clearly speaks of decisive spatial effects. Models are often used to infer key processes or evaluate strategies for mitigating influenza/SARS pandemics (5,6,12,(19)(20)(21)(22)(23)(24). Early attempts to model the spread of COVID-19 in Italy (25,26) aired concern regarding the Italian national health system's capacity to respond to the needs of patients (27), even considering aggregate isolation measures. However, mod-eling predictions therein disregard the observed spatial nature of the progress of the wave of infections, and can treat only indirectly the effects of containment measures. Critically, therefore, to deal with what could happen next in terms of forthcoming policy decisions, one needs to deal with spatially explicit models (12,28,29).
We model in space and time the countrywide spread of the COVID-19 epidemic in Italy (Materials and Methods), for which detailed epidemiological data are continuously updated and made public (16,18,30). Data are only a proxy of the actual epidemiological conditions because 1) the number of infected people on record depends on the sampling effort, namely, the number of specimen collections (swabs) from persons under investigation (PUIs) (implications discussed in Materials and Methods, and SI Appendix); and 2) the effects of systematic errors or bias in the official data result mainly in underreporting and need to be considered. In fact, underreporting may apply even to fatality counts, yet to a lesser extent with respect to reported infections. Hospitalizations are known, but may underestimate the actual situation because cases with mild symptoms (termed

Significance
The ongoing pandemic of COVID-19 challenges globalized societies. Scientific and technological cross-fertilization yields broad availability of georeferenced epidemiological data and of modeling tools that aid decisions on emergency management. To this end, spatially explicit models of the COVID-19 epidemic that include e.g. regional individual mobilities, the progression of social distancing, and local capacity of medical infrastructure provide significant information. Data-tailored spatial resolutions that model the disease spread geography can include details of interventions at the proper geographical scale. Based on them, it is possible to quantify the effect of local containment measures (like diachronic spatial maps of averted hospitalizations) and the assessment of the spatial and temporal planning of the needs of emergency measures and medical infrastructure as a major contingency planning aid. asymptomatics in the model) are not hospitalized, for example, due to saturation of the carrying capacity of the sanitary structures. For these reasons, we believe that these major sources of uncertainty could be partially offset by estimating the model parameters by using only reported data on hospitalizations, fatality rates, and recovered individuals, without considering the statistics on reported infections. We concentrate on estimating the effects of severe progressive restrictions posed to human mobility and human-to-human contacts in Italy (Materials and Methods; see also timeline in Fig. 2).
Our quantitative tools (31)(32)(33)(34)(35)(36) are Markov chain Monte Carlo (MCMC) parameter estimation (Materials and Methods) and the extended use of a metacommunity Susceptible-Exposed-Infected-Recovered (SEIR)-like disease transmission model (Materials and Methods) that includes a network of 107 nodes representative of closely monitored Italian provinces and metropolitan areas (second administrative level). We use all publicly available epidemiological data, detailed information about human mobility among the nodes (i.e., fluxes and connections; Materials and Methods), and updates on containment measures and their effects by relying also on mobile phone tracking (37). Their effective implementation is generally a matter of concern (38). As explained in Materials and Methods, the compartments of the model are susceptibles (S ), exposed (E ), presymptom (P ), symptomatic infectious (I ), and asymptomatic infectious (A) (core SEPIA model) (Materials and Methods). The results of parameter estimation allow us to analyze the relative importance of containment measures and of the various epidemiological compartments and their process parameters, which were also discussed in the context of spatially implicit models, for example, in refs. 3-6, 13, 14, 25, 26, and 39. This is true, in particular, for the critical compartments of asymptomatic (5,6,9,28) and of presymptom infectious individuals (see below). As the model is spatially explicit, we implement a generalized reproduction number, that is, the spectral radius of a next-generation matrix (NGM) (35,36,40,41), that measures the potential spread in the absence of containment interventions (Materials and Methods). We also calculate the dominant eigenvalue (and the corresponding eigenvector) of a suitable Jacobian matrix that provides an estimate of the exponential rate of case increase within a disease-free population, and the related asymptotic geographic distribution of the infectious (35,36). In case of time-varying parameters, significant technical complications would arise [e.g., computing Floquet (42) or Lyapunov exponents (43)]. Numerical simulation then supplies directly the desired scenarios in the presence of time-varying containment measures.
A critical issue concerns the description of human mobility that determines exposures and thus, ultimately, the extent of the contagion (28). Although the dense social contact networks characteristic of urban areas may be seen as the fabric for disease propagation, calling for specific treatment of "syn-thetic populations" (44,45), here, because of 1) the large number of cases involved, 2) the countrywide scale of the domain, and 3) the scope of the study aimed at broad large-scale effects of emergency management, we choose to represent node-to-node fluxes from data neglecting demographic stochasticity (but see refs. 14 and 29) and social contact details. Stochasticity is considered through locally estimated seeding of cases surrogating randomness in mobility, which had been considered earlier in the framework of branching processes (14). Coupling this information with the epidemiological data allows us to estimate the effects of enforced or hypothesized containment measures in terms of averted hospitalizations. This yields scenarios on what course the disease might have taken if different measures had been implemented.
Results R0 = 3.60 (95% CI: 3.49 to 3.84) is the estimate of the initial generalized reproduction number, which includes mobility and the spatial distribution of communities (Materials and Methods). The full set of estimated parameters is reported in Table 2, while the comparisons between model simulations and data are shown in Fig. 3 for five representative regions and the whole of Italy (the remaining regions are reported in SI Appendix, Fig. S12). An animation showing the comparison between the simulated and reported spatiotemporal evolution of the outbreak is reported as Movie S2.
As noted in Materials and Methods, a spatially explicit generation matrix KL describes the contributions of presymptom infectious, infectious people with severe symptoms, and infectious      people with no/mild symptoms to the production of new infections close to the disease-free equilibrium. A graph representation of the spatial NGM (Materials and Methods) is shown later (see Fig. 5C). Crucially, the dominant eigenvalue (g0 = 0.24 d −1 [95% CI: 0.22 to 0.26]) of the system's Jacobian matrix, evaluated at the disease-free equilibrium, provides an estimate of the initial exponential rate of case increase. The eigenvector corresponding to the leading eigenvalue, which represents the expected spatial distribution of cases in the asymptotic phase of exponential epidemic growth (35,36), is shown in SI Appendix, Fig. S13. The main result emerging therein is that a completely uncontrolled epidemic would have eventually hit mostly the main metropolitan areas. We estimate that containment measures and changes in social behavior and awareness have progressively reduced the transmission by 45% (95% CI: 42 to 49%). The first set of measures resulted in a reduction of the transmission parameter, βP in Table 2, by 18%, while the second set of measures further reduces it by an additional 34%. Fig. 4 reports, for the whole of Italy, three different scenarios in terms of the cumulative number of hospitalizations. We chose to represent only this state variable for clarity, and for the obvious implications on emergency management. The baseline shown in Fig. 4 is the one in which the model has been identified (lower curve and data) by including changes in the spatial human mobility and in collective social behavior, jointly with their timing (Materials and Methods). The other two curves represent "what if" scenarios. The first (scenario A), corresponding to the middle curve in the graph, is the one in which only the first set of containment measures is implemented. The second (scenario B), portrayed by the upper curve, is obtained by excluding all containment measures. The comparison between scenarios allows us to estimate the number of averted cases (excess of hospitalization demand with respect to the baseline), jointly with their spatial distributions (maps of scenarios A and B in Fig. 4). The actual number of averted cases is obtained by the difference of hospitalizations between the baseline and scenario B (no containment measures). We obtain a median of 0.226 · 10 6 averted cases (95% CI: 0.172 · 10 6 to 0.347 · 10 6 ), as of March 25, 2020.
An analogous plot for the total averted infections is shown in SI Appendix, Fig. S14. Therein, one notes that the total infections are calculated by integrating in time the force of the infection, that is, the sum over all 107 nodes i of the flux (λi Si (t); see Materials and Methods) leaving the susceptibles compartment. The number of averted cases is computed as discussed for the results on hospitalizations in Fig. 4. The median number of averted infections due to the implementation of all restriction measures is 6.49 · 10 6 (95% CI: 4.81 − 10.1 · 10 6 ). Our median estimate of the total number of infections, as of March 25, 2020, is approximately 733, 000 individuals.

Discussion
Globalized societies are challenged by emerging diseases, in many cases, zoonoses (46), often related to climate change (47,48). COVID-19 is a paradigmatic example of zoonosis whose pandemic character is tied to the globalized travel that spread the contagion in a few months (11,12). Scientific and technological advances in a variety of fields provide a broad availability of data and modeling tools that must inform decision-making on emergency management. This exercise intends to contribute to this cross-fertilization.
Here, we have developed and implemented a spatial framework for the ongoing COVID-19 emergency in Italy, which is characterized by evident spatial signatures (SI Movies S1 and S2 clearly show the radiation of the epidemic along highways and transportation infrastructures). Our analysis of the contributions of different compartments points to the important role played by presymptom infectious in the disease spread and growth ( Table 2). The estimated high presymptomatic transmission parameter βP , with respect to the transmission rates of symptomatic and asymptomatic infectious βI ,A, reproduces field epidemiological evidence (49) and provides support for explicitly accounting for the presymptomatic compartment in the SEPIA model. This result may have profound implications for containment measures [possibly even centralized quarantines (50)], because it may suggest the need for a massive swab testing to identify and isolate presymptomatic infectious cases (51). This underpins that greatly improved contact tracing has the potential to stop the spread of the epidemic if reliably used on sufficiently large numbers (52).
The lockdown introduced in Italy by the second set of measures was far more stringent than the first. As a consequence, noted in Results, the transmission rates have been progressively and significantly reduced. The different age of the measures (current time minus its onset) has therefore produced different effects. This needs to be accounted for, to properly judge their effectiveness. At first sight, in fact, the effects of the second set of measures taken in March could erroneously appear less important than in reality (A in Fig. 4). Obviously, the effects of the second set of measures will fully display their importance after March 25, 2020, the end date for our analysis.
Our study presents a number of simplifications and limitations that, however, do not impair our main conclusions. Specifically, 1) although the human effort involved in the collection of epidemiological data has been major, the granularity of available data is limited in time, spatial resolution, and individual information [for instance, the only published assessment of mobility changes in Italy following lockdown (37) refers to publicly unavailable data; properly anonymized call detail records have been useful in other epidemic and endemic contexts (34,53,54)]; 2) should anonymized individual information from hospitals and laboratories be available, a proper probability distribution of relevant rates and periods (e.g., latency, incubation, infection) could be employed by any modeling approaches (see ref. 55 for estimates based on high data granularity regarding the Lombardy region); and 3) the effect of age structure (56) in terms of differential mobility, social contact patterns, vulnerability, and case fatality ratio [often associated with hyperinflammation in elderly people (57)] would need to be included, therefore relying on higher granularity of data (39). Further developments may also deal with operational predictions based on our modeling framework, once coupled, for example, to ensemble Kalman filtering and updates of parameter estimates and state variables, as already customary in other epidemiological studies (58)(59)(60), and currently employed only in a few studies on COVID-19 (28,61). The spatial nature of the model, in fact, would possibly aid the planning of the agenda for differential mobility restrictions and deployments of local medical supplies and staff tuned to local epidemiological and logistic conditions. We do not attempt, at this stage, to simulate the long-term evolution of the disease dynamics, because it depends on the time evolution of the conditions determining critical epidemiological parameters such as people's behavior and contact rates, further restrictions to mobility, or the discovery of new specific antiviral drugs (62  We propose an estimate of total infections computed from our model (SI Appendix, Fig. S14). We find a significantly larger figure than in the official counts: as of March 25, 2020, we estimate a median of about 600, 000 contagions, whereas the official count of confirmed infections is 74, 386. This result does not confirm earlier, much larger estimates (63). However, the estimation of certain key epidemiological parameters proves remarkably similar in ref. 63 and in this paper, possibly providing an avenue for future convergence.
We conclude that a detailed spatially explicit model of the unfolding COVID-19 spread in Italy, inclusive of the imposed restriction measures, closely reproduces the empirical evidence. This allows us to draw significant indications of the key processes involved in the contagion, together with their time-dependent nature and parameters. When applied by restarting the simulation while removing the restrictive measures, the model shows, unequivocally, that their effects have been decisive. Indeed, the total expected number of averted hospitalizations in Italy, a significant measure of the needs of emergency management (and the less error-prone epidemiological measure), ran on the order of 200, 000 cases up to March 25, 2020, for the whole country, and is known with sufficient spatial granularity. Implications on fatality rates and emergency management are direct, as the capacity of the Italian medical facilitiesalthough continuously expanding-is known at each relevant time. Thus our results bear social and economic significance, because they unquestionably support drastic governmental decisions.  Table 1, together with the appropriate references, while the different approaches are described in more detail in SI Appendix.
Here, we propose and use a model that is elaborated moving from the basic local scheme of ref. 5. By introducing the new compartment of presymptomatic infectious individuals, we account for a peculiar epidemiological state of the disease under study. Empirical evidence (see again Table 1) shows, in fact, that the serial interval of COVID-19 tends to be shorter than the incubation period, thus suggesting that a substantial proportion of secondary transmission can occur prior to illness onset (68). Presymptom transmission appears to play an important role in speeding up the spread of the disease within a community, accounting for around 12.6% of case reports in China (49), 48% in Singapore, and 62% in Tianjin, China (74). The core of our model is thus termed SEPIA and includes the following compartments: Susceptible (S), Exposed (E), Presymptomatic (P), Infected with heavy symptoms (I), Asymptomatic/mildly symptomatic (A), Hospitalized (H), Quarantined at home (Q), Recovered (R), and Dead (D) individuals.
The local dynamics of transmission is given bẏ

[1]
In the model, susceptible individuals (S) become exposed to the viral agent upon contact with infectious individuals, assumed to be those in the presymptomatic, heavily symptomatic, or asymptomatic/mildly symptomatic classes. Although the hypothesis might not hold for some very sparse communities, we assume frequency-dependent contact rates (as most authors do), so that exposure occurs at a rate described by the force of infection, where β P , β I , and β A are the specific transmission rates of the three infectious classes. Exposed individuals (E) are latently infected, that is, still not contagious, until they enter the presymptom stage (at rate δ E ) and only then become infectious. Presymptomatic individuals (P) progress (at rate δ P ) to become symptomatic infectious individuals who develop severe symptoms (with probability σ). Alternatively, they become asymptomatic/mildly symptomatic individuals (with probability 1 − σ). Symptomatic infectious individuals (I) exit their compartment if/when 1) they are isolated from the community (at rate η) because a fraction 1 − ζ of them is hospitalized, while a fraction ζ is quarantined at home, 2) they recover from infection (at rate γ I ), or 3) they die (at rate α I ). Asymptomatic/mildly symptomatic individuals (A), on the other hand, leave their compartment after having recovered from infection (at rate γ A ). Hospitalized individuals (H) may either recover from infection (at rate γ H ) or die because of it (at rate α H ), while home-isolated individuals (Q) leave their compartment upon recovery (at rate γ Q ). People who recover from infection or die because of COVID-19 populate the class of recovered (R) and dead (D) individuals, respectively, independently of their epidemiological compartment of origin. The model is made spatial by coupling n human communities at the suitable resolution via a community-dependent force of infection. It results from local and imported infections due to contacts within the local community or associated with citizens' mobility. More precisely, the force of infection for community i is given by where C X ij (with X ∈ {S, E, P, I, A, R}) is the probability ( n j=1 C X ij = 1 for all i and X) that individuals in epidemiological state X who are from community i enter into contact with individuals who are present at community j as either residents or because they are traveling there from community k (note that i, j, and k may coincide). Details are provided in SI Appendix.
A frequently used indicator is the basic reproduction number, namely, the number R 0 of secondary infections produced by one primary infection in a fully susceptible population. This simple concept works fine in a spatially isolated community, where everything is well mixed at any instant. Instead, if the model parameters are inhomogeneous both in space and in time, the number of secondary infections produced by one primary infection might vary accordingly. Also, R 0 may depend on people's behavior and on the control measures being enforced. When a realistic spatial model is introduced to describe the spread in a country, it is necessary to resort to the definition of generalized reproduction numbers based on the spectral radius of a suitable epidemiological matrix (35,36,40).
If we consider the spatial model described above in the case when no emergency measures are enforced and people's behavior does not change, then the basic reproduction number can be calculated as (see SI Appendix for the detailed derivation) where ρ(K L ) is the spectral radius of the NGM (40) and are three spatially explicit generation matrices describing the contributions of 1) presymptom infectious, 2) infectious with severe symptoms, and 3) infectious with no/mild symptoms, to the production of new infections close to the disease-free equilibrium. The matrices C X = [C X ij ] (X ∈ {S, P, I, A}) are row stochastic (i.e., their rows sum up to one) and represent spatially explicit contact probabilities. Matrix G = NC S ∆ −1 is constructed as follows: N is a diagonal matrix whose nonzero elements are the population sizes N i of the n communities, C S is the contact matrix for susceptibles, and ∆ = diag(uNC S ), with u being a unitary row vector of size n. Matrix K L is a spatially explicit NGM, whose spatial structure describes the main routes of spatial propagation of the epidemic. Also, the dominant eigenvalue (and the corresponding eigenvector) of the system Jacobian matrix, evaluated at the disease-free equilibrium, provides an estimate of the initial exponential rate of case increase, and the related asymptotic geographic distribution of the infectious (35,36).

Data
Available Data and the Course of the Epidemic. Here, we use the data released every day at 6 PM (UTC +1 h) by the Dipartimento della Protezione Civile and archived on GitHub (75). At times, data may be just a proxy of the actual state variables. In particular, the number of infected people (be they exposed, presymptomatic, symptomatic, or asymptomatic) depends on the effort being devoted to finding new positive cases, namely, the number of specimen collections (swabs) from PUIs. The standard methodology employed by the Istituto Superiore di Sanit (ISS) for confirming a suspected case is the one used by the European Centre for Disease Prevention and Control (76). According to the bulletin of the ISS (17), a median time between the beginning of symptoms and the confirmed diagnosis (positive swabs) ranges between 3 d and 4 d. Sometimes, however, people test positive even without displaying symptoms (e.g., they are tested because they were in contact with symptomatic infectious). Therefore, it seems that the number of positive swabs may not provide a reliable indication of the number of exposed, and probably little indication of the number of presymptom individuals. Actually, these data seem to provide an idea about the number of people who are infectious and have developed mild symptoms (isolated at home) or more serious symptoms (hospitalized), but much less about those with very mild symptoms who are not always subjected to a test.
Measures for Mobility Restrictions and Contact Reduction. The detailed sequence of progressive restrictions posed to human mobility and human-to-human contacts in Italy may be summarized as follows: A) On February 18, 2020, a patient (dubbed "patient one" by Italian media outlets) is admitted to the emergency room in Codogno (Lombardy, province of Lodi) for pneumonia. B) On February 21, 2020 (day 1), "patient one" is officially confirmed as a case of COVID- 19  severe limitations to mobility for the whole nation are instituted.
Model Implementation and Parameter Estimation. The model has been implemented at the scale of the second administrative level (mainly provinces and metropolitan areas), which comprises 107 units. Therefore, census mobility fluxes available at the municipal level (7,904 entities) were upscaled to the provincial level (SI Appendix). Matrices CX = [C X ij ] (X ∈ {S , E , P , I , A}) are derived from the mobility data.
We explicitly reproduce in our simulations the effects of the restriction measures described above by 1) restricting access and exit from the red areas (SI Appendix, Figs. S5-S7), starting from February 23, 2020, and 2) reducing the fraction of people traveling outside the resident province according to data collected through mobile applications and presented in ref. 37. To simulate the change in social behavior and the increase in social  N (a, b) being a normal  distribution of average a and SD b, and U (a, b)  distancing, we assume that the transmission parameters βP , βI , and βA had a sharp decrease (within 2 d) after the measures announced on February 24 and March 8, 2020, and we estimate those step reductions (Table 2). It should be noted that the reduction in the transmission parameters is due not only to the implementation of restriction measures (e.g., school and office closures) but also to the increased awareness of the population, especially after the first cases were reported. Model parameters are estimated in a Bayesian framework by sampling the posterior parameter distribution via the DREAMzs (77) implementation of the MCMC algorithm. As testing effort and quarantine policy vary across different Italian regions, we prefer to focus on more reliable variables like the number of hospitalized people, deaths, and patients discharged from the hospital. Specifically, we define the likelihood based on daily numbers of hospitalized cases (flux ηI ), discharged from hospital (γH H ), and recorded deaths (αH H ) at the province level. To account for possible overdispersion of the data, we assume that each data point follows a negative binomial distribution (78,79) with mean µ, equal to the value predicted by the model, and variance equal to ωµ (NB1 parametrization). We estimated the parameter ω.
To account for the temporal evolution of the epidemics prior to the first detected patient, we impose an initial condition of one exposed individual in the province of Lodi (where the first cases emerged) ∆t0 days before February 24, 2020, and we estimate this parameter. During this period, the disease was likely seeded into other provinces via either human mobility or importation of cases from abroad. The process during this period was likely characterized by high demographic stochasticity due to the low number of involved individuals, and thus it can hardly be captured by our deterministic modeling of average mobility and disease transmission. Moreover, long-distance travels and importation of cases are not accounted for in the data used to represent human mobility, which mostly reflect commuting fluxes for work and study purposes. Therefore, to include this possible seeding effect, we estimated also the initial condition in each province. Specifically, this is done by seeding a small fraction of exposed individuals at the beginning of the simulation.
The list of estimated parameters is reported in Table 2. The parameter βP is expressed as a function of the local reproduc-tion number R0 (SI Appendix). βP 1 and βP 2 represent the values of the parameter βP after the measures introduced on February 22 and on March 8, 2020, respectively. The fraction of symptomatic infected being quarantined, ζ, is assumed to be equal to 0.4, that is, the average value for Italy during the observed period (17). During preliminary tests, we found a correlation between the asymptomatic fraction (1 − σ) and the asymptomatic transmission rate βA. Indeed, in the early phase of an epidemic, when the depletion of susceptible is not significant, it is difficult to estimate the role or asymptomatics. We therefore fixed σ to a reasonable value (σ = 0.25; see, e.g., ref. 80) and estimated βA. The parameter rX represents the fraction of total personal contacts that individuals belonging to the X compartment have in the destination community (SI Appendix). We assume rS = 0.5 (i.e., each individual has, on average, half of the contacts in the place of work or study) and that rE = rP = rA = rR = rS , while rI = rQ = rH = 0 (no extra province mobility of symptomatic infected, quarantined, and hospitalized individuals). Further assumptions aimed at reducing the number of parameters to be estimated are γQ = γI = γH , γA = 2γI , and αH = αI . We use information summarized in Table 1 to define prior distributions of key timescale parameters ( Table 2). Moreover, the viral load of symptomatic cases is reportedly similar to that of the asymptomatic (81). We use such information to define the prior of the ratio βI /βA. Data Availability. All data used in this manuscript are publicly available. COVID-19 epidemiological data for Italy are available at https://github.com/pcm-dpc/COVID-19. Mobility data at municipality scale are available at https://www.istat.it/it/archivio/ 139381. Population census data are available at http://dati.istat. it/Index.aspx?QueryId=18460.