Nonlinear bias toward complex contagion in uncertain transmission settings

Edited by Alan Hastings, University of California, Davis, CA; received July 18, 2023; accepted November 24, 2023
December 28, 2023
121 (1) e2312202121

Significance

Contagion dynamics are usually separated into two classes: simple contagion, used notably to describe the spread of infectious diseases, and complex contagion, mainly used to model the spread of certain social phenomena. A distinguishing feature of simple contagion is that the rate of infection of an individual is proportional to the number of exposures—being in contact with two infectious individuals doubles the rate. Complex contagions on the other hand allow different nonlinear rates of adoption. In this paper, however, we show that an imperfect knowledge of the transmission settings, like how fast a disease spreads in different environments, blurs the line between the two by introducing a systematic nonlinear bias in the rate of adoption.

Abstract

Current epidemics in the biological and social domains are challenging the standard assumptions of mathematical contagion models. Chief among them are the complex patterns of transmission caused by heterogeneous group sizes and infection risk varying by orders of magnitude in different settings, like indoor versus outdoor gatherings in the COVID-19 pandemic or different moderation practices in social media communities. However, quantifying these heterogeneous levels of risk is difficult, and most models typically ignore them. Here, we include these features in an epidemic model on weighted hypergraphs to capture group-specific transmission rates. We study analytically the consequences of ignoring the heterogeneous transmissibility and find an induced superlinear infection rate during the emergence of a new outbreak, even though the underlying mechanism is a simple, linear contagion. The dynamics produced at the individual and group levels are therefore more similar to complex, nonlinear contagions, thus blurring the line between simple and complex contagions in realistic settings. We support this claim by introducing a Bayesian inference framework to quantify the nonlinearity of contagion processes. We show that simple contagions on real weighted hypergraphs are systematically biased toward the superlinear regime if the heterogeneity of the weights is ignored, greatly increasing the risk of erroneous classification as complex contagions. Our results provide an important cautionary tale for the challenging task of inferring transmission mechanisms from incidence data. Yet, it also paves the way for effective models that capture complex features of epidemics through nonlinear infection rates.
Models of epidemics on networks allow us to account for the complex contact structures found within human populations (1), albeit at the price of having to make significant simplifying assumptions. Most commonly, we assume that there is a linear relationship between exposure and the rate of infection and that the slope of this relationship is accurately captured by an average transmission rate (2). The ongoing COVID-19 pandemic challenges these assumptions: the risk of transmission has been shown to vary 20-fold or more between indoor and outdoor settings (3), and simple differences in indoor ventilation can also greatly affect it (4). Similarly, the environmental media will affect the different modes of transmission of influenza A viruses (5), leading to variable risks of infection. Therefore, there is no single “transmission rate” for diseases like influenza or COVID-19 since it varies across spaces and activities simply due to the physical settings of the interactions.
Heterogeneous transmission is not unique to respiratory diseases. The context of contacts always matters for biological pathogens, famously so for sexually transmitted infections (6, 7), and is perhaps even more relevant for the study of social contagions (8). For instance, individuals might behave or express themselves in different ways in different groups (9). One well-studied example is that of positive feedback between affective sharing and similarity attraction among group members (10), where individuals might share more with people they find similar to themselves. Indeed, recent models now attempt to include the impacts of such context-dependent behavior on epidemic dynamics (11).
One critical lesson from the study of complex systems is that a quantity that varies across orders of magnitude is unlikely to be well described and captured by its mean as many models often assume. Despite evidence that context-specific transmission is a key feature of contagions of all sorts, there is currently no general approach to model these dynamics.
Modeling heterogeneous transmission rates in different settings is challenging because they induce important dynamical correlations between agents found in that setting (12). For instance, consider an office building with an inadequate ventilation system; not only is transmission increased around infectious individuals, but we are also more likely to find infectious individuals in this building given that people there work in a setting with bad ventilation. Models thus need to capture two important features: heterogeneity in transmission rates across settings and dynamical correlations between the epidemiological states of individuals found in these different settings.
Accounting for these heterogeneities and correlations in disease spread can be done in multiple ways. We can use stochastic simulations on networks to fully capture the structure of contacts and of the context-dependent transmission rates of infectious diseases. Here, we mainly turn toward recent advances in the modeling of higher-order networks (13) and use weighted hypergraphs in an approximate master equation framework (14) to capture these very same features. We show that these two approaches are equivalent, but the latter allows us to unravel the complex dynamics produced by the heterogeneous transmission.
More precisely, we show that heterogeneous transmission rates across settings can be captured using a superlinear infection rate at the level of groups. For instance, if there are many more infectious individuals in an office building than expected on average, one can infer that the local transmission rate is likely greater than the mean. This leads to a local transmission rate that varies with the number of infectious, and the functional forms can be rich and varied depending on the underlying heterogeneity. We demonstrate how to perform this mapping in the context of both archetypal Susceptible-Infectious-Susceptible (SIS) and Susceptible-Infectious-Recovered (SIR) dynamics.
As we derive these results in the next sections, it is important to keep in mind the potential impacts of this nonlinearity. Notably, using a Bayesian inference framework, we show that data produced by systems with heterogeneous rates lead to a systematic nonlinear bias in the inferred infection rate function if heterogeneity is not taken into account. This is usually interpreted as an indicator of complex contagions (15, 16) or, more recently in the network science community, an indicator of higher-order interactions (13). Therefore, without a careful treatment of heterogeneity in the transmission settings, distinguishing simple and complex contagions becomes impractical. This may come as a surprise since complex contagion with superlinear infection rates typically leads to dramatically different outcomes than linear ones (2, 17–19). However, the nonlinearity induced by heterogeneous rates is only an effective model and, as we demonstrate here, is valid only in a limited time window. In a nutshell, the fact that heterogeneous transmission rates map to a superlinear infection rate is a cautionary tale for mechanistic inference but also an opportunity for practitioners to improve models, forecasts, and interventions. We provide examples to highlight these different aspects in what follows.

Results

Contagion Models.

We consider infinite-size random higher-order networks: Nodes belong to groups of size $n$ and each node has a membership $m$, corresponding to the number of groups in which it participates. The ensemble is characterized by a group size distribution $P(n) \equiv p_n$ and a membership distribution $P(m) \equiv q_m$. Nodes are assigned to groups uniformly at random, and there are no correlations between $m$ and $n$. Also, each group of size $n$ possesses some intrinsic transmission rate variable $\lambda \in [0, \infty)$ drawn from a conditional probability density function $P(\lambda|n) \equiv p_{\lambda|n}$.
Formally, we are describing an ensemble of random weighted hypergraphs, where $\lambda$ is the weight associated with a group [Fig. 1A]. We therefore refer to a specific group type by the pair $(n, \lambda)$, with joint probability density $P(\lambda, n) = P(\lambda|n) P(n) \equiv p_{\lambda,n}$.
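As an illustration of this ensemble, the following minimal sketch samples a finite weighted hypergraph with a single group size and a single membership. The function name, the stub-matching construction, and the parameter values are assumptions made for this example and are not taken from the paper. Because stubs are matched at random, a node can occasionally appear twice in the same group, which becomes rare in large networks.

```python
import random

def sample_weighted_hypergraph(num_nodes, m, n, scale, shape, seed=0):
    """Sample groups of size n in which every node has membership m.

    Returns (groups, weights): groups[g] is a list of node ids and
    weights[g] is the transmission rate lambda drawn for that group.
    """
    if (num_nodes * m) % n != 0:
        raise ValueError("num_nodes * m must be divisible by n")
    rng = random.Random(seed)
    # One stub per (node, membership) pair; shuffling assigns nodes to groups
    # uniformly at random, as in a bipartite configuration model.
    stubs = [node for node in range(num_nodes) for _ in range(m)]
    rng.shuffle(stubs)
    groups = [stubs[k:k + n] for k in range(0, len(stubs), n)]
    # Group weights: a Weibull draw that does not depend on n (an assumption).
    weights = [rng.weibullvariate(scale, shape) for _ in groups]
    return groups, weights

# Hypothetical parameter values, chosen only for illustration.
groups, weights = sample_weighted_hypergraph(
    num_nodes=1000, m=20, n=10, scale=0.01, shape=1.0
)
```

Each entry of `weights` plays the role of the rate λ attached to one group, so the pair (`len(groups[g])`, `weights[g]`) identifies the group type.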
Fig. 1.
Approximate master equations provide an accurate description of contagion on weighted hypergraphs. (A) Example of a weighted hypergraph, resulting in heterogeneous group transmission. (B) and (C) Validation of the theoretical framework against simulations. We use a homogeneous group size $p_n = \delta_{n,10}$ and a homogeneous membership $q_m = \delta_{m,20}$. We use a discretized Weibull distribution with $\nu = 1$, $\mu = 6.5 \times 10^{-3}$ for $p_{\lambda|n}$ (Materials and Methods, Section C), and 500 points evenly spaced on the interval $\lambda \in [10^{-4}, 0.5]$ were chosen to create $\omega$. The solid lines correspond to the numerical integration of the approximate master equations of the complete dynamical system [see Eq. 1 for the SIS model; see Materials and Methods, Section E, for the SIR model]. Circles correspond to median values of 10 stochastic simulations on a large random network containing $2 \times 10^{5}$ groups; the error bars correspond to the 50% prediction interval. Runs where the epidemic did not take off were discarded.
On these higher-order networks, we consider simple contagion processes in which each node is either infectious, susceptible, or recovered. Below, we mainly focus on the Susceptible-Infectious-Susceptible model in which infectious individuals who recover immediately become susceptible again. Equivalent derivations for the Susceptible-Infectious-Removed model are provided in Materials and Methods (Section E).
In a group of type $(n, \lambda)$ with $i \in \{0, \dots, n\}$ infectious nodes, each of the $n - i$ susceptible nodes becomes infectious at a linear rate $\lambda i$. Infectious nodes recover at a rate set to 1 without loss of generality. We denote by $S_m(t)$ the fraction of nodes that are of membership $m$ and are susceptible at time $t$, and denote by $G_{n,i}^{\lambda}(t)$ the fraction of all groups that are of type $(n, \lambda)$ with $i$ infectious nodes at time $t$. The evolution of these quantities is governed by the following system of approximate master equations (2)
$$\frac{dS_m}{dt} = q_m - S_m - m r S_m, \tag{1a}$$
$$\frac{dG_{n,i}^{\lambda}}{dt} = (i+1)\, G_{n,i+1}^{\lambda} - i\, G_{n,i}^{\lambda} + (n-i+1)\{\lambda (i-1) + \rho\}\, G_{n,i-1}^{\lambda} - (n-i)\{\lambda i + \rho\}\, G_{n,i}^{\lambda}. \tag{1b}$$
This system is technically infinite dimensional because of $\lambda$. However, if we assume a discretization such that $\lambda \in \omega$ with $|\omega|$ finite, then the system contains a total of $O(m_{\max} + |\omega|\, n_{\max}^{2})$ equations, where $m_{\max}$ and $n_{\max}$ are the maximal membership and maximal group size, respectively.
In Eq. 1a, the evolution of each $S_m$ is treated in a heterogeneous mean-field fashion (1): The first two terms describe infectious nodes recovering at unit rate (they represent a fraction $I_m \equiv q_m - S_m$ of all nodes), while the third term corresponds to new infections of susceptible nodes that are members of $m$ groups, with $r(t)$ the average rate of infection within each of these groups. In Eq. 1b, the evolution of each $G_{n,i}^{\lambda}$ is described using a master equation characterizing the inflow and outflow of probabilities associated with all possible states (all possible numbers $i$ of infectious nodes) for a group of type $(n, \lambda)$. The first two terms describe infectious nodes recovering at a unit rate, while the last two correspond to new infections. The infection rate due to infectious nodes within the group is treated exactly (i.e., the terms involving $\lambda$), while the contribution of all other groups to which a susceptible node belongs is approximated by the average infection rate $\rho(t)$. We thus call this an approximate master equation system.
The mean-field quantities r(t) and ρ(t) are calculated as
$$r(t) = \frac{\sum_{n,i} \int_0^{\infty} \lambda\, i\, (n-i)\, G_{n,i}^{\lambda}(t)\, d\lambda}{\sum_{n,i} \int_0^{\infty} (n-i)\, G_{n,i}^{\lambda}(t)\, d\lambda}, \tag{2a}$$
$$\rho(t) = r(t) \left[ \frac{\sum_m m (m-1)\, S_m(t)}{\sum_m m\, S_m(t)} \right]. \tag{2b}$$
Note that unless specified otherwise, sums over $m$ ($n$) are over every value such that $q_m > 0$ ($p_n > 0$), and sums over $i$ cover the range $\{0, \dots, n\}$. The estimation of $r(t)$ corresponds to the average rate of infection for a susceptible node in a group, which is calculated by averaging over groups proportionally to their number of susceptible nodes, $(n - i)$. We then estimate $\rho(t)$ by multiplying $r(t)$ with the expected number of other groups a susceptible node in a group belongs to. The membership distribution of a susceptible node in a group is proportional to $m S_m$ (because of the friendship paradox), and the number of other groups is $m - 1$.
The global prevalence (average fraction of infectious nodes) is then measured as
$$I(t) = \sum_m I_m = \sum_m \left[ q_m - S_m(t) \right].$$
In Figs. 1 B and C, we show the accuracy of our framework compared to Monte Carlo simulations, for both the SIS and SIR models.
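To make the structure of Eqs. 1 and 2 concrete, here is a minimal sketch of a numerical integration of the SIS system, restricted to one group size, one membership, and a two-point discretization of the rate. The forward-Euler scheme, the binomial initial condition, the function names, and all parameter values are assumptions chosen for brevity and are not the setup used for Fig. 1.

```python
from math import comb

def rhs(S, G, lams, n, m):
    """Time derivatives of Eqs. 1a-1b for one group size n and one membership m."""
    # Eq. 2a: infection rate felt by a susceptible node, averaged over groups
    num = sum(lam * i * (n - i) * G[k][i]
              for k, lam in enumerate(lams) for i in range(n + 1))
    den = sum((n - i) * G[k][i]
              for k in range(len(lams)) for i in range(n + 1))
    r = num / den
    rho = r * (m - 1)          # Eq. 2b when every node has membership m
    dS = 1.0 - S - m * r * S   # Eq. 1a with q_m = 1
    dG = []
    for k, lam in enumerate(lams):
        row = []
        for i in range(n + 1):
            gain_rec = (i + 1) * G[k][i + 1] if i < n else 0.0
            gain_inf = ((n - i + 1) * (lam * (i - 1) + rho) * G[k][i - 1]
                        if i > 0 else 0.0)
            loss = i * G[k][i] + (n - i) * (lam * i + rho) * G[k][i]
            row.append(gain_rec + gain_inf - loss)
        dG.append(row)
    return dS, dG

def integrate_sis(lams, probs, n, m, eps=1e-3, t_max=20.0, dt=0.005):
    """Forward-Euler integration; returns (prevalence time series, final G)."""
    S = 1.0 - eps
    # Assumed initial state: each group member is infectious with probability eps
    G = [[p * comb(n, i) * eps**i * (1 - eps)**(n - i) for i in range(n + 1)]
         for p in probs]
    prevalence = [1.0 - S]
    for _ in range(int(round(t_max / dt))):
        dS, dG = rhs(S, G, lams, n, m)
        S += dt * dS
        G = [[g + dt * dg for g, dg in zip(Gk, dGk)] for Gk, dGk in zip(G, dG)]
        prevalence.append(1.0 - S)   # I(t) = q_m - S_m(t) with q_m = 1
    return prevalence, G

# Hypothetical two-point rate distribution: 30% of groups are high-risk.
lams, probs = [0.1, 0.6], [0.7, 0.3]
prevalence, G_final = integrate_sis(lams, probs, n=5, m=3)
```

A useful sanity check on any implementation is that the sum of G over i stays equal to the probability of each rate value, since Eq. 1b only moves groups between states of the same type.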
Note that we model contagion on a quenched (static) hypergraph representing the backbone of social interactions, but the formalism allows more flexibility. We could choose other forms for ρ(t), for example to represent dynamically changing random interactions (say, random encounters at the grocery store), more in line with standard mass action models. Therefore, ρ(t) can be seen as a general mean-field term that couples otherwise isolated group interactions. Let us emphasize that other forms of coupling would not change the main results in this paper, which mainly concern the local group dynamics.
A clear limitation of our theoretical framework however is the hypothesis of a randomized structure. As we will show, our results still hold quite well for general higher-order networks, but already one can envision generalization of this work to other formalisms incorporating more structural features. Compartmental approaches taking into account degree-based correlations (20) and individual-based mean-field approaches (1, 21, 22) could potentially fill this gap; it is, however, essential that they not only describe accurately the network but also take into account local dynamical correlations, as in ref. 11, a crucial element for what will unfold.

Characterization of the Effective Transmission Rate.

The system of ODEs given by Eq. 1 is highly resolved and of high dimension; let us call $G_{n,i}^{\lambda}$ the complete partition. While data on group interactions are possible to extract (membership and size), the strength of these interactions, which will dictate the local transmission rate $\lambda$, is much harder to measure. This likely explains why typical models ignore such heterogeneity and use a homogeneous transmission rate $\bar{\lambda}$. While this modeling assumption is standard practice, we show here that it systematically transforms the infection rate into a superlinear function.
To model heterogeneous group transmissibility with a homogeneous rate, we need to average over the transmission rate, without losing the correlation between the state of a group and the underlying local transmission rate. Namely, we focus on the following coarse-grained system
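$$\frac{dS_m}{dt} = q_m - S_m - m r S_m, \tag{3a}$$
$$\frac{dG_{n,i}}{dt} = (i+1)\, G_{n,i+1} - i\, G_{n,i} + (n-i+1)\{\bar{\lambda}_{n,i-1} (i-1) + \rho\}\, G_{n,i-1} - (n-i)\{\bar{\lambda}_{n,i}\, i + \rho\}\, G_{n,i}, \tag{3b}$$
where $G_{n,i} = \int G_{n,i}^{\lambda}\, d\lambda$ is the coarse-grained partition, and where $\bar{\lambda}_{n,i}(t)$ is the effective transmission rate in a group of size $n$ where $i$ nodes are already infectious. Note that the definition of $\rho(t)$ remains the same [Eq. 2b], but that we redefine
$$r(t) = \frac{\sum_{n,i} \bar{\lambda}_{n,i}(t)\, i\, (n-i)\, G_{n,i}(t)}{\sum_{n,i} (n-i)\, G_{n,i}(t)}. \tag{4}$$
There is no approximation involved when passing from Eq. 1b to Eq. 3b. However, the complexity of the complete system is now hidden inside the effective transmission rate
$$\bar{\lambda}_{n,i}(t) \equiv \mathbb{E}[\lambda \mid n, i] = \frac{\int_0^{\infty} \lambda\, G_{n,i}^{\lambda}(t)\, d\lambda}{\int_0^{\infty} G_{n,i}^{\lambda}(t)\, d\lambda}. \tag{5}$$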
The exact description of the temporal evolution of the coarse-grained model [Eq. 3] requires the evaluation of the effective transmission rate in Eq. 5, which depends on the complete partition $G_{n,i}^{\lambda}(t)$. However, in the early stage of an epidemic when only a vanishing fraction of individuals are infectious, the population is essentially healthy, and therefore $G_{n,i}^{\lambda}(t) \propto v_{n,i}^{\lambda} e^{\Lambda t}$, where $v_{n,i}^{\lambda}$ is the leading eigenvector of the Jacobian matrix and $\Lambda$ is its associated eigenvalue (Materials and Methods, Section D). An important thing to notice is that the temporal term $e^{\Lambda t}$ is decoupled from the term depending on $\lambda$, which is just $v_{n,i}^{\lambda}$. Therefore, the effective transmission rate at the beginning of an outbreak simplifies to
$$\bar{\lambda}_{n,i} \approx \frac{\int_0^{\infty} \lambda\, v_{n,i}^{\lambda}\, d\lambda}{\int_0^{\infty} v_{n,i}^{\lambda}\, d\lambda}, \tag{6}$$
which is time invariant. Unless $v_{n,i}^{\lambda}$ is sharply peaked around a single value of $\lambda$, the resulting infection rate $i \bar{\lambda}_{n,i}$ will be a nonlinear function of $i$, the number of infectious nodes in the group.
It is worth underscoring that nonlinear rates or activation functions have been associated with complex contagions for some time (16, 23, 24). More recently, nonlinear infection mechanisms at the level of groups (19) were shown to be an equivalent formulation for simplicial and hypergraph contagion models (20, 22, 25–29), which have been actively studied in the past few years. In these processes, the contagion is transmitted through both pairwise and higher-order interactions involving more than two nodes when all but one are infectious. In the context of simplicial contagion, for instance, the infection rate within a group (associated with a simplex) becomes a combinatorial sum of all active transmission channels. However, this can be transformed into a generic nonlinear function of $i$, the number of infectious nodes in the simplex (see Materials and Methods, Section A for an explicit mapping). In essence, the effective nonlinear infection rate we find by averaging over group transmission leads to a mechanism we would associate with generic complex contagion models, but also with this more recent perspective of higher-order contagion.
Fig. 2A illustrates the temporal evolution of the effective transmission rate for a network with groups of size $n = 10$. We thus focus on the dependence on $i$, i.e., $\bar{\lambda}_{n,i}(t) \equiv \bar{\lambda}_i(t)$. We see that the eigenvector (EV) approximation of Eq. 6 accurately captures the effective transmission rate for a long time at the beginning of an epidemic. Notice here that the effective transmission rate increases approximately linearly with $i$, which results in a superlinear infection rate $i \bar{\lambda}_i$ at the level of groups.
Fig. 2.
Contagion in heterogeneous transmission settings is best described by an effective transmission rate function of the local prevalence in a group. We use the same network as in Fig. 1 B and C. (A) Temporal evolution of the effective transmission rate $\bar{\lambda}_i(t)$. The three solid lines correspond to the exact effective transmission rate [Eq. 5] measured at different times (SIS only; similar behavior was observed for SIR). The dashed line shows the eigenvector (EV) approximation of Eq. 6. (B) and (C) The solid lines correspond to the numerical integration of the complete dynamical system [Eq. 1]. The dashed lines correspond to the numerical integration of the coarse-grained dynamical system [Eq. 3] using different approximations for $\bar{\lambda}_i$. The EV rate uses Eq. 6 and the mean rate assumes a homogeneous rate $\mathbb{E}[\lambda]$ for the groups.
The EV effective transmission rate combined with the coarse-grained dynamical system [Eq. 3b] captures the early phase of an outbreak, as seen in Fig. 2 B and C for the SIS model and the SIR model (Materials and Methods, Section E), as opposed to simply considering a mean effective rate $\bar{\lambda}_i = \mathbb{E}[\lambda]$. However, when a sufficiently large portion of the population has been infected, the EV approximation breaks down [see Fig. 2A, $t = 40$]. At that point, the coarse-grained approximate system predicts a superexponential growth in Fig. 2 B and C, typical of models with a superlinear infection rate (2). This is not a realistic feature of the underlying system, however. Note that we observe a similar behavior with other group structures and rate distributions (SI Appendix).
Let us try to better understand why we obtain this functional form of effective transmission rate in Fig. 2, how this varies with the rate distribution, and how it breaks down when a sufficient number of nodes have been infected. The leading eigenvector does not possess an explicit analytical form in general, but near the critical point, $v_{n,i}^{\lambda}$ is proportional to the stationary distribution $G_{n,i}^{\lambda}$ for $i > 0$ (Materials and Methods, Section D), which on the other hand possesses an explicit analytical form. Therefore, even though it seems counterintuitive since we aim to describe the early phase, the stationary state ($t \to \infty$) of the SIS model provides helpful analytical insights.
Enforcing detailed balance, we find that the stationary state of the complete dynamical system [Eq. 1b] is
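$$G_{n,i}^{\lambda}(\rho) = G_{n,0}^{\lambda} \binom{n}{i} \prod_{j=0}^{i-1} [\lambda j + \rho] \qquad \forall\, i \in \{1, \dots, n\}, \tag{7}$$
with $G_{n,0}^{\lambda} = p_{\lambda,n} - \sum_{i=1}^{n} G_{n,i}^{\lambda}$ (Materials and Methods, Section B). The stationary effective transmission rate is then obtained by inserting Eq. 7 into
$$\bar{\lambda}_{n,i}(\rho) = \frac{\int_0^{\infty} \lambda\, G_{n,i}^{\lambda}(\rho)\, d\lambda}{\int_0^{\infty} G_{n,i}^{\lambda}(\rho)\, d\lambda}. \tag{8}$$
If the system is arbitrarily close to the critical point (akin to the “low temperature” limit in statistical physics), then $G_{n,0}^{\lambda} \to p_{\lambda,n}$ and $\rho \to 0$. In this case, we expand $G_{n,i}^{\lambda} \approx p_{\lambda,n}\, \delta_{i,0} + h_{n,i}^{\lambda}\, \rho$, where
$$h_{n,i}^{\lambda} \equiv \left. \frac{dG_{n,i}^{\lambda}}{d\rho} \right|_{\rho \to 0} = p_{\lambda,n} \binom{n}{i} \lambda^{i-1}\, \Gamma(i), \tag{9}$$
for all $i \in \{1, \dots, n\}$ and $\Gamma(\cdot)$ is the gamma function. Therefore, for $i > 0$, we obtain the following critical effective transmission rate
$$\bar{\lambda}_{n,i}(0) \equiv \lim_{\rho \to 0} \bar{\lambda}_{n,i}(\rho) = \frac{\mathbb{E}[\lambda^{i} \mid n]}{\mathbb{E}[\lambda^{i-1} \mid n]}. \tag{10}$$
The critical effective transmission rate is a ratio of consecutive moments of $p_{\lambda|n}$, and therefore depends on $i$. In fact, unless $p_{\lambda|n} = \delta(\lambda - \lambda_n)$, it will be an increasing function of $i$, meaning that the rate of infection is superlinear.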
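One way to see why the ratio of consecutive moments cannot decrease with the number of infectious nodes is the Cauchy-Schwarz inequality. The short derivation below is a sketch added for illustration, assuming the moments involved are finite and positive; it is not reproduced from the paper.

```latex
% Cauchy-Schwarz applied to \lambda^{i} = \lambda^{(i+1)/2}\,\lambda^{(i-1)/2}
\mathbb{E}[\lambda^{i} \mid n]^{2}
  \le \mathbb{E}[\lambda^{i+1} \mid n]\;\mathbb{E}[\lambda^{i-1} \mid n]
\quad\Longrightarrow\quad
\bar{\lambda}_{n,i+1}(0)
  = \frac{\mathbb{E}[\lambda^{i+1} \mid n]}{\mathbb{E}[\lambda^{i} \mid n]}
  \ge \frac{\mathbb{E}[\lambda^{i} \mid n]}{\mathbb{E}[\lambda^{i-1} \mid n]}
  = \bar{\lambda}_{n,i}(0).
```

Equality requires the two factors to be proportional almost surely, which is the case when all the probability mass sits on a single rate, so any genuine spread in the rates makes the effective rate grow with the number of infectious nodes.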
The effective transmission rate captures the fact that if a group has a large number of infectious members, $i$, this is probably because the underlying transmission rate $\lambda$ is large as well. Let us illustrate this conclusion with a simple example in which $p_{\lambda|n}$ is a bimodal distribution $p_{\lambda|n} = a\, \delta(\lambda - \lambda_1) + (1 - a)\, \delta(\lambda - \lambda_2)$, with $\lambda_2 > \lambda_1$. In this case,
$$\bar{\lambda}_{n,i}(0) = \lambda_1\, \frac{b + \gamma^{i}}{b + \gamma^{i-1}}, \tag{11}$$
where $b \equiv a/(1 - a)$ and $\gamma \equiv \lambda_2/\lambda_1 > 1$. If $\gamma^{i} \ll b$, then $\bar{\lambda}_{n,i}(0) \approx \lambda_1$, whereas $\bar{\lambda}_{n,i}(0) \approx \lambda_2$ if $\gamma^{i} \gg b$. The effective transmission rate is sigmoidal, with a soft threshold value around $\hat{\imath} = \log_{\gamma}(b)$. If $i < \hat{\imath}$, the local rate is probably $\lambda_1$, and if $i > \hat{\imath}$, then the local rate is probably $\lambda_2$. In other words, our framework implicitly infers whether a group with $i$ infectious members most likely possesses a local transmission rate $\lambda_1$ or $\lambda_2$; indeed, another way to interpret Eq. 5 is as the posterior mean for the transmission rate $\lambda$.
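The bimodal case is simple enough to evaluate directly. The sketch below computes the critical effective rate of Eq. 11 and the soft threshold for hypothetical parameter values (a rare class of high-risk groups); the function names and numbers are illustrative assumptions.

```python
from math import log

def critical_rate_bimodal(i, lam1, lam2, a):
    """Eq. 11: effective rate in a group with i >= 1 infectious nodes."""
    b = a / (1.0 - a)
    gamma = lam2 / lam1
    return lam1 * (b + gamma**i) / (b + gamma**(i - 1))

def soft_threshold(lam1, lam2, a):
    """Value of i at which gamma**i equals b, i.e. log_gamma(b)."""
    return log(a / (1.0 - a)) / log(lam2 / lam1)

# Hypothetical values: 0.1% of groups transmit ten times faster than the rest.
lam1, lam2, a = 0.01, 0.1, 0.999
rates = [critical_rate_bimodal(i, lam1, lam2, a) for i in range(1, 11)]
i_hat = soft_threshold(lam1, lam2, a)
```

With these values the threshold sits close to three infectious members: below it the list `rates` stays near `lam1`, and above it the entries approach `lam2`, which is the sigmoidal shape described in the text.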
Models of complex spreading often impose a similar threshold on the adoption rate (or probability), separating a low and high regime of adoption (15, 23, 24). The rationale behind this threshold is that the benefits of adopting the social norm only become significant if a critical mass of individuals has already adopted it. This type of positive feedback mechanism is often called social reinforcement. Here, it emerges as an effective mechanism by averaging over the underlying heterogeneity.
Let us now consider more realistic rate distributions. Since we know that the ratio of consecutive moments in Eq. 10 is mostly affected by the tail of $p_{\lambda|n}$, Fig. 3 illustrates three cases of effective rate derived from distributions with increasingly heavier tails: the Weibull, the lognormal, and the Fréchet distributions. Additionally, since the theory works at all sizes $n$, let us consider a group of moderate size $n = 20$ and focus on the variation of the stationary effective transmission rate $\bar{\lambda}_{n,i}(\rho) \equiv \bar{\lambda}_i(\rho)$ as a function of $i$.
Fig. 3.
Various types of heterogeneity lead to diverse functional forms for the effective transmission rate. We use $n = 20$ and different underlying rate distributions $p_{\lambda|n}$ with $\mu = 10^{-2}$ (Materials and Methods, Section C), which are then used to evaluate the stationary effective rate from Eq. 8. Each curve corresponds to a different value of $\rho \in [10^{-5}, 10^{-3}]$, logarithmically spaced. One way to interpret $\rho$ is as the coupling between the groups: higher values mean infection from external groups is more likely. (A) Weibull rate distribution with $\nu = 1$. For small $\rho$, we observe $\bar{\lambda}_i \sim i^{\nu}$. (B) Lognormal rate distribution with $\nu = 0.15$. For small $\rho$, we observe $\bar{\lambda}_i \sim e^{\nu i}$. (C) Fréchet rate distribution with $\nu = 0.1$ and $\lambda_{\max} = 0.1$. For small $\rho$, $\bar{\lambda}_i$ behaves like a step function, with threshold at $\hat{\imath} = 1/\nu$ (dotted vertical line).
In Fig. 3A, we show that the Weibull distribution yields an effective transmission rate that is approximately power-law for small $\rho$, i.e., $\bar{\lambda}_i(0) \sim i^{\nu}$. This observation explains the linear effective transmission rate in Fig. 2A, where we use $\nu = 1$. The resulting infection rate is also a power law, $i^{\nu+1}$. This type of model has been studied initially at the population level (17) using the mass-action approximation and has been shown to represent the synergistic interaction of supercritical diseases (18). More general power-law activation functions have been used to model language dynamics (30) and can emerge from the combination of temporal heterogeneity and threshold dynamics (2), the cornerstone of most social contagion models.
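The power-law behavior can be checked from the closed-form moments of a Weibull variable. The sketch below uses the standard (scale, shape) parameterization, which is an assumption: the paper's own parameterization in terms of μ and ν is given in Materials and Methods, Section C, and is not reproduced here. Under this assumption the exponent of the effective rate is the inverse of the shape parameter.

```python
from math import gamma, log

def weibull_critical_rate(i, scale, shape):
    """Eq. 10 for a Weibull density: E[lam^i] / E[lam^(i-1)]."""
    # Raw moments of a Weibull variable: E[lam^k] = scale^k * Gamma(1 + k / shape)
    return scale * gamma(1.0 + i / shape) / gamma(1.0 + (i - 1) / shape)

def local_exponent(i, scale, shape):
    """Slope of log(rate) versus log(i), measured between i and i + 1."""
    r0 = weibull_critical_rate(i, scale, shape)
    r1 = weibull_critical_rate(i + 1, scale, shape)
    return log(r1 / r0) / log((i + 1) / i)

scale = 0.01  # hypothetical value, for illustration only
# Shape 1 (an exponential distribution): the rate is exactly scale * i.
linear = [weibull_critical_rate(i, scale, shape=1.0) for i in range(1, 11)]
# Shape 1/2: the rate grows roughly like i squared.
exponent = local_exponent(10, scale, shape=0.5)
```

For shape 1 the effective rate is linear in the number of infectious nodes, so the infection rate is quadratic, in line with the linear trend reported for Fig. 2A.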
In Fig. 3B, we see that the lognormal distribution produces an effective transmission rate that is approximately exponential, i.e., $\bar{\lambda}_i(0) \sim e^{\nu i}$. Although less common as far as we know, similar effective transmission rates can emerge from the synergistic interaction of otherwise subcritical diseases in a population (18).
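For a lognormal rate the exponential form follows directly from the moments. The derivation below is a sketch under the assumption that the logarithm of the rate is normal with mean m and variance σ² (symbols introduced here for illustration); identifying σ² with the paper's ν is an inference from the reported scaling, not a statement from Materials and Methods.

```latex
% Assumption: \ln\lambda \sim \mathcal{N}(m, \sigma^{2}), so that
\mathbb{E}[\lambda^{i}] = \exp\!\left(i m + \tfrac{1}{2} i^{2} \sigma^{2}\right),
\qquad
\bar{\lambda}_{i}(0)
  = \frac{\mathbb{E}[\lambda^{i}]}{\mathbb{E}[\lambda^{i-1}]}
  = \exp\!\left(m - \tfrac{1}{2}\sigma^{2}\right) e^{\sigma^{2} i}.
```

In this parameterization the critical effective rate is exactly exponential in the number of infectious nodes, with growth constant σ².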
In Fig. 3C, we consider the even more heterogeneous Fréchet distribution—which has a power-law tail—and we recover a sigmoidal effective transmission rate, akin to the bimodal case explained above. In this specific case, the distribution is so heterogeneous that our analytical approach effectively infers two parts: groups either belong to the bulk of the rate distribution or to the tail. The soft threshold separating the regimes is now directly related to the exponent of the cumulative distribution function (Materials and Methods, Section C).
While Eqs. 6 and 10 characterize the effective transmission rate in the early stage of an epidemic, the EV approximation eventually breaks down, as seen in Fig. 2. To understand why, we look at the other limit case, $\rho \to \infty$ (akin to the “high temperature” limit in statistical mechanics), which is equivalent to the scenario where almost everybody in the population is infectious, $G_{n,i}^{\lambda} \to p_{\lambda,n}\, \delta_{n,i}$ for all $\lambda$. Thus Eq. 8 becomes
$$\bar{\lambda}_{n,i}(\infty) \equiv \lim_{\rho \to \infty} \bar{\lambda}_{n,i}(\rho) = \mathbb{E}[\lambda \mid n]. \tag{12}$$
In this limit, the number of infectious nodes i does not affect the effective transmission rate. In other words, dynamical correlations do not matter in this limit. Again, we can appeal to the “statistical inference” interpretation of our effective transmission rate: If the rate of infection by external groups (ρ) is very large, it is impossible to gain information about the local transmission rate from the current group state.
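Both limits can be checked numerically from the stationary state. The sketch below evaluates Eq. 8 on a discrete grid of rates using the detailed-balance solution of Eq. 7, working with logarithms to avoid overflow at large coupling. The four-point rate distribution and the function name are hypothetical choices made for this example.

```python
from math import exp, log

def stationary_effective_rate(i, n, rho, lams, probs):
    """Eq. 8 on a discrete grid of rates, using the stationary state of Eq. 7."""
    if rho <= 0.0 or not 0 <= i <= n:
        raise ValueError("need rho > 0 and 0 <= i <= n")
    num = den = 0.0
    for lam, p in zip(lams, probs):
        # log of w_k = C(n, k) * prod_{j<k} (lam * j + rho), with w_0 = 1
        log_w = [0.0]
        for k in range(1, n + 1):
            log_w.append(log_w[-1] + log(lam * (k - 1) + rho)
                         + log((n - k + 1) / k))
        shift = max(log_w)                       # avoids overflow in exp
        norm = sum(exp(lw - shift) for lw in log_w)
        g = p * exp(log_w[i] - shift) / norm     # G_{n,i}^lambda(rho)
        num += lam * g
        den += g
    return num / den

# Hypothetical discrete rate distribution, with mean rate 0.03.
lams = [0.01, 0.02, 0.05, 0.1]
probs = [0.4, 0.3, 0.2, 0.1]
n = 20
low = [stationary_effective_rate(i, n, 1e-9, lams, probs) for i in range(1, 6)]
high = [stationary_effective_rate(i, n, 1e6, lams, probs) for i in (1, 10, 20)]
```

At weak coupling the values in `low` follow the ratio of consecutive moments of Eq. 10 and increase with the number of infectious nodes, while at strong coupling every entry of `high` collapses onto the mean rate, as in Eq. 12.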
As predicted, all cases explored in Fig. 3 have a rate independent of $i$ in the limit of large $\rho$. However, it is worth mentioning that this limit is out of reach for most systems: Eq. 2b shows that $\rho$ is, in general, a finite quantity. This explains why, in Fig. 2A, $\bar{\lambda}_i$ is not independent of $i$ for large $t$: the effective transmission rate instead takes a complicated nonlinear form, in between the low- and high-temperature limits, better represented by intermediate values of $\rho$ in Fig. 3.

Pitfall for Mechanistic Inference and the Identification of Complex Contagions.

Our framework predicts a superlinear rate of infection in the early phase of an outbreak if we coarse-grain or average transmissions over groups, even though the true underlying contagion is linear. This systematic bias has important implications for parameter inference and the identification of complex contagion from time series (15, 31, 32). To complement our theoretical results, we introduce a Bayesian inference framework (Fig. 4) to quantify the nonlinearity of contagion processes.
Fig. 4.
Bayesian inference pipeline to quantify the nonlinearity of a contagion process. As an input, we generate simulations of simple contagion using the same network and contagion parameters as in Fig. 1B. (A) Prevalence of 10 SIS simulations. (B) Joint posterior distribution for the parameters of the model using a superlinear infection rate $\beta i^{\alpha}$, inspired by the form of the effective transmission rate for a Weibull rate distribution. The joint posterior is obtained for the red curve in (A). We use the likelihood for the sequence of states and the time of state transition up to the first time the system reaches a prevalence of $I(t) = 10^{-3}$. We then extract a posterior distribution using a flat prior on the parameters. (C) We marginalize the joint posterior over $\beta$ to obtain the distribution for the exponent of the superlinear infection rate for each simulation in (A).
Let us consider simulations of the SIS model on a network with heterogeneous group transmission as our evidence. We use the full sequence of states $Y = (y_t)_{t \le T}$ in the early phase of the epidemic, where $y_t$ is the vector of the states of all nodes at time $t$ and $T$ is the first time the prevalence reaches a value $I(T) \ge 10^{-3}$ [Fig. 4A]. Ignoring the heterogeneous group transmission, we suppose a nonlinear infection rate of the form $\beta i^{\alpha}$. We infer the parameters $\beta, \alpha$ using the posterior distribution
$$P(\beta, \alpha \mid Y) \propto P(Y \mid \beta, \alpha)\, P(\beta, \alpha). \tag{13}$$
Here, we use a flat prior distribution P(β,α)=const., and the likelihood P(Y|β,α) is evaluated using Eq. 32 in Materials and Methods, Section F.
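The logic of Eq. 13 with a flat prior can be sketched on a parameter grid. The example below does not use the likelihood of Eq. 32; as a stand-in, it assumes a fully observed continuous-time Markov chain summarized by hypothetical sufficient statistics (infection counts and susceptible exposure times at each number of infectious group members). All names and numbers are illustrative and were constructed to be consistent with an exponent of 2.

```python
from math import exp, log

def log_likelihood(beta, alpha, counts, exposure):
    """Log-likelihood of infection events occurring at rate beta * i**alpha.

    counts[i]   : infections observed while i group members were infectious
    exposure[i] : total susceptible time spent with i infectious members
    """
    ll = 0.0
    for i, k in counts.items():
        rate = beta * i**alpha
        ll += k * log(rate) - rate * exposure[i]
    return ll

def marginal_alpha(betas, alphas, counts, exposure):
    """Flat-prior posterior on the grid, marginalized over beta."""
    ll = [[log_likelihood(b, a, counts, exposure) for b in betas]
          for a in alphas]
    top = max(max(row) for row in ll)   # subtract the max before exponentiating
    marg = [sum(exp(v - top) for v in row) for row in ll]
    total = sum(marg)
    return [v / total for v in marg]

# Hypothetical data, built to match beta = 0.01 and alpha = 2.
counts = {1: 40, 2: 24, 3: 18, 4: 8}
exposure = {1: 4000.0, 2: 600.0, 3: 200.0, 4: 50.0}
betas = [0.002 + 0.0005 * k for k in range(60)]
alphas = [0.5 + 0.05 * k for k in range(61)]
posterior = marginal_alpha(betas, alphas, counts, exposure)
alpha_map = alphas[posterior.index(max(posterior))]
```

Summing `posterior` over the grid points with exponent above 1 gives the posterior probability of a superlinear rate, which is the quantity of interest when deciding whether a contagion looks complex.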
We first validate the framework with synthetic networks and a Weibull distribution of group transmission with shape parameter $\nu = 1$. Fig. 4A illustrates the time evolution of the prevalence for each simulation. For the simulation corresponding to the red curve in Fig. 4A, we show the joint posterior distribution in Fig. 4B, which clearly suggests a superlinear rate of infection ($\alpha > 1$). For each simulation, the marginal distribution on the exponent $\alpha$ in Fig. 4C is consistent with our prediction for a Weibull distribution of group transmission rate, i.e., $\alpha \approx \nu + 1$.
Fig. 5 shows the results of the same experiment but on real hypergraphs (Materials and Methods, Section G). The results shown in Fig. 5A were obtained from simulations on a hypergraph constructed from coauthorship data, but with a synthetic Weibull group-transmission distribution with different values of shape parameter $\nu$. The relation $\alpha \approx \nu + 1$ no longer holds due to structural correlations neglected by our approach, which is where other formalisms (29) could provide improvements on our result. Nevertheless, contagions with heterogeneous group transmission remain much more accurately described by a superlinear rate of infection, and increasing the heterogeneity (increasing $\nu$) leads to a larger exponent $\alpha$, as predicted by our theoretical framework. In Fig. 5 B and C, we use weighted hypergraphs constructed from high-school contacts and email exchanges (33–36). The weights of the groups in both datasets are very heterogeneous, making them ideal case studies for our framework (SI Appendix). Again, for all simulations, we obtain a clear signal of superlinear contagion. In SI Appendix, we further validate that our results are robust to a change of functional form for the infection rate.
Fig. 5.
Complex contagions are erroneously inferred from simple contagion processes on real hypergraphs. We use the same procedure as in Fig. 4 assuming a superlinear infection rate βiα. (A) We use a hypergraph constructed from coauthorship data (33), but we impose a Weibull group transmission distribution. We obtain the marginal posterior distribution on α using different values of ν for the Weibull distribution. Each solid line represents a different simulation. (B) We use a hypergraph constructed from high-school contact patterns measured with wearable sensors (33, 34). (C) We use a hypergraph constructed from email exchanges within a large European research institution (33, 35, 36). (B and C) The resulting weighted hypergraphs are aggregated static versions of the original temporal hypergraphs, with the weight of a group proportional to the total number of interactions. Since the hypergraphs have fewer nodes (327 and 1,005), we perform the inference on a sequence of states up to the first time the system reaches a prevalence I(t)=101. See Materials and Methods, Section G for more information on the hypergraphs.
Altogether, Figs. 4 and 5 provide evidence of a dangerous pitfall for those trying to identify complex contagion from time series data. One could easily conclude erroneously that social reinforcement or other mechanisms are important factors influencing an observed contagion process when ignored heterogeneity in transmissibility could explain the apparent nonlinearity. Indeed, we find that real weighted hypergraphs robustly create simple contagion dynamics that look complex once aggregated over groups.

Discussion

We developed an approximate master equation framework to capture the dynamics of contagions whose transmission rates vary arbitrarily across groups or settings. In doing so, we showed that once collapsed on an average rate of transmission, the dynamics of these contagions are mapped to superlinear rates of infection, incidentally blurring the line between simple and complex contagions in realistic settings.
Interestingly, several other mechanisms can produce particular cases of the superlinear infection rates shown here. Interacting contagions can produce nonlinear dynamics that resemble those produced by a simple contagion with a Weibull or a lognormal distribution of transmission rates [figure 1 of ref. 18]. Bursty interaction patterns between individuals and groups have also been shown to lead to a power-law rate of infection (2), akin to what we observe here with a Weibull distribution of transmission rates. Perhaps most importantly, complex contagion mechanisms taking the form of threshold dynamics are used widely to model social contagion (37): Here, we show that they can be reproduced using a bimodal distribution of transmission rates, or a very heterogeneous one.
On the modeling side, the fact that multiple mechanisms can lead to a similar model is not problematic per se. In physics, this is usually celebrated as one is able to claim the universality of the resulting model. However, one distinguishing feature of the superlinear rate of infection induced by heterogeneous group transmission is that it is stable for long periods of time, as shown in Fig. 2, but it eventually breaks. This contrasts with other mechanisms that produce a nonlinear infection rate that is truly time-invariant. Therefore, nonlinear rates of infection are to be used with caution: One could calibrate a particular model early in an emerging outbreak, where it fits, and then obtain dramatically wrong predictions when extrapolating to later times, as seen in Fig. 2 B and C.
Yet, most epidemics are not left unchecked. Close to their epidemic threshold, whether as they emerge or as we seek to eradicate them, superlinear infection rates could be used to construct good effective models. They capture the complex and heterogeneous dynamics of transmission in ways that simple contagion models cannot. Our recommendation, however, would be to i) limit those approaches to short-horizon forecasts and ii) use a short calibration window to continuously update the nonlinear infection rate as more data become available while minimizing the bias coming from older data. This comes as a silver lining as machine learning approaches, which by design create effective models of reality, are becoming an essential tool to provide epidemic forecasts (31, 38).
For mechanistic inference, our framework and the aforementioned studies (2, 18) highlight the inherent difficulties of this task as one needs to control for all other potential causes, be it an unobserved interaction with other dynamical processes, temporal patterns in contact networks, or heterogeneity in the transmission rate across settings. It can lead us to observe complex contagion mechanisms (16) or higher-order group interactions (13), but these are not necessarily intrinsic properties of the process. They may simply reflect a shortcoming of our modeling approach, whose assumptions and dimensionality can influence the shape of the dynamics (39). This can be problematic since many past efforts aim to measure nonlinear effects as evidence of social reinforcement or peer pressure (15, 40).
Consequently, future works should investigate more carefully the feasibility of distinguishing simple and complex contagion in more realistic scenarios, with an imperfect knowledge of the transmission in different settings. Beyond binary classification, efforts have been made to quantify the nonlinearity of contagions from real-world experiments (41). Since heterogeneous group transmission leads to a systematic superlinear bias, we encourage researchers to take this effect into account if relevant to their situation. As we gather evidence about the explanatory power of complex contagions, we must be careful and consider the subtle but important role heterogeneity can play in shaping the rate of infection.

Materials and Methods

Explicit Mapping to Simplicial Contagion.

In the simplicial contagion model (26), a $d$-simplex where all nodes are infectious except one infects the remaining node at rate $\beta_d$, but the node also receives contributions from all lower-dimensional simplices included in the $d$-simplex. In ref. 19, it was shown to be equivalent to having a nonlinear infection rate $\bar{\lambda}_{n,i}\, i$ at the level of groups. Indeed, interpreting a group of size $n=3$ as a simplex of dimension $d=2$, we would decompose the infection rate as
$$\bar{\lambda}_{3,i}\, i = (\beta_2 + 2\beta_1)\,\delta_{i,2} + \beta_1\,\delta_{i,1}.$$
[14]
A similar expression can be obtained for higher-dimensional simplices, but with a more complicated combinatorial expansion.

Stationary State.

The complete system described in Eq. 1b eventually settles to a stationary state in the limit $t \to \infty$. The variables characterizing the stationary state are obtained by solving the following self-consistent expressions
$$S_m = \frac{q_m}{1 + m r},$$
[15a]
$$(i+1)\,G_{n,i+1}^{\lambda} = \{i + (n-i)[\lambda i + \rho]\}\,G_{n,i}^{\lambda} - (n-i+1)[\lambda(i-1) + \rho]\,G_{n,i-1}^{\lambda},$$
[15b]
which are derived from Eq. 1 and where $r$ and $\rho$ are still obtained from Eq. 2.
Eq. 15b can be solved explicitly by noting that $G_{n,i}^{\lambda}$ must satisfy the simpler detailed balance condition. Indeed, all states $i \in \{0, 1, \dots, n\}$ for a group can be placed on a line. At equilibrium, the flow of probability from $i$ to $i+1$ must be equal to the flow of probability in the reverse direction (this can be proved by induction starting from either endpoint, $i=0$ or $i=n$). The detailed balance condition is
$$(n-i)\{\lambda i + \rho\}\,G_{n,i}^{\lambda} = (i+1)\,G_{n,i+1}^{\lambda},$$
[16]
with solution
$$G_{n,i}^{\lambda}(\rho) = G_{n,0}^{\lambda} \binom{n}{i} \prod_{j=0}^{i-1} [\lambda j + \rho] \quad \forall\, i \in \{1, \dots, n\},$$
[17]
with $G_{n,0}^{\lambda} = p_{\lambda,n} - \sum_{i=1}^{n} G_{n,i}^{\lambda}$.
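For the coarse-grained system, we obtain a form very similar to Eq. 17,
$$G_{n,i}(\rho) = G_{n,0} \binom{n}{i} \prod_{j=0}^{i-1} [\bar{\lambda}_{n,j}(\rho)\, j + \rho] \quad \forall\, i \in \{1, \dots, n\},$$
[18]
where $G_{n,0} = p_n - \sum_{i=1}^{n} G_{n,i}$.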
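The detailed-balance solution can be checked numerically against the full stationarity condition. The following is a minimal sketch, assuming a single group size `n`, a single rate `lam`, and a given mean-field value `rho` (in the paper, `rho` is obtained self-consistently from Eq. 2, which is not reproduced here). The function names are illustrative and are not taken from the authors' code.

```python
from math import comb

def stationary_group_distribution(n, lam, rho, p=1.0):
    """Detailed-balance solution (Eq. 17) for groups of size n with rate lam.

    Returns [G_0, ..., G_n], normalized so that the entries sum to p,
    which is equivalent to G_0 = p - sum_{i>=1} G_i.
    """
    weights = [1.0]  # weight of state i relative to G_0
    prod = 1.0
    for i in range(1, n + 1):
        prod *= lam * (i - 1) + rho        # running product over j = 0..i-1
        weights.append(comb(n, i) * prod)
    total = sum(weights)
    return [p * w / total for w in weights]

def balance_residual(G, n, lam, rho):
    """Largest violation of the stationarity condition (Eq. 15b) over all i."""
    def g(i):
        return G[i] if 0 <= i <= n else 0.0
    worst = 0.0
    for i in range(n + 1):
        lhs = (i + 1) * g(i + 1)
        rhs = (i + (n - i) * (lam * i + rho)) * g(i) \
            - (n - i + 1) * (lam * (i - 1) + rho) * g(i - 1)
        worst = max(worst, abs(lhs - rhs))
    return worst

G = stationary_group_distribution(n=5, lam=0.3, rho=0.1, p=0.2)
print(balance_residual(G, n=5, lam=0.3, rho=0.1))
```

The residual should vanish up to floating-point error, which illustrates that a solution of Eq. 16 also solves Eq. 15b.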

Stationary Effective Transmission Rate.

We exemplify three cases of increasingly heterogeneous transmission rate distributions: the Weibull, the lognormal, and the Fréchet distributions. While there are many other distributions we could investigate, the overall qualitative behavior of λ¯n,i should be covered by one of these cases.
To simplify the notation, we use $p_{\lambda|n} = \phi_\lambda$ independent of $n$, which also implies $\bar{\lambda}_{n,i} \equiv \bar{\lambda}_i$. We use two positive real parameters $\mu, \nu$, a scale parameter and a shape parameter, respectively. Larger values of $\nu$ imply a larger variance for the distribution. Since $\mu$ is a scale parameter, we will always have a critical effective transmission rate of the form
$$\bar{\lambda}_i \to \mu f(i;\nu),$$
in the limit $\rho \to 0$ with some function $f(i;\nu)$.
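As a reading aid, the critical effective rates for the Weibull, lognormal, and Fréchet cases can all be read as ratios of consecutive moments of $\phi_\lambda$. The sketch below assumes that the effective rate is the average of $\lambda$ weighted by $G_{n,i}^{\lambda}$ (its definition is given in the main text rather than in this section, so the exact form used here is an assumption) and takes the leading order of Eq. 17 for small $\rho$.

```latex
\begin{align}
G_{n,i}^{\lambda}(\rho) &\simeq \rho\, p_n\, \phi_\lambda \binom{n}{i} (i-1)!\, \lambda^{i-1},
  \qquad \rho \to 0,\ i \geq 1, \\
\bar{\lambda}_i(0) &= \lim_{\rho \to 0}
  \frac{\int_0^\infty \lambda\, G_{n,i}^{\lambda}\, \mathrm{d}\lambda}
       {\int_0^\infty G_{n,i}^{\lambda}\, \mathrm{d}\lambda}
  = \frac{\int_0^\infty \lambda^{i}\, \phi_\lambda\, \mathrm{d}\lambda}
         {\int_0^\infty \lambda^{i-1}\, \phi_\lambda\, \mathrm{d}\lambda}
  = \frac{\mathbb{E}[\lambda^{i}]}{\mathbb{E}[\lambda^{i-1}]}.
\end{align}
```

Under this assumption, $i=1$ gives $\bar{\lambda}_1(0) = \mathbb{E}[\lambda]$, the mean transmission rate, and all factors that do not depend on $\lambda$ cancel in the ratio.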

Weibull distribution.

Let us consider a Weibull distribution of the form
$$\phi_\lambda = \frac{1}{\mu\nu}\left(\frac{\lambda}{\mu}\right)^{1/\nu - 1} \exp\left[-\left(\frac{\lambda}{\mu}\right)^{1/\nu}\right].$$
[19]
The tail of this distribution is driven by the exponential term, which decreases more slowly with $\lambda$ for larger $\nu$.
In the limit $\rho \to 0$, we have
$$\bar{\lambda}_i(0) = \mu\, \frac{\Gamma[\nu i + 1]}{\Gamma[\nu(i-1) + 1]}.$$
This is illustrated in Fig. 3A. For large $i$, this implies
$$\bar{\lambda}_i(0) \sim i^{\nu}.$$
Therefore, Weibull distributed rates lead to a power-law effective transmission rate $\bar{\lambda}_i \propto i^{\nu}$. Note that for $\nu \to 0$, the distribution $\phi_\lambda$ is peaked, and we recover a constant rate.
It is worth mentioning that all distributions with an exponential tail produce similar power-law behavior. The exponential distribution is directly a subcase ($\nu=1$), and it is easy to show that a gamma distribution would also produce an approximately power-law rate of infection.
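The gamma-function ratio and its power-law limit are easy to evaluate numerically. Here is a minimal sketch using only the Python standard library. The function names are illustrative, and the local exponent is simply a log-log slope between $i$ and $2i$, chosen here for convenience.

```python
from math import lgamma, exp, log

def weibull_effective_rate(i, mu, nu):
    """Critical effective rate mu * Gamma(nu*i + 1) / Gamma(nu*(i-1) + 1)."""
    # Work with log-gamma to avoid overflow for large i.
    return mu * exp(lgamma(nu * i + 1.0) - lgamma(nu * (i - 1) + 1.0))

def local_exponent(i, mu, nu):
    """Log-log slope of the rate between i and 2i; approaches nu for large i."""
    ratio = weibull_effective_rate(2 * i, mu, nu) / weibull_effective_rate(i, mu, nu)
    return log(ratio) / log(2.0)

for i in (1, 10, 100, 1000):
    print(i, weibull_effective_rate(i, mu=1.0, nu=0.5), local_exponent(i, 1.0, 0.5))
```

For $\nu = 1$ (exponential distribution) the rate is exactly $\mu i$, and for $\nu = 0$ it reduces to the constant $\mu$, which matches the two limiting cases mentioned in the text.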

Lognormal distribution.

Let us now consider a distribution with a tail that decreases more slowly than the Weibull, the lognormal distribution
$$\phi_\lambda = \frac{1}{\lambda\sqrt{2\pi\nu}} \exp\left[-\frac{(\ln\lambda - \ln\mu - \nu/2)^2}{2\nu}\right].$$
[20]
The tail of this distribution is driven by the exponential term again, but the exponential argument decreases with $(\ln\lambda)^2$, which is overall faster than a power law, but slower than the Weibull.
In the limit $\rho \to 0$, we have
$$\bar{\lambda}_i(0) = \frac{\exp\left[i(\ln\mu + \nu/2) + i^2\nu/2\right]}{\exp\left[(i-1)(\ln\mu + \nu/2) + (i-1)^2\nu/2\right]} = \mu e^{\nu i}.$$
This is illustrated in Fig. 3B. Therefore, lognormal distributed rates lead to an effective transmission rate that increases exponentially, $\bar{\lambda}_i(0) \propto e^{\nu i}$. Note that again for $\nu \to 0$, we recover a peaked distribution and the effective transmission rate is a constant.
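The simplification to $\mu e^{\nu i}$ can be verified with a few lines of arithmetic. This sketch uses the standard lognormal moment formula $\mathbb{E}[\lambda^i] = \exp(i m + i^2 \nu / 2)$ with log-mean $m = \ln\mu + \nu/2$ and log-variance $\nu$, as in Eq. 20. The function names are illustrative.

```python
from math import exp, log

def lognormal_moment(i, mu, nu):
    """i-th moment of a lognormal with log-mean ln(mu) + nu/2 and log-variance nu."""
    m = log(mu) + nu / 2.0
    return exp(i * m + i * i * nu / 2.0)

def lognormal_effective_rate(i, mu, nu):
    """Critical effective rate as a ratio of consecutive moments."""
    return lognormal_moment(i, mu, nu) / lognormal_moment(i - 1, mu, nu)

mu, nu = 2.0, 0.4
for i in range(1, 6):
    # The two printed columns should agree: moment ratio vs. mu * exp(nu * i).
    print(i, lognormal_effective_rate(i, mu, nu), mu * exp(nu * i))
```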

Fréchet distribution.

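Let us now use a Fréchet distribution (also known as inverse Weibull),
$$\phi_\lambda = \frac{1}{\mu\nu}\left(\frac{\lambda}{\mu}\right)^{-1/\nu - 1} \exp\left[-\left(\frac{\lambda}{\mu}\right)^{-1/\nu}\right].$$
[21]
The Fréchet distribution has a power-law tail of the form $\lambda^{-1/\nu - 1}$. This means that the moment of order $i$, $\mathbb{E}[\lambda^i]$, is undefined if $i \geq 1/\nu$. We therefore restrict $0 < \nu < 1$ to have a well-defined average rate $\langle\lambda\rangle$. Let us also introduce a cutoff value $\lambda_{\max} = \mu\epsilon^{-\nu}$, where $\epsilon \ll 1$.
Using the change of variable $x = (\lambda/\mu)^{-1/\nu}$, in the limit $\rho \to 0$ we have
$$\bar{\lambda}_i(0) = \mu\, \frac{\int_\epsilon^\infty x^{-\nu i} e^{-x}\, \mathrm{d}x}{\int_\epsilon^\infty x^{-\nu(i-1)} e^{-x}\, \mathrm{d}x} = \mu\, \frac{\Gamma(1 - \nu i, \epsilon)}{\Gamma[1 - \nu(i-1), \epsilon]},$$
where we recognize a ratio of incomplete gamma functions, whose behavior for $\epsilon \to 0$ depends on $i$ and $\nu$.
If $\nu i < 1$, then the limit $\epsilon \to 0$ is well defined and corresponds to
$$\bar{\lambda}_i(0) = \mu\, \frac{\Gamma(1 - \nu i)}{\Gamma[1 - \nu(i-1)]}.$$
If instead $1 < \nu i < 1 + \nu$, we have
$$\bar{\lambda}_i(0) \approx \frac{\mu\, \epsilon^{1 - \nu i}}{(\nu i - 1)\, \Gamma[1 - \nu(i-1)]},$$
which diverges like $\epsilon^{1 - \nu i}$. Finally, if $\nu(i-1) > 1$, we have
$$\bar{\lambda}_i(0) \approx \frac{\mu\, [\nu(i-1) - 1]}{\epsilon^{\nu}\, (\nu i - 1)},$$
which diverges like $\epsilon^{-\nu}$, and for $\nu i \gg 1$, it is a constant independent of $i$. We have omitted the equality cases $(i-1)\nu = 1$ and $i\nu = 1$, which are only intermediate limit behavior in between the three cases above.
Putting all these cases together, for small but nonzero $\epsilon$, the critical effective transmission rate $\bar{\lambda}_i(0)$ is a sigmoid function of $i$ with a jump around $i = 1/\nu$, as illustrated in Fig. 3C.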
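The three regimes can be explored numerically by evaluating the ratio of upper incomplete gamma functions directly. The sketch below is only illustrative: it relies on the series $\Gamma(a, x) = \Gamma(a) - \sum_{k \geq 0} (-1)^k x^{a+k} / [k!\,(a+k)]$, which is convenient for small $x$ and non-integer $a$ (including negative $a$) but fails when $a$ is zero or a negative integer, that is, exactly at the omitted equality cases. The function names and parameter values are assumptions made for this example.

```python
from math import gamma

def upper_incomplete_gamma(a, x, terms=60):
    """Gamma(a, x) for small x > 0 and a not in {0, -1, -2, ...}."""
    series = 0.0
    coeff = 1.0  # holds (-1)^k / k!
    for k in range(terms):
        series += coeff * x ** (a + k) / (a + k)
        coeff *= -1.0 / (k + 1)
    return gamma(a) - series

def frechet_effective_rate(i, mu, nu, eps):
    """Critical effective rate mu * Gamma(1 - nu*i, eps) / Gamma(1 - nu*(i-1), eps)."""
    num = upper_incomplete_gamma(1.0 - nu * i, eps)
    den = upper_incomplete_gamma(1.0 - nu * (i - 1), eps)
    return mu * num / den

# nu = 0.3 puts the jump near i = 1/nu ~ 3.3; i is kept below 10 to avoid
# the integer pole at 1 - nu*i = -2.
for i in range(1, 9):
    print(i, frechet_effective_rate(i, mu=1.0, nu=0.3, eps=1e-6))
```

With these values, the rate stays of order $\mu$ for $\nu i < 1$ and rises toward the $\epsilon^{-\nu}$ scale once $\nu(i-1) > 1$, in line with the sigmoid shape described above.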

Effective Rate Based on the Leading Eigenvector.

If we want to model accurately the beginning of an epidemic, a good approximation is obtained for the effective transmission rate by considering the leading eigenvector of the Jacobian matrix (the one associated with the eigenvalue of maximal real part) of the dynamical system near the critical point. Let us rewrite Eq. 1 as
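$$\frac{\mathrm{d}S_m}{\mathrm{d}t} = F_m(\boldsymbol{S}, \boldsymbol{G}), \qquad \frac{\mathrm{d}G_{n,i}^{\lambda}}{\mathrm{d}t} = F_{n,i}^{\lambda}(\boldsymbol{S}, \boldsymbol{G}),$$
where $\boldsymbol{S} = [S_1, S_2, \dots, S_{m_{\max}}]$, and $\boldsymbol{G}$ is formally an infinite-dimensional vector where the elements are of the form $G_{n,i}^{\lambda}$. This also means the Jacobian matrix is infinite-dimensional.
We linearize the dynamical system near the state $S_m \to q_m \;\forall\, m$ and $G_{n,i}^{\lambda} \to p_{\lambda,n}\,\delta_{i,0}$. To simplify the notation, all quantities in this section are evaluated at the critical point. First,
$$\frac{\partial F_{n,i}^{\lambda}}{\partial S_m} = 0 \quad \forall\, m, n, i, \lambda.$$
[22]
Indeed, the only term in $F_{n,i}^{\lambda}$ depending on $S_m$ is $\rho$, but since $r=0$ at the critical point, the above expression holds. Therefore, we can ignore the $\boldsymbol{S}$ part of the Jacobian since it does not influence the $\boldsymbol{G}$ part of the Jacobian, which is the important one determining the effective transmission rate.
Second, from Eq. 1b, we can show that
$$\frac{\partial F_{n,i}^{\lambda}}{\partial G_{n',i'}^{\lambda'}} = \delta(\lambda - \lambda')\,\delta_{n,n'}\left[(i+1)\,\delta_{i+1,i'} - i\,\delta_{i,i'} + \lambda(i-1)(n-i+1)\,\delta_{i-1,i'} - \lambda i(n-i)\,\delta_{i,i'}\right] + n\, p_{\lambda,n}\, \frac{\partial \rho}{\partial G_{n',i'}^{\lambda'}} \left[\delta_{i-1,0} - \delta_{i,0}\right].$$
[23]
Let us define $\boldsymbol{v}$ as the $\boldsymbol{G}$ part of the leading eigenvector of the Jacobian matrix. It must therefore respect the eigenvector relation
$$\int_0^\infty \sum_{n',i'} \frac{\partial F_{n,i}^{\lambda}}{\partial G_{n',i'}^{\lambda'}}\, v_{n',i'}^{\lambda'}\, \mathrm{d}\lambda' = \Lambda\, v_{n,i}^{\lambda},$$
[24]
where $\Lambda$ is the associated eigenvalue. Using Eq. 23, we obtain the simplified expression
$$\Lambda\, v_{n,i}^{\lambda} = (i+1)\,v_{n,i+1}^{\lambda} - i\,v_{n,i}^{\lambda} + \lambda(i-1)(n-i+1)\,v_{n,i-1}^{\lambda} - \lambda i(n-i)\,v_{n,i}^{\lambda} + n\, p_{\lambda,n}\, \psi \left[\delta_{i-1,0} - \delta_{i,0}\right],$$
[25]
where
$$\psi \equiv \frac{\langle m(m-1)\rangle}{\langle m\rangle \langle n\rangle} \int_0^\infty \sum_{n,i} \lambda i(n-i)\, v_{n,i}^{\lambda}\, \mathrm{d}\lambda.$$
[26]
Note the similarity with Eq. 15b: At the critical point ($\Lambda = 0$), they exactly match, which means $v_{n,i}^{\lambda} \propto G_{n,i}^{\lambda}$.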
The simplest way to solve Eq. 25 in general is by using a power method. Note that $\Lambda$ might not be the eigenvalue with the largest magnitude. For instance, let us assume there exists an eigenvalue $\Lambda' < 0$ such that $|\Lambda'| > |\Lambda|$. Note that we restrict ourselves to real eigenvalues and eigenvectors by choosing a real starting eigenvector $\boldsymbol{v}^{(0)}$ at random. We then solve for the leading eigenvector by considering the following iteration procedure
$$\boldsymbol{v}^{(j+1)} = \frac{\mathbf{M}\boldsymbol{v}^{(j)}}{\lVert \mathbf{M}\boldsymbol{v}^{(j)} \rVert},$$
[27]
where $\mathbf{M} \equiv \mathbb{1} + \xi \mathbf{J}$, $\mathbb{1}$ is the identity matrix, $\mathbf{J}$ is the Jacobian matrix restricted to the $\boldsymbol{v}$ part, and $\xi$ is a parameter that can be tuned. The matrix $\mathbf{M}$ has the same eigenvectors as $\mathbf{J}$, but its eigenvalues are shifted and rescaled. Therefore, by choosing $\xi$ sufficiently small, we can ensure that the procedure converges on the leading eigenvector.
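The shifted power iteration of Eq. 27 can be illustrated on a small matrix. The sketch below is not the authors' implementation: the actual Jacobian is infinite-dimensional and would need to be discretized, whereas here a hypothetical 2 x 2 matrix is used whose largest-magnitude eigenvalue is negative, so that plain power iteration on `J` would pick the wrong eigenvector.

```python
import random
from math import sqrt

def leading_eigenvector(J, xi=0.01, iterations=5000, seed=0):
    """Power iteration on M = I + xi*J.

    Returns (eigenvalue of J with the largest real part, unit eigenvector),
    assuming real eigenvalues and a sufficiently small xi.
    """
    n = len(J)
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]  # real, random starting vector
    for _ in range(iterations):
        # w = (I + xi*J) v, then normalize
        w = [v[a] + xi * sum(J[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Since v has unit norm and J v = Lambda v at convergence, v . (J v) = Lambda.
    Jv = [sum(J[a][b] * v[b] for b in range(n)) for a in range(n)]
    eigenvalue = sum(v[a] * Jv[a] for a in range(n))
    return eigenvalue, v

# Hypothetical matrix with eigenvalues of about 0.842 and -5.342.
J = [[-5.0, 1.0], [2.0, 0.5]]
print(leading_eigenvector(J))
```

With `xi = 0.01`, the eigenvalues of `M` are about 1.008 and 0.947, so the iteration converges on the eigenvector associated with the eigenvalue of `J` of largest real part even though the other eigenvalue is larger in magnitude.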

SIR Model.

Let us now assume that infectious individuals who recover are removed from the pool of susceptible individuals (for instance, they could be immune to the disease), leading to a Susceptible-Infectious-Removed (SIR) model. To describe this model, we can consider that $n$ is no longer fixed and characterizes the sum of infectious and susceptible nodes in a group, which we name the effective size of a group. Therefore, when an infectious node recovers, the effective size is reduced by one, $n \to n-1$. This requires little change to our approximate master equations for the complete system:
$$\frac{\mathrm{d}I_m}{\mathrm{d}t} = -I_m + m r S_m,$$
[28a]
$$\frac{\mathrm{d}S_m}{\mathrm{d}t} = -m r S_m,$$
[28b]
$$\frac{\mathrm{d}G_{n,i}^{\lambda}}{\mathrm{d}t} = (i+1)\,G_{n+1,i+1}^{\lambda} - i\,G_{n,i}^{\lambda} + (n-i+1)\{\lambda(i-1) + \rho\}\,G_{n,i-1}^{\lambda} - (n-i)\{\lambda i + \rho\}\,G_{n,i}^{\lambda}.$$
[28c]
Indeed, we simply remove the term $q_m - S_m$ in the second equation because there is no positive input of susceptible individuals anymore, and we change the first term in the third one, $(i+1)\,G_{n,i+1}^{\lambda} \mapsto (i+1)\,G_{n+1,i+1}^{\lambda}$, to account for the reduction of the effective size of a group when infectious nodes recover. The fraction of nodes that are infectious and of membership $m$, $I_m$, is no longer $q_m - S_m$ because of the nodes that are removed, so we included it in the system of differential equations. We can calculate the number of removed nodes as
$$R(t) = 1 - I(t) - S(t) = 1 - \sum_m [I_m(t) + S_m(t)].$$
[29]
Similarly, only the first term on the right-hand side changes for the coarse-grained system
$$\frac{\mathrm{d}G_{n,i}}{\mathrm{d}t} = (i+1)\,G_{n+1,i+1} - i\,G_{n,i} + (n-i+1)\{\bar{\lambda}_{n,i-1}(i-1) + \rho\}\,G_{n,i-1} - (n-i)\{\bar{\lambda}_{n,i}\, i + \rho\}\,G_{n,i}.$$
[30]
Since there is no stationary state for the SIR model, we can only rely on the leading eigenvector of the Jacobian matrix to approximate the effective transmission rate $\bar{\lambda}_{n,i}$. The eigenvectors for the complete SIR model respect a very similar self-consistent relationship, namely
$$\Lambda\, v_{n,i}^{\lambda} = (i+1)\,v_{n+1,i+1}^{\lambda} - i\,v_{n,i}^{\lambda} + \lambda(i-1)(n-i+1)\,v_{n,i-1}^{\lambda} - \lambda i(n-i)\,v_{n,i}^{\lambda} + n\, p_{\lambda,n}\, \psi \left[\delta_{i-1,0} - \delta_{i,0}\right],$$
[31]
with the only change being $(i+1)\,v_{n,i+1}^{\lambda} \mapsto (i+1)\,v_{n+1,i+1}^{\lambda}$ on the right-hand side.

Likelihood for Statistical Inference.

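The type of models we consider are continuous-time and time-homogeneous Markov processes. To infer the parameters $(\beta, \alpha)$ of a nonlinear contagion model, we evaluate the likelihood
$$P(Y|\beta,\alpha) = \prod_{j=0}^{M-1} \gamma_{t_j} \exp\left[-\gamma_{t_j}(t_{j+1} - t_j)\right] P(y_{t_{j+1}}|y_{t_j}; \beta, \alpha),$$
[32]
where $M$ is the number of state transitions, $\{t_j\}_{j=1}^{M}$ correspond to the times of these transitions, $\gamma_{t_j} \equiv \gamma_{t_j}(\beta,\alpha)$ is the total rate of transition out of the state $y_{t_j}$, and $P(y_{t_{j+1}}|y_{t_j}; \beta, \alpha)$ gives the probability that the next state after $y_{t_j}$ is $y_{t_{j+1}}$. We compute $\gamma_{t_j}$ by summing the rates of all possible recovery and infection events, namely
$$\gamma_{t_j} = N_I + \sum_{g \in \mathcal{G}} (n_g - i_g)\, \beta\, i_g^{\alpha},$$
[33]
where $N_I$ is the number of infectious nodes, $\mathcal{G}$ is the set of all groups, and $n_g$ ($i_g$) is the size (number of infectious nodes) of group $g$. Assuming $y_{t_j} \to y_{t_{j+1}}$ is a recovery event, then $P(y_{t_{j+1}}|y_{t_j}; \beta, \alpha)$ is simply $\gamma_{t_j}^{-1}$, while if it is an infection event (node $k$ got infected), then
$$P(y_{t_{j+1}}|y_{t_j}; \beta, \alpha) = \gamma_{t_j}^{-1} \sum_{g \in \mathcal{G}_k} \beta\, i_g^{\alpha},$$
[34]
where $\mathcal{G}_k \subseteq \mathcal{G}$ is the subset of groups to which node $k$ belongs.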
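The likelihood of Eqs. 32 to 34 is usually evaluated in log form. The following is a minimal sketch, assuming a hypothetical data layout in which each transition records the holding time `dt`, the number of infectious nodes, the list of `(n_g, i_g)` pairs for all groups before the transition, the kind of event, and, for infections, the indices of the groups containing the newly infected node. This layout is an assumption made for illustration and is not the format of the deposited code.

```python
from math import log

def total_rate(n_infectious, groups, beta, alpha):
    """Eq. 33: recoveries at rate 1 each, plus infections summed over groups."""
    infection = sum((n - i) * beta * i ** alpha for n, i in groups if i > 0)
    return n_infectious + infection

def log_likelihood(transitions, beta, alpha):
    """Logarithm of Eq. 32 for a sequence of recorded transitions."""
    ll = 0.0
    for tr in transitions:
        gamma_j = total_rate(tr["n_infectious"], tr["groups"], beta, alpha)
        # Waiting-time density: gamma * exp(-gamma * dt)
        ll += log(gamma_j) - gamma_j * tr["dt"]
        if tr["kind"] == "recovery":
            # A specific node recovers at rate 1, so P = 1 / gamma
            ll -= log(gamma_j)
        else:
            # Eq. 34: rate at which the specific node k gets infected
            counts = [tr["groups"][g][1] for g in tr["node_groups"]]
            node_rate = sum(beta * i ** alpha for i in counts if i > 0)
            ll += log(node_rate) - log(gamma_j)
    return ll

transitions = [
    {"dt": 0.5, "n_infectious": 1, "groups": [(3, 1)],
     "kind": "infection", "node_groups": [0]},
    {"dt": 0.25, "n_infectious": 2, "groups": [(3, 2)],
     "kind": "recovery", "node_groups": []},
]
print(log_likelihood(transitions, beta=0.5, alpha=2.0))
```

The `i > 0` guard keeps groups without infectious nodes from contributing when `alpha` is zero, since Python evaluates `0 ** 0` as 1. A posterior on $(\beta, \alpha)$ would combine this log-likelihood with a prior, which is left out of the sketch.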

Real Weighted Hypergraphs.

In Fig. 5A, we use coauthorship data from DBLP (33). It consists of a list of publications (groups) and authors (nodes belonging to groups), which naturally takes the form of a hypergraph. The original dataset contains 1,831,127 nodes and 2,954,518 groups; to perform stochastic simulations, we used a subhypergraph obtained from a breadth-first search. We started from a random group and visited all groups at a maximum distance of 3. The resulting subhypergraph contains 116,700 nodes and 136,108 groups.
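A breadth-first extraction of this kind can be sketched as follows. The text does not spell out how the distance between groups is measured. This example assumes that two groups are adjacent when they share at least one node, which is one natural choice but remains an assumption, as are the function name and the input format (a list of node collections).

```python
from collections import deque

def bfs_subhypergraph(groups, start, max_distance=3):
    """Return (indices of groups within max_distance of start, their nodes)."""
    # Index: node -> list of groups containing it
    membership = {}
    for g, nodes in enumerate(groups):
        for node in nodes:
            membership.setdefault(node, []).append(g)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        g = queue.popleft()
        if distance[g] == max_distance:
            continue  # do not expand beyond the maximum distance
        for node in groups[g]:
            for h in membership[node]:
                if h not in distance:
                    distance[h] = distance[g] + 1
                    queue.append(h)

    kept = sorted(distance)
    nodes = set().union(*(groups[g] for g in kept))
    return kept, nodes

# Toy example: a chain of overlapping groups plus one disconnected group.
groups = [{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {7, 8}]
print(bfs_subhypergraph(groups, start=0, max_distance=3))
```

In the toy example, the first four groups are kept, while the fifth (at distance 4) and the disconnected one are dropped.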
In Fig. 5B, we use high-school contact patterns originating from the SocioPatterns research collaboration (34). We use the version available on XGI-DATA (42), processed by ref. 33. Wearable sensors detect pairwise interaction between people at a resolution of 20 s. Maximal cliques of interacting individuals are then promoted to higher-order group interactions. Because these are timestamped group interactions, one could construct a temporal hypergraph. Instead, we associate a weight to each unique group interaction, corresponding to the number of times it appears in the dataset. The result is a weighted hypergraph.
In Fig. 5C, we use email exchanges within a large European research institution (33, 35, 36). We use the version available on XGI-DATA (42). It consists of communication between institution members, and all individuals involved in an email are associated to a group interaction. Again, these are timestamped group interactions, but instead, we associate a weight to each unique group interaction corresponding to the number of times it appears in the dataset, resulting in a weighted hypergraph.
See SI Appendix for more information on the properties of the hypergraphs.
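The aggregation of timestamped group interactions into a static weighted hypergraph, as described for Fig. 5 B and C, amounts to counting how often each unique set of members appears. The sketch below assumes a hypothetical input of `(time, members)` pairs and does not use the XGI API.

```python
from collections import Counter

def aggregate_weighted_hypergraph(timestamped_groups):
    """Map each unique group (as a frozenset of members) to its number of appearances."""
    weights = Counter()
    for _time, members in timestamped_groups:
        # The timestamp is discarded; member order within a group is irrelevant.
        weights[frozenset(members)] += 1
    return dict(weights)

events = [(0, [1, 2, 3]), (20, [3, 2, 1]), (40, [1, 2]), (60, [1, 2, 3])]
print(aggregate_weighted_hypergraph(events))
```

Using `frozenset` as the key makes `[1, 2, 3]` and `[3, 2, 1]` count as the same group, so the weight of a group is its total number of recorded interactions.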

Data, Materials, and Software Availability

Code and network data have been deposited in Zenodo (43). All other data are included in the manuscript and/or SI Appendix.

Acknowledgments

L.H.-D. acknowledges financial support from the NIH 1P20 GM125498-01 Centers of Biomedical Research Excellence Award. A.A. acknowledges financial support from the Sentinelle Nord initiative of the Canada First Research Excellence Fund and from the Natural Sciences and Engineering Research Council of Canada (project 2019-05183). G.S.-O. acknowledges financial support from the Fonds de recherche du Québec - Nature et technologies (project 313475) and support from the Cooperative Agreement no. NU38OT000297 from the Council of State and Territorial Epidemiologists. The findings and conclusions in this study are those of the authors and do not necessarily represent the official position of the funding agencies.

Author contributions

G.S.-O., L.H.-D., and A.A. designed research; G.S.-O. performed research; G.S.-O. contributed new reagents/analytic tools; G.S.-O., L.H.-D., and A.A. analyzed data; and G.S.-O., L.H.-D., and A.A. wrote the paper.

Competing interests

The authors declare no competing interest.

Supporting Information

Appendix 01 (PDF)

References

1
R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Epidemic processes in complex networks. Rev. Mod. Phys. 87, 925–979 (2015).
2
G. St-Onge, H. Sun, A. Allard, L. Hébert-Dufresne, G. Bianconi, Universal nonlinear infection kernel from heterogeneous exposure on higher-order networks. Phys. Rev. Lett. 127, 158301 (2021).
3
J. G. Allen, A. M. Ibrahim, Indoor air changes and potential implications for SARS-CoV-2 transmission. JAMA 325, 2112–2113 (2021).
4
J. M. Robles-Romero, G. Conde-Guillén, J. C. Safont-Montes, F. M. García-Padilla, M. Romero-Martín, Behaviour of aerosols and their role in the transmission of SARS-CoV-2; a scoping review. Rev. Med. Virol. 32, e2297 (2022).
5
T. P. Weber, N. I. Stilianakis, Inactivation of influenza A viruses in the environment and modes of transmission: A critical review. J. Infect. 57, 361–373 (2008).
6
H. W. Hethcote, J. A. Yorke, Gonorrhea Transmission Dynamics and Control (Springer, Heidelberg, 1984).
7
S. T. Leu et al., Sex, synchrony, and skin contact: Integrating multiple behaviors to assess pathogen transmission risk. Behav. Ecol. 31, 651–660 (2020).
8
N. O. Hodas, K. Lerman, The simple rules of social contagion. Sci. Rep. 4, 4343 (2014).
9
A. Pentland, Honest Signals: How They Shape Our World (MIT Press, 2010).
10
F. Walter, H. Bruch, The positive group affect spiral: A dynamic model of the emergence of positive affective similarity in work groups. J. Organ. Behav. 29, 239–261 (2008).
11
G. Burgio, S. Gómez, A. Arenas, Spreading dynamics in networks under context-dependent behavior. Phys. Rev. E 107, 064304 (2023).
12
G. St-Onge, V. Thibeault, A. Allard, L. J. Dubé, L. Hébert-Dufresne, Master equation analysis of mesoscopic localization in contagion dynamics on higher-order networks. Phys. Rev. E 103, 032301 (2021).
13
F. Battiston et al., Networks beyond pairwise interactions: Structure and dynamics. Phys. Rep. 874, 1–92 (2020).
14
L. Hébert-Dufresne, P. A. Noël, V. Marceau, A. Allard, L. J. Dubé, Propagation dynamics on networks featuring complex topologies. Phys. Rev. E 82, 036115 (2010).
15
B. Mønsted, P. Sapieżyński, E. Ferrara, S. Lehmann, Evidence of complex contagion of information in social media: An experiment using Twitter bots. PLoS One 12, e0184148 (2017).
16
S. Lehmann, Y. Y. Ahn, Eds., Complex Spreading Phenomena in Social Systems, Computational Social Sciences (Springer, 2018).
17
Wm. Liu, H. W. Hethcote, S. A. Levin, Dynamical behavior of epidemiological models with nonlinear incidence rates. J. Math. Biol. 25, 359–380 (1987).
18
L. Hébert-Dufresne, S. V. Scarpino, J. G. Young, Macroscopic patterns of interacting contagions are indistinguishable from social reinforcement. Nat. Phys. 16, 426–431 (2020).
19
G. St-Onge et al., Influential groups for seeding and sustaining nonlinear contagion in heterogeneous hypergraphs. Commun. Phys. 5, 25 (2022).
20
N. W. Landry, J. G. Restrepo, The effect of heterogeneity on hypergraph contagion models. Chaos 30, 103117 (2020).
21
G. F. de Arruda, G. Petri, Y. Moreno, Social contagion models on hypergraphs. Phys. Rev. Res. 2, 023032 (2020).
22
J. T. Matamalas, S. Gómez, A. Arenas, Abrupt phase transition of epidemic spreading in simplicial complexes. Phys. Rev. Res. 2, 012049 (2020).
23
M. Granovetter, Threshold models of collective behavior. Am. J. Sociol. 83, 1420–1443 (1978).
24
D. Centola, M. Macy, Complex contagions and the weakness of long ties. Am. J. Sociol. 113, 702–734 (2007).
25
Á. Bodó, G. Y. Katona, P. L. Simon, SIS epidemic propagation on hypergraphs. Bull. Math. Biol. 78, 713–735 (2016).
26
I. Iacopini, G. Petri, A. Barrat, V. Latora, Simplicial models of social contagion. Nat. Commun. 10, 2485 (2019).
27
B. Jhun, M. Jo, B. Kahng, Simplicial SIS model in scale-free uniform hypergraph. J. Stat. Mech. 2019, 123207 (2019).
28
G. F. de Arruda, G. Petri, Y. Moreno, Social contagion models on hypergraphs. Phys. Rev. Res. 2, 023032 (2020).
29
G. Burgio, A. Arenas, S. Gómez, J. T. Matamalas, Network clique cover approximation to analyze complex contagions through group interactions. Commun. Phys. 4, 1–10 (2021).
30
D. M. Abrams, S. H. Strogatz, Modelling the dynamics of language death. Nature 424, 900 (2003).
31
C. Murphy, E. Laurence, A. Allard, Deep learning of contagion dynamics on complex networks. Nat. Commun. 12, 4720 (2021).
32
G. Cencetti, D. A. Contreras, M. Mancastroppa, A. Barrat, Distinguishing simple and complex contagion processes on networks. Phys. Rev. Lett. 130, 247401 (2023).
33
A. R. Benson, R. Abebe, M. T. Schaub, A. Jadbabaie, J. Kleinberg, Simplicial closure and higher-order link prediction. Proc. Natl. Acad. Sci. U.S.A. 115, E11221–E11230 (2018).
34
R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys. PLoS One 10, 1–26 (2015).
35
J. Leskovec, J. Kleinberg, C. Faloutsos, Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1, 2-es (2007).
36
H. Yin, A. R. Benson, J. Leskovec, D. F. Gleich, “Local higher-order graph clustering” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017 (Association for Computing Machinery, New York, NY, USA, 2017), pp. 555–564.
37
P. S. Dodds, D. J. Watts, Universal behavior in a generalized model of contagion. Phys. Rev. Lett. 92, 218701 (2004).
38
B. Klein et al., Forecasting hospital-level COVID-19 admissions using real-time mobility data. Commun. Med. 3, 25 (2023).
39
V. Thibeault, A. Allard, P. Desrosiers, The low-rank hypothesis of complex systems. arXiv [Preprint] (2022). http://arxiv.org/abs/2208.04848.
40
L. Weng, F. Menczer, Y. Y. Ahn, Virality prediction and community structure in social networks. Sci. Rep. 3, 2522 (2013).
41
J. Lee, D. Lazer, C. Riedl, Complex contagion in viral marketing: Causal evidence and embeddedness effects from a country-scale field experiment (Northeastern U. D’Amore-McKim School of Business Research Paper No. 409205, 2022).
42
N. W. Landry et al., XGI: A Python package for higher-order interaction networks. J. Open Source Softw. 8, 5162 (2023).
43
G. St-Onge, gstonge/heterogeneous-transmission. Zenodo. https://doi.org/10.5281/zenodo.7679204. Deposited 26 February 2023.

Information & Authors

Information

Published in

Proceedings of the National Academy of Sciences
Vol. 121 | No. 1
January 2, 2024
PubMed: 38154065

Submission history

Received: July 18, 2023
Accepted: November 24, 2023
Published online: December 28, 2023
Published in issue: January 2, 2024

Keywords

  1. complex contagions
  2. higher-order networks
  3. mechanistic inference

Notes

This article is a PNAS Direct Submission.

Authors

Affiliations

Laboratory for the Modeling of Biological and Socio-Technical Systems, Northeastern University, Boston, MA 02115
Laurent Hébert-Dufresne https://orcid.org/0000-0002-0008-3673
Vermont Complex Systems Center, University of Vermont, Burlington, VT 05401
Department of Computer Science, University of Vermont, Burlington, VT 05401
Département de physique, de génie physique et d’optique, Université Laval, Québec, QC G1V 0A6, Canada
Vermont Complex Systems Center, University of Vermont, Burlington, VT 05401
Département de physique, de génie physique et d’optique, Université Laval, Québec, QC G1V 0A6, Canada
Centre interdisciplinaire en modélisation mathématique, Université Laval, Québec, QC G1V 0A6, Canada

Notes

1
To whom correspondence may be addressed. Email: [email protected].
