New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
 Agricultural Sciences
 Anthropology
 Applied Biological Sciences
 Biochemistry
 Biophysics and Computational Biology
 Cell Biology
 Developmental Biology
 Ecology
 Environmental Sciences
 Evolution
 Genetics
 Immunology and Inflammation
 Medical Sciences
 Microbiology
 Neuroscience
 Pharmacology
 Physiology
 Plant Biology
 Population Biology
 Psychological and Cognitive Sciences
 Sustainability Science
 Systems Biology
Subnets of scalefree networks are not scalefree: Sampling properties of networks

Contributed by Robert M. May, February 11, 2005
Abstract
Most studies of networks have only looked at small subsets of the true network. Here, we discuss the sampling properties of a network's degree distribution under the most parsimonious sampling scheme. Only if the degree distributions of the network and randomly sampled subnets belong to the same family of probability distributions is it possible to extrapolate from subnet data to properties of the global network. We show that this condition is indeed satisfied for some important classes of networks, notably classical random graphs and exponential random graphs. For scalefree degree distributions, however, this is not the case. Thus, inferences about the scalefree nature of a network may have to be treated with some caution. The work presented here has important implications for the analysis of molecular networks as well as for graph theory and the theory of networks in general.
Footnotes

↵ ‡ To whom correspondence should be addressed. Email: m.stumpf{at}imperial.ac.uk.

Author contributions: M.P.H.S., C.W., and R.M.M. designed research, performed research, and wrote the paper.

Abbreviation: PGF, probabilitygenerating function.

↵ ∥ For the subnet, we have the PGF Summing first over k (0 ≤ k ≤ i), and remembering P(0) = 0 (for scalefree networks), we get Note that G*(1) = ΣP(i) = 1, as it should.
↵ ∥ The subsequent sample will, however, contain orphan nodes, given by . If we redefine P*(0) ≡ 0 by discarding such orphans, we have the subnet defined by Eq. 5, where the renormalization constant C is required to compensate for the deletion of the orphan nodes: C[1 – P*(0)] = 1 or

↵ †† The negative binomial distribution has the PGF G(s) = [1 + (m/k)(1 – s)]^{–k}, where m represents the distribution`s mean value and k characterizes the distribution's “clumpiness” (the variance is given by σ^{2}/m ^{2} = 1/m + 1/k). This widely studied distribution includes the Poisson distribution (the degree distribution of classical random graphs) as the special case k → ∞ and the exponential or geometric as k = 1. The subnet PGF is obtained, via Eq. 4, by substituting 1 – p(1 – s) for s in G(s), to get G*(s) = [1 + (mp/k)(1 – s)]^{–k}. Thus, the subnet has an identical PGF to the full distribution, excepting only that the mean is reduced to mp (the clumping parameter k is unaltered). The proof for the binomial distribution is even more trivial.

↵ ‡‡ For Eq. 1 with γ = 2, the PGF of the subnet is , with C given by . Defining u = 1 – p + ps, whence dG*(s)/ds = pdG*/du, we have the degree distribution given by k!P*(k) = p^{k} (d^{k}G*(u)/du^{k} )_{u = 1 – p}. For small p « 1, consider first . This exact result gives P*(1) = –[Cpln(p)]/(1 – p). Further differentiation gives exact, but increasingly complicated, expressions for P*(k > 1). Thus, P*(2) = [Cp/(1 – p)][1 + pln(p)/(1 – p)]. For larger k, P*(k > 2) = [Cp/k(k – 1)][1 + O(p)]. Finally, we can calculate .

↵ §§ By method analogous to those in the previous note, we can obtain, for small p, the analytic results:
 Copyright © 2005, The National Academy of Sciences