Local structure can identify and quantify influential global spreaders in large scale social networks
- aSchool of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China;
- bSchool of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China;
- cKey Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China;
- dComputing Science, Institute of High Performance Computing, Agency for Science, Technology, and Research, Singapore 138632;
- eDepartment of Physics, National University of Singapore, Singapore 117551;
- fCenter for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215;
- gDepartment of Physics, Bar-Ilan University, Ramat-Gan 52900, Israel
See allHide authors and affiliations
Contributed by H. Eugene Stanley, December 31, 2017 (sent for review August 31, 2017; reviewed by Marc Barthelemy and Zoltán Toroczkai)

Significance
Identification and quantification of influential spreaders in social networks are challenging due to the gigantic network sizes and limited availability of the entire structure. Here we show that such difficulty can be overcome by reducing the problem scale to a local one, which is essentially independent of the entire network. This is because in viral spreading the characteristic spreading size does not depend on network structure outside the local environment of the seed spreaders. Our approach may open the door to solve various big data problems such as false information surveillance and control, viral marketing, epidemic control, and network protection.
Abstract
Measuring and optimizing the influence of nodes in big-data online social networks are important for many practical applications, such as the viral marketing and the adoption of new products. As the viral spreading on a social network is a global process, it is commonly believed that measuring the influence of nodes inevitably requires the knowledge of the entire network. Using percolation theory, we show that the spreading process displays a nucleation behavior: Once a piece of information spreads from the seeds to more than a small characteristic number of nodes, it reaches a point of no return and will quickly reach the percolation cluster, regardless of the entire network structure; otherwise the spreading will be contained locally. Thus, we find that, without the knowledge of the entire network, any node’s global influence can be accurately measured using this characteristic number, which is independent of the network size. This motivates an efficient algorithm with constant time complexity on the long-standing problem of best seed spreaders selection, with performance remarkably close to the true optimum.
Footnotes
↵1Y.H., S.J., Y.J., L.F., H.E.S., and S.H. contributed equally to this work.
- ↵2To whom correspondence may be addressed. Email: huyanq{at}mail.sysu.edu.cn or hes{at}bu.edu.
Author contributions: Y.H., L.F., and S.H. designed research; Y.H., S.J., and Y.J. performed research; Y.H., S.J., H.E.S., and S.H. analyzed data; and Y.H., L.F., H.E.S., and S.H. wrote the paper.
Reviewers: M.B., Centre Commissariat à l’Energie Atomique; and Z.T., University of Notre Dame.
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1710547115/-/DCSupplemental.
Published under the PNAS license.
Citation Manager Formats
Article Classifications
- Physical Sciences
- Applied Physical Sciences
- Social Sciences
- Social Sciences