Anatomy of news consumption on Facebook
Edited by Susan T. Fiske, Princeton University, Princeton, NJ, and approved January 31, 2017 (received for review October 14, 2016)

Significance
Social media have heavily changed the way we get informed and shape our opinions. Users’ polarization seems to dominate news consumption on Facebook. Through a massive analysis of 920 news outlets and 376 million users, we explore the anatomy of news consumption on Facebook on a global scale. We show that users tend to confine their attention to a limited set of pages, thus determining a sharp community structure among news outlets. Furthermore, our findings suggest that users have a more cosmopolitan perspective of the information space than news providers. We conclude with a simple model of selective exposure that reproduces the observed connectivity patterns well.
Abstract
The advent of social media and microblogging platforms has radically changed the way we consume information and form opinions. In this paper, we explore the anatomy of the information space on Facebook by characterizing on a global scale the news consumption patterns of 376 million users over a time span of 6 y (January 2010 to December 2015). We find that users tend to focus on a limited set of pages, producing a sharp community structure among news outlets. We also find that the preferences of users and news providers differ. By tracking how Facebook pages “like” each other and examining their geolocation, we find that news providers are more geographically confined than users. We devise a simple model of selective exposure that reproduces the observed connectivity patterns.
A large body of research has addressed news consumption on online social media and its polarizing effect on public opinion (1–5). Social media and microblogging platforms have changed the way we access information and form opinions. Communication has become increasingly personalized, both in the way messages are framed and how they are shared across social networks. Furthermore, according to a recent study (6), ∼63% of users acquire their news from social media, and this news is subject to the same popularity dynamics as other forms of content. Recent works (7) provide empirical evidence of the pivotal role of confirmation bias and selective exposure in online social dynamics. Users, indeed, tend to focus on specific narratives and join polarized groups (i.e., echo chambers) (8–10), where they end up reinforcing their worldview [even if pieces of content are deliberately false (11, 12)] and dismissing contradictory information (13). Discussion and elaboration of narratives in such a segregated environment elicit group polarization and negatively influence user emotion (14–17). Therefore, in this paper, to better understand how echo chambers emerge, we explore the anatomy of news consumption on Facebook. We focus on how Facebook posts from news outlets are consumed and how user activity causes connectivity patterns to emerge. We analyze the interaction of 376 million users with all of the anglophone news outlets on Facebook listed in the European Media Monitor (18) over a 6-y time span, from January 2010 to December 2015. Using quantitative analysis, we find evidence that selective exposure plays a pivotal role in shaping news consumption online. Users tend to focus on a very limited set of pages and thus create a distinct community structure within these news outlets. We also find that the perspectives of the news outlets and the users differ. Our findings suggest that users have a more cosmopolitan perspective of the information space than news providers.
Examining how pages “like” each other and taking into account their geolocation, we find geographically confined connectivity patterns. We conclude by devising a simple model of selective exposure that reproduces the observed connectivity patterns. We first analyze user behavior with respect to the information sources. We then analyze how user activity spans across these sources. Finally, we introduce a simple model that reproduces the observed dynamics. Our findings suggest that probably the main driver of misinformation diffusion is the polarization of users on specific narratives rather than the lack of fact-checked certifications.
Results and Discussion
Users’ Attention.
News items on Facebook appear in posts that can be liked, commented on, or shared by users. A like usually signals positive feedback on a news item. A share indicates a desire to spread a news item to friends. A comment can have multiple features and meanings and can generate collective debate. The likes, shares, and comments on Facebook posts present a heavy-tailed distribution (SI Appendix, 2. Attention Pattern). The lifetime of a post is the time period between the first and the last comment, and it presents a peak at 24 h. User activity is heterogeneous: the number of likes and comments ranges from very few (the majority of users) to hyperactivity. The complementary cumulative distribution function of the number of likes and comments for single users exhibits heavy tails (SI Appendix). The overall number of likes of each user is a good proxy for their engagement with Facebook news pages, and the lifetime of each user can be approximated by the length of time between the date of their first comment and their last comment. These measures could provide important insights about news consumption. Our goal is to quantify the turnover of Facebook news sources by measuring the heterogeneity of user activity, and thus we measure the total number of pages a user interacts with. Fig. 1 shows the number of news sources a user interacts with for the lifetime (i.e., the distance in time between the first and last interaction with a post) and for increasing levels of engagement (i.e., the total number of likes). For a comparative analysis, we standardized between 0 and 1 both lifetime and engagement over the entire user set. Fig. 1 shows the results for the yearly time window (first column) and for the weekly (second column) and monthly (third column) rates. Note that a user usually interacts with a small number of news outlets and that higher levels of activity and longer lifetime correspond to a smaller number of sources.
Users naturally tend to confine their activity to a limited set of pages. According to our findings, news consumption on Facebook is dominated by selective exposure.
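The per-user measures described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the record layout `(user, page, timestamp, kind)`, the function names, and the use of min-max rescaling as the standardization to [0, 1] are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def min_max(values):
    """Rescale a dict of numbers to [0, 1]; a constant input maps to 0.0."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in values.items()}


def attention_stats(interactions):
    """Compute standardized lifetime, standardized engagement, and the
    number of distinct pages for every user.

    interactions: iterable of (user, page, timestamp, kind) tuples,
    where kind is "like" or "comment" (hypothetical layout).
    """
    first, last = {}, {}
    likes = defaultdict(int)
    pages = defaultdict(set)
    for user, page, when, kind in interactions:
        first[user] = min(first.get(user, when), when)
        last[user] = max(last.get(user, when), when)
        pages[user].add(page)
        if kind == "like":
            likes[user] += 1

    # Lifetime: time between a user's first and last interaction.
    lifetime = {u: (last[u] - first[u]).total_seconds() for u in first}
    # Engagement: total number of likes by the user.
    engagement = {u: likes[u] for u in first}
    lt, en = min_max(lifetime), min_max(engagement)
    return {
        u: {"lifetime": lt[u], "engagement": en[u], "n_pages": len(pages[u])}
        for u in first
    }


toy = [
    ("u1", "pageA", datetime(2015, 1, 1), "like"),
    ("u1", "pageA", datetime(2015, 1, 11), "like"),
    ("u1", "pageB", datetime(2015, 1, 6), "comment"),
    ("u2", "pageA", datetime(2015, 1, 1), "like"),
]
stats = attention_stats(toy)
```

Plotting `n_pages` against the standardized lifetime or engagement, restricted to a weekly, monthly, or yearly window, would give curves of the kind summarized in Fig. 1.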
Fig. 1. Users’ attention patterns. (Top) Maximum number of unique news sources that users with increasing levels of standardized lifetime interacted with monthly, weekly, and yearly. (Bottom) Maximum number of unique news outlets that users with increasing levels of standardized activity interacted with monthly, weekly, and yearly.
Clusters and Users’ Polarization.
User tendency to interact with few news sources might elicit page clusters. To test this hypothesis, we first characterize the emergent community structure of pages according to the users’ activity. We project the user page likes to derive the weighted graph
Community structure. (Left) Backbone of the projections on pages of the users’ likes (
Users’ polarization. (Left) Activity of users across the five largest communities. (Right) Null model where users’ activity is randomly distributed. Vertices of the pentagon represent the five largest communities and the central point all of the remaining ones. The position of each dot is determined by the number of communities the user interacts with. The size and transparency indicate the number of users in that position.
Users’ and News Outlets’ Perspectives.
Facebook pages can like each other. We use this pattern of favorite pages to create a graph of page preferences. In news outlets, these preferences can be used to compare the perspectives of users and news providers. We use the bipartite projection of the pages that users like
Pages and users’ communities and locations. Backbone of the projections on pages of the user likes reduced to the pages that appear in
The Model.
Users on Facebook tend to focus on a limited set of news sources, on a macro scale. This mechanism of selective exposure generates a clustered and polarized structure. The community structure that emerges when analyzing the likes among the pages is different from the community structure defined by the users’ interaction with the pages. In this section, we provide a simple model of users’ preferential attachment to specific sources that reproduces the observed community structure.
The entities of our model are pages
Analysis of the synthetic pages-to-pages graphs
Conclusions
Using quantitative analysis, we show that the more active a user is, the more the user tends to focus on a small number of news sources. Looking at the page clusters generated by user activity, we find a distinct community structure and strong user polarization. We provide evidence that preferences of users and news outlets differ in that communities established by page creators are more locally confined than communities identified by the users’ activity, which can span across international borders. This segregation in distinct communities can be reproduced by a simple model that mimics the selective exposure of users. Content consumption on Facebook is strongly affected by the tendency of users to limit their exposure to a few sites. Despite the wide availability of content and heterogeneous narratives, there is major segregation and growing polarization in online news consumption. News undergoes the same popularity dynamics as popular videos of kittens or selfies. The spreading of fake news and unsubstantiated rumors motivated major corporations like Google and Facebook to provide solutions to the problem. Google News decided to flag fact-checked information and to penalize providers of fake news; others are proposing to use blacklists of sources to automatically limit their spread. However, debates, especially on socially relevant issues, are often based upon conflicting narratives. The main problem behind misinformation is probably the polarization of users online.
Methods
Ethics Statement.
The entire data collection process was performed exclusively by means of the Facebook Graph application programming interface (22), which is publicly available. We used only publicly available data (users with privacy restrictions are not included in our dataset). We abided by the terms, conditions, and privacy policies of Facebook. We did not seek ethical approval because the data were preexisting.
Data Collection.
The European Media Monitor provides a list of all news sources. We limit our collection to all those pages reporting in English. The downloaded data from each page includes all of the posts made from January 1, 2010 to December 31, 2015, as well as all of the likes and comments on those posts. The European Media Monitor list includes the country and the region of each news source. For accurate mapping on the globe, we also collected the geographical location—latitude and longitude—of each page. A breakdown of data and the set of pages used in the analysis is provided in SI Appendix, Table S1.
Definitions.
In this section, we provide a brief description of the main concepts and tools used in the analysis.
Projection of Bipartite Graphs.
A bipartite graph is a triple
We consider bipartite networks in which the two disjoint sets of nodes are users and Facebook pages. Edges represent interactions between users and pages. As an example, a like to a given piece of information posted by a page constitutes a link between the user and the page and
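As an illustration of projecting a user-page bipartite graph onto pages, the following Python sketch links two pages whenever at least one user liked both. The weighting used here (the number of users shared by the two pages) is an assumption made for the example; the exact weight definition used in the analysis is not reproduced in this text. The function name and input layout are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project_on_pages(user_page_likes):
    """Project (user, page) like pairs onto a weighted page-page graph.

    Returns a dict {(p, q): weight} with p < q, where the weight is the
    number of users who liked both p and q (an assumed convention).
    """
    pages_of = defaultdict(set)
    for user, page in user_page_likes:
        pages_of[user].add(page)

    weights = defaultdict(int)
    for pages in pages_of.values():
        # Every pair of pages liked by the same user gains one unit of weight.
        for p, q in combinations(sorted(pages), 2):
            weights[(p, q)] += 1
    return dict(weights)


likes = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "C"),
]
page_graph = project_on_pages(likes)
```

In this toy input, pages A and B share two users, while C shares one user with each of them; user u3 contributes no edge because a single liked page forms no pair.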
Community Detection Algorithm.
A community detection algorithm is used to identify groups of nodes in a network. The strategy relies on modularity, which quantifies the division of a network into separate clusters: a high modularity corresponds to dense connectivity between nodes within a community and sparse connections between communities. We use four community-detection algorithms. (i) The fast greedy (FG) algorithm searches for a partition of maximum modularity by greedy agglomeration. Every vertex initially belongs to a separate community, and communities are merged iteratively such that each merge is locally optimal (i.e., it yields the largest increase in the current value of modularity). The algorithm stops when it is no longer possible to further increase modularity (23). (ii) The walk trap (WT) algorithm exploits the fact that a random walker tends to become trapped in the denser parts of a graph (i.e., in communities). Hence, WT uses short random walks to merge separate communities (24). (iii) The multilevel (ML) algorithm uses a multilevel modularity optimization procedure. Each vertex is initially assigned to its own community. At each step, vertices are reassigned so that each one moves to the community that yields the largest gain in modularity (25). (iv) The label propagation (LP) algorithm (26) gives a unique label to each vertex, which is then updated according to majority voting among the neighboring vertices. Dense node groups quickly reach a consensus on a common label. Finally, to compare the various community structures, we use standard methods that measure the similarity between different clusterings by considering how nodes are assigned by each community detection algorithm (19, 20).
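To make the role of modularity concrete, the sketch below computes it for a given partition of a small undirected, unweighted graph, using the standard definition Q = Σ_c [l_c/m − (d_c/2m)²], where m is the number of edges, l_c the number of edges inside community c, and d_c the total degree of its nodes. It is a minimal illustration, not one of the four algorithms above; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, no self-loops or duplicates.
    community_of: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # l_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: sum of degrees of nodes in c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_split = modularity(edges, two_groups)  # 5/14, about 0.357
q_single = modularity(edges, one_group)  # 0.0
```

Splitting the graph into its two triangles scores higher than lumping all nodes together, which is the kind of gain that FG and ML try to accumulate merge by merge or move by move.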
Backbone Detection Algorithm.
The disparity filter algorithm is a network reduction technique that identifies the backbone structure of a weighted network without destroying its multiscale nature (27). We use this algorithm to determine the connections that form the backbones of our networks and to produce clear visualizations.
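A minimal sketch of the disparity filter on an undirected weighted graph follows. For an edge of weight w attached to a node of degree k and strength s (the sum of its edge weights), the filter evaluates α = (1 − w/s)^(k−1) and keeps the edge if α falls below a chosen significance level for at least one of its endpoints. The threshold of 0.05 and the treatment of degree-1 nodes (their edges are never significant from that side) are assumptions made for this example, not settings taken from the paper.

```python
from collections import defaultdict


def disparity_backbone(weighted_edges, alpha=0.05):
    """Return the edges kept by the disparity filter.

    weighted_edges: list of (u, v, w) with w > 0, undirected, no duplicates.
    alpha: significance level (assumed value for illustration).
    """
    strength = defaultdict(float)
    degree = defaultdict(int)
    for u, v, w in weighted_edges:
        strength[u] += w
        strength[v] += w
        degree[u] += 1
        degree[v] += 1

    def p_value(node, w):
        k = degree[node]
        if k < 2:
            # Assumed convention: a degree-1 node cannot certify its edge.
            return 1.0
        return (1.0 - w / strength[node]) ** (k - 1)

    return [
        (u, v, w)
        for u, v, w in weighted_edges
        if min(p_value(u, w), p_value(v, w)) < alpha
    ]


toy_edges = [
    ("hub", "a", 100.0),
    ("hub", "b", 1.0),
    ("hub", "c", 1.0),
    ("hub", "d", 1.0),
]
backbone = disparity_backbone(toy_edges)
```

In the toy star graph, only the heavy edge between `hub` and `a` survives, because it carries a disproportionate share of the hub's strength; the three light edges are compatible with a uniform split of weight and are dropped.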
Acknowledgments
The authors thank Sandro Forgione and Pandoros. This work was funded by EU Future and Emerging Technologies (FET) MULTIPLEX Project 317532, FET SIMPOL Project 610704, FET DOLFINS Project 640772, SoBigData Project 654024, and Center of Excellence for Global Systems Science Project 676547. The funders had no role in the study design, data collection and analysis, decision to publish, or manuscript preparation.
Footnotes
1. Present address: Dipartimento di Scienze Ambientali, Informatica, e Statistica (DAIS), University of Venice, 30172 Venice, Italy.
2. To whom correspondence should be addressed. Email: walterquattrociocchi{at}gmail.com.
Author contributions: A.L.S., A.S., and W.Q. designed research; A.L.S., F.Z., M.D.V., H.E.S., and W.Q. performed research; F.Z., M.D.V., and A.B. contributed new reagents/analytic tools; A.L.S., F.Z., M.D.V., A.S., G.C., H.E.S., and W.Q. analyzed data; and A.L.S., A.S., H.E.S., and W.Q. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1617052114/-/DCSupplemental.
Freely available online through the PNAS open access option.
References
3. Coleman S, Moss G, Parry K, Bennett WL
4. Ksiazek TB, Malthouse EC, Webster JG
5. Iyengar S, Hahn KS
6. Newman N, Levy DAL, Nielsen RK
7. Quattrociocchi W, Scala A, Sunstein CR
9. Bessi A, et al.
10. Del Vicario M, et al.
11. Mocanu D, Rossi L, Zhang Q, Karsai M, Quattrociocchi W
13. Zollo F, et al.
16. Yardi S, Boyd D
17. Bakshy E, Messing S, Adamic LA
18. Steinberger R, Pouliquen B, Van Der Goot E
21. Deffuant G, Neau D, Amblard F, Weisbuch G
22. Facebook (August 2013) Using the graph API (Facebook, Menlo Park, CA). Available at https://developers.facebook.com/docs/graph-api/using-graph-api. Accessed January 19, 2014.
23. Clauset A, Newman MEJ, Moore C
26. Raghavan UN, Albert R, Kumara S
27. Serrano MA, Boguna M, Vespignani A