Edited by: Taha Yasseri, University of Oxford, UK
Reviewed by: Benjamin Miranda Tabak, Universidade Católica de Brasília, Brazil; Sandro Meloni, University of Zaragoza, Spain
*Correspondence: Elisa Omodei, Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Paisos Catalans, 26, 43007 Tarragona, Spain
This article was submitted to Interdisciplinary Physics, a section of the journal Frontiers in Physics
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Nowadays, millions of people interact daily on online social media such as Facebook and Twitter, where they share and discuss information about a wide variety of topics. In this paper, we focus on a specific online social network, Twitter, and we analyze multiple datasets, each consisting of individuals' online activity before, during and after an event that is exceptional in terms of the volume of communications registered. We consider important events that occurred in different arenas, ranging from politics to culture and science. For each dataset, the users' online activities are modeled by a multilayer network in which each layer conveys a different kind of interaction, specifically: retweeting, mentioning and replying. This representation allows us to show that these distinct types of interaction produce networks with different statistical properties, in particular concerning the degree distribution and the clustering structure. These results suggest that models of online activity cannot discard the information carried by this multilayer representation of the system, and should account for the different processes generated by the different kinds of interactions. Moreover, our analysis reveals statistical regularities among the different events, suggesting that the non-trivial topological patterns that we observe may represent universal features of the social dynamics on online social networks during exceptional events.
The advent of online social platforms and their exponentially increasing usage over the last decade have made it possible to analyze human behavior with an unprecedented volume of data. To a certain extent, online interactions represent a good proxy for social interactions and, as a consequence, the possibility to track the activity of individuals in online social networks allows one to investigate human social dynamics [
More specifically, in recent years an increasing number of researchers have focused on individuals' activity on Twitter, a popular microblogging social platform with about 302 million active users posting more than 500 million messages daily (i.e.,
The analysis of Twitter revealed that online social networks exhibit many features typical of social systems, with strongly clustered individuals within a scale-free topology [
Twitter allows users to communicate through short messages, using three different actions, namely mentioning, replying and retweeting. While some evidence has shown that users tend to exploit in different ways the actions made available by the Twitter platform [
In our framework, an exceptional event is a circumstance unlikely to appear in everyday news, limited to a short amount of time (typically ranging from hours to a few days), that causes an exceptional volume of tweets, making it possible to perform a statistically significant analysis of social dynamics. It is worth mentioning that fluctuations in the number of tweets, mentions, retweets, and replies among users may range from tens to thousands in a few minutes, depending on the event. A typical example of an exceptional event is provided by the discovery of the Higgs boson in July 2012 [
We use empirical data collected during six exceptional events of different types to shed light on individual dynamics in the online social network. We use social network analysis to quantify the differences between mentioning, replying and retweeting on Twitter and, intriguingly, our findings reveal universal features of such activities during exceptional events.
It has recently been shown that the choice of how to gather Twitter data may significantly affect the results. In fact, data obtained from a simple backward search tend to over-represent more central users, not offering an accurate picture of peripheral activity, with a more relevant bias for the network of mentions [
We consider different exceptional events because of their importance in different subjects, from politics to sport. More specifically, we focus on the Cannes Film Festival in 2013
For each event, we collected tweets sent between a starting time
Event | Start (GMT) | End (GMT) | Keywords |
Cannes2013 | 06 May 2013, 05:23:49 | 03 Jun 2013, 03:48:26 | cannes film festival, cannes, canneslive, #cannes2013, #festivalcannes, #palmdor |
HiggsDiscovery2012 | 30 Jun 2012, 21:11:19 | 10 Jul 2012, 20:59:56 | lhc, cern, boson, higgs |
MLKing2013 | 25 Aug 2013, 13:41:36 | 02 Sep 2013, 08:16:21 | Martin Luther King, #ihaveadream |
MoscowAthletics2013 | 05 Aug 2013, 09:25:46 | 19 Aug 2013, 12:35:21 | mos2013com, moscow2013, mosca2013, moscu2013, #athletics |
NYClimateMarch2014 | 18 Sep 2014, 22:46:19 | 22 Sep 2014, 04:56:25 | peopleclimatemarch, peoplesclimate, marciaxilclima, climate2014 |
ObamaInIsrael2013 | 19 Mar 2013, 15:56:29 | 03 Apr 2013, 21:24:34 | obama, israel, palestina, peace |
Finally, we report that in a few cases we complemented a dataset by including tweets obtained from the search API (at most 5% of the tweets in the whole dataset) and that, in the worst cases, the flow of the streaming API was limited, causing a loss of less than 0.5% of tweets.
To understand the dynamics of Twitter user interactions during these exceptional events, we reconstruct, for each event, a network connecting users on the basis of the retweets, mentions and replies they have been the subject or object of. In the literature on Twitter data what is usually built is the network based on the follower-followee relationships between users [ A user can A user can A user can
A fourth kind of possible interaction is to favourite a user's tweet, which represents a simple endorsement of the information contained in the tweet, without rebroadcasting. However, this information is not available in our datasets, and therefore we do not consider this kind of interaction.
As just discussed, each kind of activity on Twitter (retweet, reply, and mention) represents a particular kind of interaction between two users. Therefore, an appropriate framework to capture the overall structure of these interactions without loss of information about the different types is the framework of multilayer networks [
Here, for each event, we build a multilayer network composed by
Details about the number of nodes and edges characterizing each event are reported in Table
Cannes2013 | N = 514,328 | 337,089 | 85,414 | 91,825 |
 | E = 700,492 | 490,268 | 82,952 | 127,272 |
HiggsDiscovery2012 | N = 747,659 | 434,687 | 167,385 | 145,587 |
 | E = 817,877 | 542,808 | 122,761 | 152,308 |
MLKing2013 | N = 346,069 | 286,227 | 24,664 | 35,178 |
 | E = 339,143 | 288,543 | 18,157 | 32,443 |
MoscowAthletics2013 | N = 103,319 | 73,377 | 11,983 | 17,959 |
 | E = 144,591 | 102,842 | 12,768 | 28,981 |
NYClimateMarch2014 | N = 115,284 | 94,300 | 7,900 | 13,084 |
 | E = 239,935 | 213,158 | 8,038 | 18,739 |
ObamaInIsrael2013 | N = 2,641,052 | 1,443,929 | 737,353 | 459,770 |
 | E = 2,926,777 | 1,807,160 | 586,074 | 533,543 |
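The construction of one multilayer network per event can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input format (a list of `(source, target, kind)` triples), the function names, and the choice to store each layer as a set of unique directed pairs are all hypothetical. The totals are computed as sums over layers, which matches the table above, where the first value in each row equals the sum of the three per-layer values.

```python
# Assumed layer names; the source names the three interaction types
# but not any particular identifiers.
LAYERS = ("retweet", "reply", "mention")


def build_layers(interactions):
    """Group (source, target, kind) triples into one directed edge set per layer.

    Repeated interactions between the same ordered pair collapse into a
    single edge (an assumption: the source does not say whether edges
    are weighted).
    """
    layers = {name: set() for name in LAYERS}
    for source, target, kind in interactions:
        if kind not in layers:
            raise ValueError(f"unknown interaction type: {kind!r}")
        layers[kind].add((source, target))
    return layers


def layer_sizes(layers):
    """Return {layer: (number of nodes, number of edges)}."""
    sizes = {}
    for name, edges in layers.items():
        nodes = {user for edge in edges for user in edge}
        sizes[name] = (len(nodes), len(edges))
    return sizes


def total_sizes(sizes):
    """Sum node and edge counts over layers.

    A user active on two layers is counted once per layer.
    """
    total_nodes = sum(n for n, _ in sizes.values())
    total_edges = sum(e for _, e in sizes.values())
    return total_nodes, total_edges
```

Keeping each layer as a plain set of pairs makes the later per-layer and cross-layer comparisons simple set operations, at the cost of discarding how often each interaction occurred.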
In the following we present an analysis of the networks introduced in the previous section, aimed at exploring two different but complementary questions.
Firstly, we want to know whether, within the same event, the three kinds of interactions produce different network topologies. To this aim, we consider basic multilayer and single-layer network descriptors relevant to characterizing social relationships, and we study how they vary across layers.
Secondly, we want to determine whether different exceptional events present any common pattern regarding users' interactions. As shown in Figure
To understand whether the different kinds of interaction produce similar networks, we analyze whether users interact with each other in a similar way regardless of the type of activity (retweet, reply, or mention). This information can be obtained by calculating the edge overlap [
MT-RP | 0.05±0.04 | 0.50±0.12 |
MT-RT | 0.06±0.03 | 0.33±0.08 |
RP-RT | 0.08±0.04 | 0.35±0.10 |
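A pairwise edge overlap between layers can be sketched as below. The exact normalization used in the paper is not recoverable from this section, so the sketch assumes one common choice, the Jaccard index of the two directed edge sets (shared edges divided by edges present in at least one layer). The function names and the dictionary-of-sets input are hypothetical.

```python
from itertools import combinations


def edge_overlap(edges_a, edges_b):
    """Jaccard index of two directed edge sets (assumed definition)."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0  # two empty layers: define the overlap as zero
    return len(a & b) / len(union)


def pairwise_overlaps(layers):
    """Return {(layer_1, layer_2): overlap} for every unordered pair of layers."""
    return {
        (name_a, name_b): edge_overlap(layers[name_a], layers[name_b])
        for name_a, name_b in combinations(sorted(layers), 2)
    }
```

A value near 0 means that two layers connect mostly different pairs of users, while a value of 1 means the two layers contain exactly the same directed edges.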
In this section, we study the degree connectivity of users, the most widely studied descriptor of the structure of a network. We focus in particular on the in-degree
First, we explore whether users have the same connectivity on the different layers, i.e., whether users consistently have the same degree of importance on all the layers. To this aim, we compute Spearman's rank correlation coefficient [
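The rank correlation between the in-degrees of the same users on two layers can be sketched in plain Python as follows. Several choices here are assumptions rather than details taken from the paper: edges are `(source, target)` pairs with in-degree counted on the target, the comparison runs over every user appearing in either layer (with in-degree 0 where the user receives no interaction), and ties receive average ranks.

```python
from collections import Counter
from math import sqrt


def in_degrees(edges):
    """Count incoming edges per user for a set of (source, target) pairs."""
    return Counter(target for _, target in edges)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def spearman_in_degree(edges_a, edges_b):
    """Spearman correlation of in-degree across two layers."""
    deg_a, deg_b = in_degrees(edges_a), in_degrees(edges_b)
    users = list({u for edge in list(edges_a) + list(edges_b) for u in edge})
    if not users:
        return float("nan")
    ranks_a = average_ranks([deg_a[u] for u in users])
    ranks_b = average_ranks([deg_b[u] for u in users])
    return pearson(ranks_a, ranks_b)
```

Restricting the comparison to users present on both layers would be an equally defensible choice and can change the value noticeably when the layers differ a lot in size.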
Building on the result discussed in the previous section, we also explore, for each event, the distribution of the in-degree on each layer separately. Intriguingly, for each layer, we find that the empirical distributions corresponding to all the exceptional events have very similar shapes, as shown in Figure
The in-degree, shown in Figure
Power-law distributions of the degree have been found in a large variety of empirical social networks [
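As an illustration of how a power-law exponent can be estimated from a list of in-degrees, the sketch below uses the standard maximum-likelihood estimator with the usual approximation for integer data, $\hat{\alpha} \simeq 1 + n \left[\sum_i \ln \frac{k_i}{k_{\min} - 1/2}\right]^{-1}$. Whether the paper fitted its distributions this way is not stated in this section, so the method, the function name and the `k_min` parameter are assumptions.

```python
from math import log


def powerlaw_alpha_mle(degrees, k_min=1):
    """Approximate maximum-likelihood exponent for integer data with k >= k_min.

    Observations below k_min (including users with in-degree 0) are
    ignored, since the power law is only assumed to hold in the tail.
    """
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no observations at or above k_min")
    log_sum = sum(log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum
```

The estimate depends strongly on `k_min`, so in practice it is worth checking how the exponent changes over a range of cutoffs before comparing layers or events.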
Lastly, for each layer separately, we calculate the average clustering coefficient of the corresponding network. This is a measure of the transitivity of the observed interactions, and constitutes an important metric to characterize social networks [
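The average clustering coefficient of a single layer can be sketched as below. The sketch assumes that edge direction is ignored, that self-loops are dropped, and that users with fewer than two neighbors contribute a local coefficient of zero; the paper's exact conventions are not given in this section, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def average_clustering(edges):
    """Mean local clustering coefficient of the undirected version of a layer."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # assumption: self-loops are ignored
            neighbors[u].add(v)
            neighbors[v].add(u)
    if not neighbors:
        return 0.0
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # local coefficient taken as 0 for degree < 2
        # Count connected pairs among this user's neighbors.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        total += 2 * links / (k * (k - 1))
    return total / len(neighbors)
```

The inner loop is quadratic in a user's degree, so for layers with very large hubs a graph library or a sampling estimate would be the more practical route.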
In this paper we analyze six datasets consisting of Twitter conversations surrounding distinct exceptional events. The events considered span very different topics: entertainment, science, commemorations, sports, activism, and politics. Our results show that, despite the different fluctuations in time and in volume, there are some statistical regularities across the different events. In particular, we find that the in-degree distribution of users and the clustering coefficient in each of the three layers (representing interactions based on retweets, replies, and mentions, respectively) are the same across the six different events. Our first conclusion is therefore that users' behavior on Twitter during exceptional events presents some universal patterns.
Secondly, we show that different types of interactions between users on Twitter (retweeting, replying, and mentioning) generate networks with different topological characteristics. These differences were captured by means of the multilayer network framework: instead of discarding the information contained in the tweets regarding how users interact, we use this information to build a more complete representation of the system with three layers, each representing a different type of interaction. The fact that networks corresponding to different layers present different statistical properties is an important hint for models aiming to reproduce human behavior in online social networks. Our results indicate that, to faithfully represent how users interact, these models cannot be based on an aggregated view of the network and should account separately for all the different processes taking place in the system.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
AA and MD were supported by the European Commission FET-Proactive project PLEXMATH (Grant No. 317614) and the Generalitat de Catalunya 2009-SGR-838. AA also acknowledges financial support from ICREA Academia, the James S. McDonnell Foundation and MINECO FIS2012-38266. EO is supported by the James S. McDonnell Foundation.