Analysis of Information Spread in Social Media during Disasters


KISHORE NEPPALLI, University of North Texas

YAFENG WU, University of Virginia

Social media plays a crucial role in spreading information during major events, especially natural disasters. Previously proposed methods aim to identify and extract informational posts that are useful to emergency responders by capturing actionable data from the large volumes of data available on the social web. Many posts are reposted by other users, which is a popular way of spreading information on social media. It is important to verify that information posted on social media propagates as intended. In this paper, we present a preliminary analysis of the data flow in social media to observe trends in the propagation of user posts during disasters. Specifically, we analyze how fast a post is reposted, the average life span of a post, and how long a post stays alive. In addition, we explore which factors or features would help distinguish informative tweets from conversational tweets. We also study the trust average of the identified informative tweets. For this analysis, we used Twitter data posted during two events: Hurricane Sandy (October 2012) and the Boston Marathon bombings (April 2013).

Categories and Subject Descriptors: C.2.2 [Social Media Analytics]: Network Protocols

General Terms: Social media, information spread, retweets.


In recent years, microblogging services have emerged with millions of users who rely on them to share information. These services are used for various purposes, such as sharing news, establishing communication, and casual conversation. They also become a place to report what is happening in the hour of disaster, acting as a real-time environment for broadcasting information during major events. Social media has become a mass-media platform for sharing information effectively. According to reported usage figures, there are hundreds of millions of active social media users per month. For example, Twitter has about 284 million monthly active users, Facebook has 1.35 billion, and Google+ has 540 million. Social media has thus become part of daily life, which is leading to an exponential increase in the volume of social media data.

In Twitter's short history, the volume went from 5,000 tweets per day in 2007 to 500,000,000 tweets per day in 2013, an increase of five orders of magnitude. The intermediate steps were 300,000 tweets per day in 2008, 2.5 million tweets per day in 2009, 35 million tweets per day in 2010, 200 million tweets per day in 2011, and 340 million tweets per day when Twitter celebrated its sixth year on March 21, 2012.
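The growth figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original analysis: the dictionary and helper names are illustrative, and the 2012 entry uses the figure quoted for March 21, 2012. It computes the overall growth factor, its base-10 logarithm (the number of orders of magnitude), and the ratio between each pair of consecutive data points.

```python
import math

# Tweets per day as quoted in the text (year -> daily volume).
tweets_per_day = {
    2007: 5_000,
    2008: 300_000,
    2009: 2_500_000,
    2010: 35_000_000,
    2011: 200_000_000,
    2012: 340_000_000,
    2013: 500_000_000,
}


def growth_factor(first, last):
    """Ratio of the later volume to the earlier volume."""
    return last / first


def orders_of_magnitude(first, last):
    """Base-10 logarithm of the growth factor."""
    return math.log10(growth_factor(first, last))


years = sorted(tweets_per_day)
first, last = tweets_per_day[years[0]], tweets_per_day[years[-1]]

overall = growth_factor(first, last)
overall_orders = orders_of_magnitude(first, last)

# Growth ratio between each pair of consecutive data points, keyed by later year.
yearly = {
    later: growth_factor(tweets_per_day[earlier], tweets_per_day[later])
    for earlier, later in zip(years, years[1:])
}

print(f"overall growth factor: {overall:,.0f}x")
print(f"orders of magnitude:   {overall_orders:.1f}")
for year, ratio in yearly.items():
    print(f"{year}: {ratio:.1f}x over the previous data point")
```

The per-step ratios make it easy to see where growth was steepest and where it levelled off.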

We used data from Twitter, a microblogging service that lets users broadcast messages. Retweeting is one of the most useful conventions on Twitter, as it lets people share messages very easily. Most of these messages are conversational; they are not very informative and cannot be used by responders during disasters. However, people also use Twitter to publish important information.

Twitter plays an active role in spreading news from different users and organizations by establishing a channel of communication during an event. According to a survey by the Pew Research Center, Twitter served as a major source of information during Hurricane Sandy. Their study asserts that more than 30% of the analyzed data falls under the news and information category, which indicates that people are sharing valuable information. On Twitter, a post may be direct or derivative: a direct post is one that the user writes and publishes, whereas a derivative post is a repost of a post made by another user. In our study, we rely on this distinction to characterize the information flow. In this paper, we present the characteristics of information propagation during Hurricane Sandy and the Boston Marathon bombings, which could help in finding trends in such events.
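To make the direct/derivative distinction concrete, the sketch below shows one way the propagation measures could be computed from it. It is an illustration under stated assumptions, not the procedure used in this study: the record fields (`id`, `created_at`, `repost_of`) are hypothetical rather than the Twitter API schema, the sample records are made up, and "life span" is assumed here to mean the time between the original post and its last observed repost.

```python
from collections import defaultdict
from datetime import datetime

# Made-up sample records; field names are hypothetical, not the Twitter API schema.
posts = [
    {"id": "a", "created_at": datetime(2012, 10, 29, 18, 0), "repost_of": None},
    {"id": "b", "created_at": datetime(2012, 10, 29, 18, 5), "repost_of": "a"},
    {"id": "c", "created_at": datetime(2012, 10, 29, 20, 0), "repost_of": "a"},
    {"id": "d", "created_at": datetime(2012, 10, 29, 19, 0), "repost_of": None},
]


def is_derivative(post):
    """A derivative post is a repost of another user's post."""
    return post["repost_of"] is not None


def propagation_stats(posts):
    """Per direct post: repost count, time to first repost, and life span.

    Life span is assumed to be the time from the original post to its
    last observed repost. Direct posts with no reposts are omitted.
    """
    originals = {p["id"]: p["created_at"] for p in posts if not is_derivative(p)}

    # Group repost timestamps by the direct post they derive from.
    repost_times = defaultdict(list)
    for p in posts:
        if is_derivative(p) and p["repost_of"] in originals:
            repost_times[p["repost_of"]].append(p["created_at"])

    stats = {}
    for post_id, times in repost_times.items():
        created = originals[post_id]
        stats[post_id] = {
            "reposts": len(times),
            "time_to_first_repost": min(times) - created,
            "life_span": max(times) - created,
        }
    return stats


stats = propagation_stats(posts)
for post_id, s in stats.items():
    print(post_id, s["reposts"], s["time_to_first_repost"], s["life_span"])
```

Averaging the `life_span` values over all direct posts would give an average life span under the same assumed definition.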

Hurricane Sandy 2012. Hurricane Sandy occurred in October 2012 and was the deadliest hurricane of the 2012 Atlantic hurricane season. At the time, it was the second-costliest hurricane in United States history. The hurricane started in the Atlantic Ocean and set its course toward the East Coast, affecting 24 states and causing damage amounting to $65 billion in the USA. More than 200,000 people were affected. It is estimated that around 20 million tweets about Sandy were posted during the disaster.

Boston Bombings 2013. This disaster occurred on April 15, 2013, at 2:49 p.m. EDT, when two pressure cooker bombs exploded near the finish line on Boylston Street, killing 3 people and injuring approximately 264. After the bombings, public transportation and most businesses and public institutions were shut down, which created a deserted urban environment.

