
Using Off-the-shelf Software for basic Twitter Analysis

October 6, 2011

Mary Gray, Mike Ananny and I are writing a paper on queer youth and “Glee” for the American Anthropological Association’s annual meeting (yes, I have the greatest job in the world). This is a multi-methodological study by design, because traditional television viewing practices have become so complex. Besides traditional audience ethnography like interviews and participant observation, we are using textual analysis to examine episode themes, and we have collected a large corpus of tweets with Glee-related hashtags. This summer, I worked with my high school intern, Jazmin Gonzales-Rivero, to go through this corpus of tweets and pull out useful information for the paper.

We’ve written and published a basic report on using off-the-shelf tools to see patterns and themes in a large Twitter data set quickly and easily.

Abstract:

With the increasing popularity of large social software applications like Facebook and Twitter, social scientists and computer scientists have begun developing innovative approaches to dealing with the vast amounts of data produced and collected in such environments. For qualitative researchers, the methods involved can be daunting and unfamiliar. In this report, we outline some basic procedures for working with a large-scale Twitter data set to answer qualitative inquiries. We use Python, MySQL, and the word-cloud generator Wordle to identify patterns in re-tweets, tweet authors, dates and times of tweets, frequency of hashtags, and frequency of word use. Such data can provide valuable augmentation to qualitative inquiry. This paper is aimed at social scientists and humanities scholars with limited experience with big data and a lack of computing resources to do extensive quantitative research.

Citation:
Marwick, A. and Gonzales-Rivero, J. (2011). Learning to Work with Large-Scale Twitter Data Sets: Using Off-The-Shelf Tools to Quickly and Easily See Tweet Patterns. Microsoft Research Social Media Collective Report, MSR-SMC-11-01, Cambridge, MA. [Download as PDF]

If you’re a seasoned computer scientist or a Big Data aficionado, the information in this paper will seem quite simplistic. But for those of us without programming backgrounds who study Twitter or other forms of social media, the idea of tackling a set of 450,000 tweets can seem quite daunting. In this paper, Jazmin and I walk step-by-step through the methods she used to parse a set of tweets, using free and easily accessible tools like MySQL, Python, and Wordle. We hope this will be helpful for other legal, humanities, and social science scholars who might want to dip a toe into Big Data to augment more qualitative research findings.
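To give a flavor of what this kind of parsing looks like, here is a minimal sketch in Python. It is not the code from the report: the sample tweets, the `summarize` function, the stopword list, and the regular expressions are all made up for illustration, and it assumes each tweet is available as a simple (author, text) pair rather than a row in a MySQL table. It counts hashtags, tweet authors, retweeted accounts, and word frequencies, which are the same kinds of patterns the report looks at.

```python
import re
from collections import Counter

# Hypothetical sample data for illustration only; the real corpus
# described in the post held roughly 450,000 tweets.
SAMPLE_TWEETS = [
    ("fan_one", "Loved tonight's episode! #glee #gleeks"),
    ("fan_two", "RT @fan_one: Loved tonight's episode! #glee #gleeks"),
    ("fan_three", "Kurt and Blaine forever #Glee"),
    ("fan_one", "RT @fan_three: Kurt and Blaine forever #Glee"),
]

HASHTAG_RE = re.compile(r"#(\w+)")
RETWEET_RE = re.compile(r"^RT @(\w+)")   # old-style "RT @user" retweets
WORD_RE = re.compile(r"[a-z']+")
STOPWORDS = {"rt", "and", "the", "a", "to", "of"}  # assumed, very short list


def summarize(tweets):
    """Count hashtags, authors, retweeted users, and words in (author, text) pairs."""
    hashtags, authors, retweeted, words = Counter(), Counter(), Counter(), Counter()
    for author, text in tweets:
        authors[author] += 1
        # Lowercase hashtags so that #glee and #Glee are counted together.
        hashtags.update(tag.lower() for tag in HASHTAG_RE.findall(text))
        match = RETWEET_RE.match(text)
        if match:
            retweeted[match.group(1).lower()] += 1
        # Drop hashtags and @mentions before counting ordinary words.
        plain = re.sub(r"[#@]\w+", " ", text.lower())
        words.update(w for w in WORD_RE.findall(plain) if w not in STOPWORDS)
    return {
        "hashtags": hashtags,
        "authors": authors,
        "retweeted": retweeted,
        "words": words,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE_TWEETS)
    for name, counter in summary.items():
        print(name, counter.most_common(3))
```

The word counts from a script like this could then be pasted into a word-cloud tool to get a quick visual sense of the themes in the corpus.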


7 Comments
  1. October 6, 2011 4:09 pm

    great paper, Alice and Jazmin! I think this will be a very useful paper for my methods seminar discussion about using tweets in ethnographic work. Nice contribution (mind if I send the link to the AoIR list?)

    mg

  2. October 6, 2011 6:41 pm

    Nice. Do you have a reference on how to slurp the tweets off twitter?

    Thanks!
    Jo

    • October 7, 2011 1:03 pm

      Unfortunately, we relied on someone who had access to the “firehose” for that. Your best bet is to recruit a friendly neighborhood computer scientist or social network analyst who’s already doing Twitter analysis and ask them to capture it for you.

  3. October 13, 2011 9:39 am

    Nice article! Analysis of Twitter can also be handled by various software like Google Analytics, Social Media Guru, etc.

