Internet memes and networked individualism: A perfect couple?

Memes are conceptual troublemakers. While academics have been debating over their theoretical usefulness ever since Richard Dawkins coined the term back in 1976, internet users speak of memes daily, as uncontested givens. Recently, I’ve been thinking of ways to bridge the yawning gap between academic and popular discourses on memes. I agree with some of the criticism of the ways the term has been used so far, but still see it as a powerful concept for unpacking many aspects of digital culture. Users are on to something, and I believe that researchers should follow – carefully and critically, of course…

A drop in a memetic ocean (1)

As an initial step, I’d like to highlight three points that I made in a recent paper, An Anatomy of a YouTube Meme, each relating to a different question about memes:

What’s the difference between “memetic” and “viral”?
While “viral” and “memetic” are often used interchangeably, disentangling them may lead to a more nuanced understanding of digital culture. Looking mainly into videos, I suggest treating the “memetic” and the “viral” as two dynamically interconnected video-types. A viral video can be defined as a clip that spreads to the masses via digital “word-of-mouth” mechanisms without significant change. The memetic video, in contrast, invokes a different structure of participation. It is a popular clip that lures extensive creative user engagement in the form of parody, pastiche or mash-up. Leave Britney Alone, the Star Wars Kid, and the Hitler Downfall parodies are particularly famous drops in a memetic ocean. Of course, there is a temporal element lurking here: many memetic videos started off as viral ones. While still unsung in academia, this distinction is part of popular discourse, as evident in Know Your Meme.

Both the viral and the memetic seem to fall in line with what Henry Jenkins calls “spreadable media”, yet the analytic distinction between them highlights two different aspects of participatory culture: the first relates to a mode of diffusion, the second to a prevalent mode of mimesis-based communication. While so far research has tended to focus on the diffusion of specific “viral” videos, probing practices of mimesis may enrich our understanding of cultural formation.

A drop in a memetic ocean (2)

What makes a video “memetic”?
My analysis of 30 “mega-memetic” videos yielded six common features: a focus on ordinary people, flawed masculinity, humor, simplicity, repetitiveness, and whimsical content. Each of these attributes marks the video as incomplete or flawed, thereby invoking further creative dialogue. In other words: it seems that “bad” videos make “good” memes in contemporary participatory culture. But this, of course, is only one suggestion based on one case study of memetic videos: probing text-based and image-based memes would probably lead to different stories.

Why are so many people re-making YouTube videos?
By the time you finish reading this post, thousands of new videos will have been uploaded to YouTube. A good chunk of them will be remakes, mash-ups or parodies of existing videos. The answer to the question of why so many people are doing this is far beyond the scope of a single post, article, or even book, but I’d like to play with one idea here. I suggest that re-creating popular videos is the cultural embodiment of what Barry Wellman and others describe as “networked individualism.” On the one hand, users who upload a self-made video demonstrate their creativity and uniqueness; on the other, derivative videos often relate to a common, widely shared memetic video. By this act of cultural referencing, users both construct their individuality and their affiliation with the YouTube community.

More about these points, and about other aspects of memetic videos, can be found in An Anatomy of a YouTube Meme over at New Media & Society (and here is the pre-print version).

Feedback is absolutely welcome!
I’m currently working on a book on internet memes (for MIT Press), and hope to include as many voices as I can. So I’d love to receive emails about weird memetic phenomena, as well as research papers on the topic.

Can an algorithm be wrong? Twitter Trends, the specter of censorship, and our faith in the algorithms around us

The interesting question is not whether Twitter is censoring its Trends list. The interesting question is, what do we think the Trends list is, what it represents and how it works, that we can presume to hold it accountable when we think it is “wrong?” What are these algorithms, and what do we want them to be?

(Cross posted from Culture Digitally.)

It’s not the first time it has been asked. Gilad Lotan at SocialFlow (and erstwhile Microsoft UX designer), spurred by questions raised by participants and supporters of the Occupy Wall Street protests, asks the question: is Twitter censoring its Trends list to exclude #occupywallstreet and #occupyboston? While the protest movement gains traction and media coverage, and participants, observers and critics turn to Twitter to discuss it, why are these widely-known hashtags not Trending? Why are they not Trending in the very cities where protests have occurred, including New York?

The presumption, though Gilad carefully debunks it, is that Twitter is, for some reason, either removing #occupywallstreet from Trends, or has designed an algorithm to prefer banal topics like Kim Kardashian’s wedding over important contentious, political debates. Similar charges emerged around the absence of #wikileaks from Twitter’s Trends when the trove of diplomatic cables was released in December of last year, as well as around the #demo2010 student protests in the UK, the controversial execution of #TroyDavis in the state of Georgia, the Gaza #flotilla, even the death of #SteveJobs. Why, when these important points of discussion seem to spike, do they not Trend?

Despite an unshakeable undercurrent of paranoid skepticism, in the analyses and especially in the comment threads that trail off from them, most of those who have looked at the issue are reassured that Twitter is not in fact censoring these topics. Their absence on the Trends listings is a product of the particular dynamics of the algorithm that determines Trends, and the misunderstanding most users have about what exactly the Trends algorithm is designed to identify. I do not disagree with this assessment, and have no particular interest in reopening these questions. Along with Gilad’s thorough analysis, Angus Johnston has a series of posts (1, 2, 3, and 4) debunking the charge of censorship around #wikileaks. Trends has been designed (and re-designed) by Twitter not to simply measure popularity, i.e. the sheer quantity of posts using a certain word or hashtag. Instead, Twitter designed the Trends algorithm to capture topics that are enjoying a surge in popularity, rising distinctly above the normal level of chatter. To do this, their algorithm is designed to take into account not just the number of tweets, but factors such as: is the term accelerating in its use? Has it trended before? Is it being used across several networks of people, as opposed to a single, densely-interconnected cluster of users? Are the tweets different, or are they largely re-tweets of the same post? As Twitter representatives have said, they don’t want simply the most tweeted word (in which case the Trend list might read like a grammar assignment about pronouns and indefinite articles) or the topics that are always popular and seem destined to remain so (apparently this means Justin Bieber).
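The factors Twitter describes can be illustrated with a toy scoring function. To be clear, this is not Twitter’s algorithm, which is unpublished; it is a minimal sketch, with invented weights, of how acceleration, originality, cross-cluster spread, and novelty might combine:

```python
def trending_score(recent_count, baseline_count, retweet_fraction,
                   distinct_clusters, has_trended_before):
    """Toy score combining the factors Twitter describes:
    a surge over the normal level of chatter, original posts over
    retweets, spread across several networks of users, and novelty
    (whether the term has trended before). All weights are invented."""
    acceleration = recent_count / max(baseline_count, 1)  # surge vs. baseline chatter
    originality = 1.0 - retweet_fraction                  # discount pure retweets
    spread = min(distinct_clusters, 10) / 10              # reward cross-network use
    novelty = 0.5 if has_trended_before else 1.0          # demote repeat trends
    return acceleration * originality * spread * novelty


# A hashtag spiking from a small baseline outscores one that is
# hugely but steadily popular, even at far lower absolute volume:
spike = trending_score(5000, 100, 0.2, 8, has_trended_before=False)
steady = trending_score(20000, 18000, 0.2, 8, has_trended_before=True)
```

Under a scoring of this shape, a term like #wikileaks, which is enormously but consistently discussed by a dense cluster of users, can lose out to a smaller but sharper spike, which is exactly the behavior the censorship charges mistake for suppression.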

The charge of censorship is, on the face of it, counterintuitive. Twitter has, over the last few years, enjoyed and agreed with claims that it has played a catalytic role in recent political and civil unrest, particularly in the Arab world, wearing its political importance as a red badge of courage (see Shepherd and Busch). To censor these hot-button political topics from Trends would work against its self-proclaimed purpose and, more importantly, its marketing tactics. And, as Johnston noted, the tweets themselves are available, many highly charged – so why, and for what ends, remove #wikileaks or #occupywallstreet from the Trends list, yet let the actual discussion of these topics run free?

On the other hand, the vigor and persistence of the charge of censorship is not surprising at all. Advocates of these political efforts want desperately for their topic to gain visibility. Those involved in the discussion likely have an exaggerated sense of how important and widely-discussed it is. And, especially with #wikileaks and #occupywallstreet, the possibility that Twitter may be censoring their efforts would fit their supporters’ ideological worldview: Twitter might be working against Wikileaks just as Amazon, Paypal, and Mastercard were; or in the case of #occupywallstreet, while the Twitter network supports the voice of the people, Twitter the corporation of course must have allegiances firmly intertwined with the fatcats of Wall Street.

But the debate about tools like Twitter Trends is, I believe, a debate we will be having more and more often. As more and more of our online public discourse takes place on a select set of private content platforms and communication networks, and these providers turn to complex algorithms to manage, curate, and organize these massive collections, there is an important tension emerging between what we expect these algorithms to be, and what they in fact are. Not only must we recognize that these algorithms are not neutral, and that they encode political choices, and that they frame information in a particular way. We must also understand what it means that we are coming to rely on these algorithms, that we want them to be neutral, we want them to be reliable, we want them to be the effective ways in which we come to know what is most important.

Twitter Trends is only the most visible of these tools. The search engine itself, whether Google or the search bar on your favorite content site (often the same engine, under the hood), promises to provide a logical set of results in response to a query, but is in fact the result of an algorithm designed to take a range of criteria into account so as to serve up results that satisfy not just the user, but the aims of the provider, their vision of relevance or newsworthiness or public import, and the particular demands of their business model. As James Grimmelmann observed, “Search engines pride themselves on being automated, except when they aren’t.” When Amazon, or YouTube, or Facebook offer to algorithmically and in real time report on what is “most popular” or “liked” or “most viewed” or “best selling” or “most commented” or “highest rated,” they are curating a list whose legitimacy is based on the presumption that it has not been curated. And we want these lists to feel that way, even to the point that we are unwilling to ask about the choices and implications of the algorithms we use every day.

Peel back the algorithms, and this becomes quite apparent. Yes, a casual visit to Twitter’s home page may present Trends as an unproblematic list of terms that might appear to be a simple calculation. But in its explanations of how Trends works – in its policies and help pages, in its company blog, in tweets, in response to press queries, even in the comment threads of the censorship discussions – Twitter lays bare the variety of weighted factors Trends takes into account, and cops to the occasional and unfortunate consequences of these algorithms. Wikileaks may not have trended when people expected it to because it had before; because the discussion of #wikileaks grew too slowly and consistently over time to have spiked enough to draw the algorithm’s attention; because the bulk of messages were retweets; or because the users tweeting about Wikileaks were already densely interconnected. When Twitter changed their algorithm significantly in May 2010 (though, undoubtedly, it has been tweaked in less noticeable ways before and after), they announced the change in their blog, explained why it was made – and even apologized directly to Justin Bieber, whose position in the Trends list would be diminished by the change. In response to charges of censorship, they have explained why they believe Trends should privilege terms that spike, terms that exceed single clusters of interconnected users, new content over retweets, new terms over already trending ones. Critics gather anecdotal evidence and conduct thorough statistical analysis, using available online tools that track the raw popularity of words in a vastly more exhaustive and catholic way than Twitter does, or at least is willing to make available to its users.
The algorithms that define what is “trending” or what is “hot” or what is “most popular” are not simple measures; they are carefully designed to capture something the site providers want to capture, and to weed out the inevitable “mistakes” a simple calculation would make.

At the same time, Twitter most certainly does curate its Trends lists. It engages in traditional censorship: for example, a Twitter engineer acknowledges here that Trends excludes profanity, something that’s obvious from the relatively circuitous path that prurient attempts to push dirty words onto the Trends list must take. Twitter will remove tweets that constitute specific threats of violence, copyright or trademark violations, impersonation of others, revelations of others’ private information, or spam. (Twitter has even been criticized (1, 2) for not removing some terms from Trends, as in this user’s complaint that #reasonstobeatyourgirlfriend was permitted to appear.) Twitter also engages in softer forms of governance, by designing the algorithm so as to privilege some kinds of content and exclude others, and some users and not others. Twitter offers rules, guidelines, and suggestions for proper tweeting, in the hopes of gently moving users towards the kinds of topics that suit their site and away from the kinds of content that, were it to trend, might reflect badly on the site. For some of their rules for proper profile content, tweet content, and hashtag use, the punishment imposed on violators is that their tweets will not factor into search or Trends – thereby culling the Trends lists by culling what content is even in consideration for it. Twitter includes terms in its Trends from promotional partners, terms that were not spiking in popularity otherwise. This list, automatically calculated on the fly, is yet also the result of careful curation to decide what it should represent, what counts as “trend-ness.”

Ironically, terms like #wikileaks and #occupywallstreet are exactly the kinds of terms that, from a reasonable perspective, Twitter should want to show up as Trends. If we take the reasonable position that Twitter is benefiting from its role in the democratic uprisings of recent years, and that it is pitching itself as a vital tool for important political discussion, and that it wants to highlight terms that will support that vision and draw users to topics that strike them as relevant, #occupywallstreet seems to fit the bill. So despite carefully designing their algorithm away from the perennials of Bieber and the weeds of common language, it still cannot always successfully pluck out the vital public discussion it might want. In this, Twitter is in agreement with its critics; perhaps #wikileaks should have trended after the diplomatic cables were released. These algorithms are not perfect; they are still cudgels, where one might want scalpels. The Trends list can often look, in fact, like a study in insignificance. Not only are the interests of a few often precisely irrelevant to the rest of us, but much of what we talk about on Twitter every day is in fact quite everyday, despite the most heroic claims of political import. But many Twitter users take it to be not just a measure of visibility but a means of visibility – whether or not the appearance of a term or #hashtag increases audience, which is not in fact clear. Trends offers to propel a topic towards greater attention, and offers proof of the attention already being paid. Or seems to.

Of course, Twitter has in its hands the biggest resource by which to improve their tool, a massive and interested user base. One could imagine “crowdsourcing” this problem, asking users to rate the quality of the Trends lists, and assessing these responses over time and across a huge number of data points. But they face a dilemma: revealing the workings of their algorithm, even enough to respond to charges of censorship and manipulation, much less to share the task of improving it, risks helping those who would game the system. Everyone from spammers to political activists to 4chan tricksters to narcissists might want to “optimize” their tweets and hashtags so as to show up in the Trends. So the mechanism underneath this tool, that is meant to present a (quasi) democratic assessment of what the public finds important right now, cannot reveal its own “secret sauce.”

Which in some ways leaves us, and Twitter, in an unresolvable quandary. The algorithmic gloss of our aggregate social data practices can always be read/misread as censorship, if the results do not match what someone expects. If #occupywallstreet is not trending, does that mean (a) it is being purposefully censored? (b) it is very popular but consistently so, not a spike? (c) it is actually less popular than one might think? Broad scrapes of huge data, like Twitter Trends, are in some ways meant to show us what we know to be true, and to show us what we are unable to perceive as true because of our limited scope. And we can never really tell which it is showing us, or failing to show us. We remain trapped in an algorithmic regress, and not even Twitter can help, as it can’t risk revealing the criteria it uses.

But what is most important here is not the consequences of algorithms, it is our emerging and powerful faith in them. Trends measures “trends,” a phenomenon Twitter gets to define and build into its algorithm. But we are invited to treat Trends as a reasonable measure of popularity and importance, a “trend” in our understanding of the term. And we want it to be so. We want Trends to be an impartial arbiter of what’s relevant… and we want our pet topic, the one it seems certain that “everyone” is (or should be) talking about, to be duly noted by this objective measure specifically designed to do so. We want Twitter to be “right” about what is important… and sometimes we kinda want them to be wrong, deliberately wrong – because that will also fit our worldview: that when the facts are misrepresented, it’s because someone did so deliberately, not because facts are in many ways the product of how they’re manufactured.

We don’t have a sufficient vocabulary for assessing the algorithmic intervention of a tool like Trends. We’re not good at comprehending the complexity required to make a tool like Trends – one that seems to effortlessly identify what’s going on, that isn’t swamped by the mundane or the irrelevant. We don’t have a language for the unexpected associations algorithms make, beyond the intention (or even comprehension) of their designers. We don’t have a clear sense of how to talk about the politics of this algorithm. If Trends, as designed, does leave #occupywallstreet off the list, even when its use is surging and even when some people think it should be there: is that the algorithm correctly assessing what is happening? Is it looking for the wrong things? Has it been turned from its proper ends by interested parties? Too often, maybe in nearly every instance in which we use these platforms, we fail to ask these questions. We equate the “hot” list with our understanding of what is popular, the “trends” list with what matters. Most importantly, we may be unwilling or unable to recognize our growing dependence on these algorithmic tools, as our means of navigating the huge corpuses of data that we must, because we want so badly for these tools to perform a simple, neutral calculus, without blurry edges, without human intervention, without having to be tweaked to get it “right,” without being shaped by the interests of their providers.

Why the Occupy Movements Do Not Lack Leadership

Despite the (not undeserved) hype about the role of social media in the various Occupy movements, I first heard about Occupy Wall Street from a traditional face-to-face encounter with my roommate. Bryce gave me the basics (Adbusters-instigated, Twitter-facilitated protest in Zuccotti Park) and suggested we check it out. If I’m honest, my first encounter with OWS left me somewhere between nonplussed and wryly amused. I was thrilled to see that they had a library* and impressed by the rigged shower system and seeming willingness of people to pick up trash and distribute food. On the other hand, the (frequently photographed) collection of hand-painted signs illustrated the by-now oft-critiqued claim that OWS lacked a coherent ideological message. I returned a week or so later to participate in the student-led march from Washington Square to Zuccotti Park and was blown away by the number of marchers, and found myself not caring about the lack of a centralized cause, precisely because it enabled different groups to coalesce around peaceable unrest. I’ve been in Seattle this week for AoIR, and wandered around the much smaller but equally vibrant Occupy Seattle, where I pitched in at their budding library and went to some general meetings.
At AoIR, the protests were a frequent topic of conversation, both at panels and during informal conversation. Repeatedly, I heard the movement referred to as leaderless. In thinking about what I’d seen at OWS and Occupy Seattle, I couldn’t help feeling that this was a conceptual misstep. The Occupy movements are in fact shot through with (and perhaps only functioning because of) an abundance of micro-leadership. Rather than being leaderless, the movement is in fact leader-ful. Spending even a little time at the protests, it’s easy to see the presence of people who are contributing everyday acts of leadership within a bounded sphere of activism. This is perhaps part of what is so confounding about the movements for political analysis. It isn’t really ideological incoherence that is so startling here (think of the ideological complexity – if not hypocrisy – of the Democratic Party in the United States); it’s the lack of a central figure to serve as a speaker, a focal point or mouthpiece. My claim that the movements are leaderful shouldn’t be taken to mean that there is an overabundance of leaders such that more people shouldn’t mobilize and offer leadership skills, as these things are very much needed. But for me at least, thinking of OWS as a leaderful movement is both exciting and somewhat explanatory of the resurfacing anxiety over what the movement is, how to deal with it or explain it. For advocates and supporters, it’s exciting in its democracy. For opponents and critics, it’s anxiety-ridden in the lack of a clear counterpart with which to parley, a contained discourse to critique. And in its own right, that’s exciting too.

*Plug: I’m giving a talk at Mobility Shifts this weekend on the OWS People’s Library.  Come learn about local interventions of librarianship and DIY archives!

Using Off-the-Shelf Software for Basic Twitter Analysis

Mary Gray, Mike Ananny and I are writing a paper on queer youth and “Glee” for the American Anthropological Association’s annual meeting (yes, I have the greatest job in the world). This is a multi-methodological study by design, because traditional television viewing practices have become so complex. Besides traditional audience ethnography like interviews and participant observation, we are using textual analysis to analyze episode themes, and collected a large corpus of tweets with Glee-related hashtags. This summer, I worked with my high school intern, Jazmin Gonzales-Rivero, to go through this corpus of tweets and pull out useful information for the paper.

We’ve written and published a basic report on using off-the-shelf tools to see patterns and themes in a large Twitter data set quickly and easily.

With the increasing popularity of large social software applications like Facebook and Twitter, social scientists and computer scientists have begun developing innovative approaches to dealing with the vast amounts of data produced and collected in such environments. For qualitative researchers, the methods involved can be daunting and unfamiliar. In this report, we outline some basic procedures for working with a large-scale Twitter data set to answer qualitative inquiries. We use Python, MySQL, and the word-cloud generator Wordle to identify patterns in re-tweets, tweet authors, dates and times of tweets, frequency of hashtags, and frequency of word use. Such data can provide valuable augmentation to qualitative inquiry. This paper is aimed at social scientists and humanities scholars with limited experience with big data and a lack of computing resources to do extensive quantitative research.

Marwick, A. and Gonzales-Rivero, J. (2011). Learning to Work with Large-Scale Twitter Data Sets: Using Off-The-Shelf Tools to Quickly and Easily See Tweet Patterns. Microsoft Research Social Media Collective Report, MSR-SMC-11-01, Cambridge, MA. [Download as PDF]

If you’re a seasoned computer scientist or a Big Data aficionado, the information in this paper will seem quite simplistic. But for those of us without programming backgrounds who study Twitter or other forms of social media, the idea of tackling a set of 450,000 tweets can seem quite daunting. In this paper, Jazmin and I walk step-by-step through the methods she used to parse a set of tweets, using free and easily accessible tools like MySQL, Python, and Wordle. We hope this will be helpful for other legal, humanities, and social science scholars who might want to dip a toe into Big Data to augment more qualitative research findings.
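To give a flavor of the kind of first-pass summary the report walks through, here is a minimal sketch using only the Python standard library (the report itself pairs Python with MySQL and Wordle; the function name, stop-word list, and sample tweets below are invented for illustration, not taken from the report):

```python
import re
from collections import Counter

# A tiny stop-word list for illustration; a real analysis would use a fuller one.
STOP_WORDS = {"rt", "the", "a", "an", "to", "and", "of", "is", "was", "so"}

def summarize_tweets(tweets):
    """Return simple counts over a list of raw tweet strings:
    total tweets, retweets, top hashtags, and top words."""
    # Old-style retweets begin with "RT @username".
    retweets = sum(1 for t in tweets if t.lower().startswith("rt @"))
    # Tally hashtags, case-folded so #Glee and #glee count together.
    hashtags = Counter(tag.lower()
                       for t in tweets
                       for tag in re.findall(r"#\w+", t))
    # Tally remaining words for a Wordle-style frequency list.
    words = Counter(w
                    for t in tweets
                    for w in re.findall(r"[a-z']+", t.lower())
                    if w not in STOP_WORDS)
    return {"total": len(tweets),
            "retweets": retweets,
            "top_hashtags": hashtags.most_common(5),
            "top_words": words.most_common(10)}
```

From output like this, the top hashtags and words can be fed into a word-cloud generator such as Wordle, and the retweet count gives a quick sense of how much of a corpus is amplification rather than original commentary.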