How Do Users Take Collective Action Against Online Platforms? CHI Honorable Mention

What factors lead users in an online platform to join together in mass collective action to influence those who run the platform? Today, I’m excited to share that my CHI paper on the reddit blackout has received a Best Paper Honorable Mention! (Read the pre-print version of my paper here)

When users of online platforms complain, we’re often told to leave if we don’t like how a platform is run. Beyond exit or loyalty, digital citizens sometimes take a third option, organizing to pressure companies for change. But how does that come about?

I’m seeking reddit moderators to collaborate on the next stage of my research: running experiments together with subreddits to test theories of moderation. If you’re interested, you can read more here. Also, I’m presenting this work as part of larger talks at the Berkman Center on Feb 23 and the Oxford Internet Institute on March 16. I would love to see you there!

Having a formalized voice with online platforms is rare, though it has happened with San Francisco drag queens, the newly-announced Twitter Trust and Safety Council or the EVE player council, where users are consulted about issues a platform faces. These efforts typically keep users in positions of minimal power on the ladder of citizen participation, but they do give some users some kind of voice.

Another option is collective action, leveraging the collective power of users to pressure a platform to change how that platform works. To my knowledge, this has only happened four times on major U.S. platforms: when AOL community leaders settled a $15 million class action lawsuit for unpaid wages, when DailyKos writers went on strike in 2008, the recent Uber class action lawsuit, and the reddit blackout of July 2015, when moderators of 2,278 subreddits shut down their communities to pressure the company for better coordination and better moderation tools. They succeeded.

What factors lead communities to participate in such a large scale collective action? That’s the question that my paper set out to answer, combining statistics with the “thick data” of qualitative research.

The story of how I answered this question is also a story about finding ways to do large-scale research that include the voices and critiques of the people whose lives we study as researchers. In the turmoil of the blackout, amidst volatile and harmful controversies around hate speech, harassment, censorship, and the blackout itself, I made special effort to do research that included redditors themselves.

Theories of Social Movement Mobilization

Social movement researchers have been asking how movements come together for many decades, and there are two common schools, responding to early work to quantify collective action (see Olson, Coleman):

Political Opportunity Theories argue that social movements need the right people and the right moment. According to these theories, a movement happens when grievances are high, when social structure among potential participants is right, and when the right opportunity for change arises. For more on political opportunity theory, see my Atlantic article on the Facebook Equality Meme this past summer.

Resource Mobilization Theories argue that successful movements are explained less by grievances and opportunities and more by the resources available to movement actors. In their view, collective action is something that groups create out of their resources rather than something that arises out of grievances. They’re also interested in social structure, often between groups that are trying to mobilize people (read more).

A third voice in these discussions belongs to the people who participate in movements themselves, voices that I wanted to have a primary role in shaping my research.

How Do You Study a Strike As It Unfolds?

I was lucky enough to be working with moderators and collecting data before the blackout happened. That gave me a special vantage for combining interviews and content analysis with statistical analysis of the reddit blackout.

Together with redditors, I developed an approach of “participatory hypothesis testing,” where I posed ideas for statistics on public reddit threads and worked together with redditors to come up with models that they agreed were a fair and accurate analysis of their experience. Grounding that statistical work involved a whole lot of qualitative research as well.

If you like that kind of thing, here are the details:

In the CHI paper, I analyzed 90 published interviews with moderators from before the blackout, over 250 articles outside reddit about the blackout, discussions in over 50 subreddits that declined to join the blackout, public statements by over 200 subreddits that joined the blackout, and over 150 discussions in blacked out subreddits after their communities were restored. I also read over 100 discussions in communities that chose not to join. Finally, I conducted 90-minute interviews with 13 moderators of subreddits of all sizes, including those that joined and those that declined to join the blackout.

To test hypotheses developed with redditors, I collected data from 52,735 non-corporate subreddits that received at least one comment in June 2015, alongside a list of blacked-out subreddits. I also collected data on moderators and comment participation for the period surrounding the blackout.

So What’s The Answer? What Factors Predict Participation in Action Against Platforms?

In the paper, I outline major explanations offered by moderators and translate them into a statistical model that corresponds to major social movement theories. I found evidence confirming many of redditors’ explanations across all subreddits, including aspects of classic social movement theories. These findings are as much about why people choose *not* to participate as they are about what factors are involved in joining:

    • Moderator Grievances were important predictors of participation. Subreddits with greater amounts of work, and whose work was more risky, were more likely to join the blackout.
    • Subreddit Resources were also important factors. Subreddits with more moderators were more likely to join the blackout. Although “default” subreddits played an important role in organizing and negotiating in the blackout, they were no more or less likely to participate, holding all else constant.
    • Relations Among Moderators were also important predictors, and I observed several cases where “networks” of closely-allied subreddits declined to participate.
    • Subreddit Isolation was also an important factor, with more isolated subreddits less likely to join, and moderators who participate in “metareddits” more likely to join.
    • Moderators’ Relations Within Their Groups were also important; subreddits whose moderators participated more in their groups were less likely to join the blackout.

Many of my findings go into details from my interviews and observations, well beyond just a single statistical model; I encourage you to read the pre-print version of my paper.

What’s Next For My reddit Research?

The reddit blackout took me by surprise as much as anyone, so now I’m back to asking the questions that brought me to moderators in the first place:

THANK YOU REDDIT! & Acknowledgments


First of all, THANK YOU REDDIT! This research would not have been possible without generous contributions from hundreds of reddit users. You have been generous all throughout, and I deeply appreciate the time you invested in my work.

Many other people have made this work possible; I did this research during a wonderful summer internship at the Microsoft Research Social Media Collective, mentored by Tarleton Gillespie and Mary Gray. Mako Hill introduced me to social movement theory as part of my general exams. Molly Sauter, Aaron Shaw, Alex Leavitt, and Katherine Lo offered helpful early feedback on this paper. My advisor Ethan Zuckerman remains a profoundly important mentor and guide through the world of research and social action.

Finally, I am deeply grateful for family members who let me ruin our Fourth of July weekend to follow the reddit blackout closely and set up data collection for this paper. I was literally sitting at an isolated picnic table ignoring everyone and archiving data as the weekend unfolded. I’m glad we were able to take the next weekend off! ❤

Followup: 10 Factors Predicting Participation in the Reddit Blackout. Building Statistical Models of Online Behavior through Qualitative Research

Three weeks ago, I shared dataviz and statistical models predicting participation in the Reddit Blackout in July 2015. Since then, many moderators have offered feedback and new ideas for the data analysis, alongside their own stories. Earlier today, I shared this update with redditors.

UPDATE, Sept 16, 9pm ET: Redditors brilliantly spotted an important gap in my dataset and worked with me to resolve it. After taking the post down for two days, I am posting the corrected results. Thanks to their quick work, the graphics and findings in this post are more robust.


This July, moderators of 2,278 subreddits joined a “blackout,” demanding better communication and improved moderator tools. As part of my wider research on the work and position of moderators in online communities, I have also been asking the question: who joined the July blackout, and what made some moderators and subs more likely to participate?

Reddit Moderator Network July 2015, including NSFW Subs, with Networks labeled

Academic research on the work of moderators would expect that the most important predictor of blackout participation would be the workload, which creates common needs across subs. Aaron Shaw and Benjamin Mako Hill argue, based on evidence from Wikia, that as the work of moderating becomes more complex within a community, moderators grow in their own sense of common identity and common needs as distinct from their community (read Shaw and Hill’s Wikia paper here). Postigo argues something similar in terms of moderators’ relationship to a platform: when moderators feel like they’re doing huge amounts of work for a company that’s not treating them well, they can develop common interests and push back (read my summary of Postigo’s AOL paper here).

Testing Redditors’ Explanations of The Blackout

After I posted an initial data analysis to reddit three weeks ago, dozens of moderators generously contacted me with comments and offers to let me interview them. In this post, I test hypotheses straight from redditors’ explanations of what led different subreddits to join the blackout. By putting all of these hypotheses into one model, we can see how important they were across reddit, beyond any single sub. (see my previous post) (learn more about my research ethics and my promises to redditors)

TLDR:

  • Subs who shared mods with other blackout subs were more likely to join the blackout, but controlling for that:
  • Default subs were more likely to join the blackout
  • NSFW subs were more likely to join the blackout
  • Subs with more moderators were slightly more likely to join the blackout
  • More active subs were more likely to join the blackout
  • More isolated subs were less likely to join the blackout
  • Subs whose mods participate in metareddits were more likely to join the blackout
  • Subs whose mods get and give help in moderator-specific subs were no more or less likely to join the blackout

In my research I have read over a thousand reddit threads, interviewed over a dozen moderators, archived discussions in hundreds of subreddits, and collected data from the reddit API, starting before the blackout. Special thanks to everyone who has spoken with me and shared data.

Improving the Blackout Dataset With Comment Data

Based on conversations with redditors, I collected more data:

  • Instead of the top 20,000 subreddits by subscribers, I now focus on the top subreddits by number of comments in June 2015, thanks to a comment dataset collected by /u/Stuck_In_the_Matrix
  • I updated my /u/GoldenSights amageddon dataset to include 400 additional subs, after feedback from redditors on /r/TheoryOfReddit
  • I include “NSFW” subreddits intended for people over 18
  • I account for more bots thanks to redditor feedback
  • I account for changes in subreddit leadership (with some gaps for subreddits that have experienced substantial leadership changes since July)

In this dataset, half of the 10 most active subs joined the blackout, 24% of the 100 most active, 14.2% of the 1,000 most active, and 4.7% of the 20,000 most active subreddits.

To illustrate the data, here are two charts of the top 52,754 most active subreddits as they would have stood at the end of June. The font size and node size are related to the log-transformed number of comments from June. Ties between subreddits represent shared moderators. The charts are laid out using the ForceAtlas2 layout on Gephi, which has separated out some of the more prominent subreddit networks, including the ImaginaryNetwork, the “SFW Porn” Network, and several NSFW networks (I’ve circled notable networks in the network graph at the top of this post).
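As a rough illustration of how ties based on shared moderators can be derived, here is a minimal sketch that turns a mapping of subreddits to moderators into weighted edges. The subreddit and moderator names are made up for the example, and this is not the code used to produce the charts.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: subreddit -> set of moderator usernames (bots already removed).
mods_by_sub = {
    "sub_a": {"alice", "bob"},
    "sub_b": {"bob", "carol"},
    "sub_c": {"carol", "bob"},
    "sub_d": {"dave"},
}

def shared_moderator_edges(mods_by_sub):
    """Return {(sub1, sub2): number of shared moderators} for every linked pair."""
    # Invert the mapping: moderator -> set of subs they moderate.
    subs_by_mod = defaultdict(set)
    for sub, mods in mods_by_sub.items():
        for mod in mods:
            subs_by_mod[mod].add(sub)

    # Every pair of subs moderated by the same person gains one unit of weight.
    edges = defaultdict(int)
    for subs in subs_by_mod.values():
        for pair in combinations(sorted(subs), 2):
            edges[pair] += 1
    return dict(edges)

edges = shared_moderator_edges(mods_by_sub)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

An edge list in this shape could then be exported for a layout tool such as Gephi. Subs with no shared moderators (here the hypothetical `sub_d`) simply produce no edges.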

Reddit Blackout July 2015: Top 20,000 Subreddits by comments

Redditors’ Explanations Of Blackout Participation

With 2,278 subreddits joining the blackout, redditors have many theories for what experiences and factors led subs to join the blackout. In the following section, I share these theories and then test one big logistic regression model that accounts for all of the theories together. In these tests, I consider 52,745 subreddits that had at least one comment in June 2015. A total of 1,342 of these subreddits joined the blackout.

The idea of blacking out had come up before. According to one moderator, blacking out was first discussed by moderators three years ago as a way to protest Gawker’s choice to publish details unmasking a reddit moderator. Although some subs banned Gawker URLs from being posted to their communities, the blackout didn’t take off. While some individual subreddits have blacked out in the intervening years, this was the first time that many subs joined together.

I tested these hypotheses with the set of (Firth) logistic regression models shown below. The final model (on the right) offers the best fit of all the models, with a McFadden R² of 0.123, which is pretty good.

PREDICTING PARTICIPATION IN THE REDDIT BLACKOUT JULY 2015
Preliminary logistic regression results, J. Nathan Matias, Microsoft Research
Published on September 14, 2015
More info about this research: bit.ly/1V7c9i4
Contact: /u/natematias

N = top 52,745 subreddits in terms of June 2015 comments, including NSFW, for subreddits still available on July 2
Comment dataset: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/
List of subreddits "going private": https://www.reddit.com/r/GoldTesting/wiki/amageddon 
Moderator network queried in June 2015, with gap filling in July 2015 and September 2015

==================================================================================================================
                                                                  Dependent variable:                             
                                      ----------------------------------------------------------------------------
                                                                        blackout                                  
                                         (1)        (2)        (3)        (4)        (5)        (6)        (7)    
------------------------------------------------------------------------------------------------------------------
default sub                             3.161***   1.065***   1.070***   0.814**    0.720**    0.693**    0.705**  
                                       (0.294)    (0.305)    (0.317)    (0.336)    (0.337)    (0.337)    (0.339)  
                                                                                                                  
NSFW sub                                0.179*     0.235**    0.268***   0.291***   0.288***   0.314***   0.313*** 
                                       (0.098)    (0.099)    (0.099)    (0.101)    (0.101)    (0.102)    (0.102)  
                                                                                                                  
log(comments in june 2015)                         0.263***   0.268***   0.246***   0.258***   0.256***   0.257*** 
                                                  (0.009)    (0.010)    (0.011)    (0.011)    (0.011)    (0.011)  
                                                                                                                  
moderator count                                               0.066***   0.055***   0.053***   0.051***   0.051*** 
                                                             (0.007)    (0.008)    (0.008)    (0.008)    (0.008)  
                                                                                                                  
log(comments):moderator count                                -0.006***  -0.005***  -0.005***  -0.004***  -0.004*** 
                                                             (0.001)    (0.001)    (0.001)    (0.001)    (0.001)  
                                                                                                                  
log(mod roles in other subs)                                            -0.293***  -0.328***  -0.334***  -0.332*** 
                                                                        (0.033)    (0.033)    (0.033)    (0.033)  
                                                                                                                  
log(mod roles in blackout subs)                                          2.163***   2.134***   2.134***   2.133*** 
                                                                        (0.096)    (0.096)    (0.096)    (0.096)  
                                                                                                                                                                                                                              
log(mod roles in other subs):log(mod roles in blackout subs)            -0.255***  -0.248***  -0.254***  -0.254*** 
                                                                        (0.017)    (0.017)    (0.017)    (0.017)  

log(sub isolation, by comments)                                                    -2.608***  -2.568***  -2.569*** 
                                                                                   (0.347)    (0.345)    (0.345)  
                                                                                                                  
log(metareddit participation per mod in june 2015)                                             0.100***   0.103*** 
                                                                                              (0.036)    (0.036)  
                                                                                                                  
log(mod-specific sub participation per mod in june 2015)                                                 -0.024  
                                                                                                         (0.063)  
                                                                                                                  
Constant                               -3.608***  -4.517***  -4.677***  -4.655***  -4.467***  -4.469***  -4.469*** 
                                       (0.028)    (0.050)    (0.054)    (0.058)    (0.060)    (0.060)    (0.060)  
                                                                                                                  
------------------------------------------------------------------------------------------------------------------
Observations                            52,745     52,745     52,745     52,745     52,745     52,745     52,745  
Log Likelihood                        -6,520.505 -6,171.874 -6,130.725 -5,861.099 -5,806.916 -5,803.188 -5,803.098
Akaike Inf. Crit.                     13,047.010 12,351.750 12,273.450 11,740.200 11,633.830 11,628.380 11,630.200
==================================================================================================================
Note:                                                                                  *p<0.1; **p<0.05; ***p<0.01
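For readers who want to check the last row of the table, here is a small sketch that recomputes the Akaike Information Criterion from each model's log likelihood. It assumes the standard definition AIC = 2k - 2·logL, with k counted as the number of coefficient rows (including the constant) that appear in each column. The parameter counts are my reading of the table, not something stated in the post.

```python
# (log likelihood, number of estimated coefficients incl. constant), read off the table.
models = {
    1: (-6520.505, 3),
    2: (-6171.874, 4),
    3: (-6130.725, 6),
    4: (-5861.099, 9),
    5: (-5806.916, 10),
    6: (-5803.188, 11),
    7: (-5803.098, 12),
}

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

aics = {m: aic(ll, k) for m, (ll, k) in models.items()}
lowest_aic_model = min(aics, key=aics.get)

for m in sorted(aics):
    print(f"model ({m}): AIC = {aics[m]:,.2f}")
print("lowest AIC: model", lowest_aic_model)
```

Under these assumptions the recomputed values agree with the table's "Akaike Inf. Crit." row. Model (7) has the highest log likelihood, while model (6) has a marginally lower AIC, which fits with the last predictor added in model (7) not being statistically significant.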

 

The network of moderators who moderate blackout subs is the strongest predictor in this model. At a basic level, it makes sense that moderators who participated in the blackout in one subreddit might participate in another. Making sense of this kind of network relationship is a hard problem in network science, and this model doesn’t include time as a dimension, so we don’t consider which subs went dark before which others. If I had data on the time that subreddits went dark, it might be possible to better research this interesting question, like Bogdan State and Lada Adamic did with their paper on the Facebook equality meme.

Hypothesis 1: Default subs were more likely to join the blackout

In interviews, some moderators pointed out that “most of the conversation about the blackout first took place in the default mod irc channel.” Moderators of top subs described the blackout as mostly an issue concerning default or top subreddits.

This hypothesis is supported in the final model. For example, while a non-default subreddit with 4 million monthly comments had a 32.9% chance of joining the blackout (holding all else at their means), a default subreddit of the same size had a 48.6% chance of joining the blackout, on average in the population of subs.

Hypothesis 2: Subs with more comment activity were more likely to join the blackout

Moderators of large, non-default subreddits also had plenty of reasons to join the blackout, either because they also shared the need for better moderating tools, or because they had more common contact and sympathy with other moderators as a group.

Even among subreddits that declined to join the blackout, many moderators described feeling obligated to make a decision one way or another. This surprised moderators of large subreddits, who saw it as an issue for larger groups. Size was a key issue in the hundreds of smaller groups that discussed the possibility, with many wondering if they had much in common with larger subs, or whether blacking out their smaller sub would make any kind of dent in reddit’s advertising revenue.

In the final model, larger subs were more likely to join the blackout, a logarithmic relationship that is mediated by the number of moderators. When we set everything else to its mean, we can observe how this looks for subs of different sizes. In the 50th percentile, subreddits with 6 comments per month had a 1.6% chance of joining the blackout, a number that adds up with so many small subs. In the 75th percentile, subs with 46 comments a month had a 2.5% chance of joining the blackout. Subs with 1,000 comments a month had a 5.4% chance of joining, while subs with 100,000 comments a month had a 15.8% chance of joining the blackout, on average in the population of subs, holding all else constant.

Hypothesis 3: NSFW subs were more likely to join the blackout

In interviews, some moderators said that they declined to join the blackout because they saw it as something associated with support for hate speech subreddits taken down by the company in June or other parts of reddit they preferred not to be associated with. Default moderators denied this flatly, describing the lengths they went to dissociate from hate speech communities and sentiment against then-CEO Ellen Pao. Nevertheless, many journalists drew this connection, and moderators were worried that they might become associated with those subs despite their efforts.

Another possibility is that NSFW subs have to do more work to maintain subs that offer high quality NSFW conversations without crossing lines set by reddit and the law. Perhaps NSFW subs just have more work, so they were more likely to see the need for better tools and support from reddit.

In the final model, NSFW subs were more likely to join the blackout than non-NSFW subs. For example, while a non-default, non-NSFW subreddit with 22,800 comments had an 11.4% chance of joining the blackout (holding all else at their means), an NSFW subreddit of the same size had a 15.3% chance of joining the blackout, on average in the population of subs. Among less popular subs, a non-NSFW sub with 1,000 comments per month had a 5.4% chance of joining the blackout, while an NSFW sub of the same size had a 7.5% chance of joining, holding all else constant, on average in the population of subs.
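To make the link between a logistic regression coefficient and these percentages more concrete, here is a minimal sketch that adds the "NSFW sub" coefficient from model (7) to a baseline probability on the log-odds scale. It is a simplification under my own assumptions: it ignores how the full model averages over the other predictors, so it should only approximate the figures quoted above.

```python
import math

def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Log-odds -> probability."""
    return 1 / (1 + math.exp(-x))

B_NSFW = 0.313  # "NSFW sub" coefficient, model (7) in the regression table

def shift_probability(baseline, coefficient):
    """Probability after adding `coefficient` to the baseline's log-odds."""
    return inv_logit(logit(baseline) + coefficient)

# Baselines are the non-NSFW chances quoted in the text.
for baseline in (0.054, 0.114):
    shifted = shift_probability(baseline, B_NSFW)
    print(f"non-NSFW {baseline:.1%} -> NSFW {shifted:.1%}")

# The same coefficient expressed as a multiplicative change in the odds.
odds_ratio = math.exp(B_NSFW)
print(f"odds ratio: {odds_ratio:.2f}")
```

The shifted values land within a fraction of a percentage point of the 7.5% and 15.3% reported above, which is about as close as this back-of-the-envelope approach can be expected to get.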

Hypothesis 4: More isolated subs were less likely to join the blackout

In the interviews I conducted, as well as the 90 or so interviews I read on /r/subredditoftheday, moderators often contrasted their communities with “the rest of reddit.” When I asked one moderator of a support-oriented subreddit about the blackout, they mentioned that “a lot of the users didn’t really identify with the rest of reddit.” Subscribers voted against the blackout, describing it as “a movement we didn’t identify with,” this moderator said.

To test hypotheses about more isolated subs, I parsed all comments in every public subreddit in June 2015, generating an “in/out” ratio. This ratio consists of the total comments within the sub divided by the total comments made elsewhere by the sub’s commenters. A subreddit whose users stayed in one sub would have a ratio above 1, while a subreddit whose users commented widely would have a ratio below 1. I tested other measures, such as the average of per-user in/out ratios, but the overall in/out ratio seems the best.
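The overall in/out ratio described above can be sketched in a few lines. The comment records, usernames, and subreddit names below are invented for illustration, and the handling of subs whose commenters never post elsewhere is my own assumption rather than something specified in the post.

```python
# Hypothetical comment log: one (username, subreddit) pair per comment.
comments = [
    ("alice", "knitting"), ("alice", "knitting"), ("alice", "news"),
    ("bob", "knitting"), ("bob", "news"), ("bob", "news"), ("bob", "pics"),
    ("carol", "news"),
]

def in_out_ratio(comments, sub):
    """Comments made inside `sub` divided by comments its commenters made elsewhere."""
    commenters = {user for user, s in comments if s == sub}
    inside = sum(1 for _, s in comments if s == sub)
    outside = sum(1 for user, s in comments if s != sub and user in commenters)
    if outside == 0:
        # Assumption: treat a sub whose commenters post nowhere else as infinitely isolated.
        return float("inf")
    return inside / outside

for sub in ("knitting", "news", "pics"):
    print(sub, in_out_ratio(comments, sub))
```

In this toy data, the commenters in "knitting" post more elsewhere than in the sub, so its ratio falls below 1, while "news" sits exactly at 1 because its commenters split their activity evenly.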

In the final model, more isolated subs were less likely to join the blackout, on a logarithmic scale. Most subreddits’ commenters participate actively elsewhere on reddit, at a mean in/out ratio of 0.24. That means that on average, a subreddit’s participants make about 4 times more comments outside a sub than within it. At this level, holding everything else at their means, a subreddit with 1,000 comments a month had a 4.0% chance of joining the blackout. A similarly-sized subreddit whose users made half of their comments within the sub (in/out ratio of 1.0) had just a 1% chance of joining the blackout. Very isolated subs whose users commented twice as much in-sub had a 0.3% chance of joining the blackout, on average in the population of subs, holding all else constant.

Hypothesis 5: Subs with more moderators were more likely to join the blackout

This one was my hypothesis, based on a variety of interview details. Subs with more moderators tend to have more complex arrangements for moderating and tend to encounter limitations in mod tools. Subs with more mods also have more people around, so their chances of spotting the blackout in time to participate were also probably higher. On the other hand, subs with more activity tend to have more moderators, so it’s important to control for the relationship between mod count and sub activity.

This hypothesis is weakly supported. In the final model, subs with more moderators were slightly more likely to join the blackout. The relationship is very small, and it is mediated by the number of comments. For a sub with 1,000 comments per month, with everything else at its average, a subreddit with 3 moderators (the average) had a 5.4% chance of joining the blackout. A subreddit with 8 moderators had a 6% chance of joining the blackout, on average in the population of subs.
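The direction and size of the moderator-count effect can be checked against the final model's coefficients. The sketch below combines the "moderator count" coefficient with its interaction with log(comments) from model (7). It assumes natural logarithms and uses the 5.4% figure for a 3-moderator sub as the baseline; both are my assumptions, since the post does not spell out the exact transformation.

```python
import math

# Coefficients from model (7) in the regression table.
B_MODS = 0.051          # moderator count
B_INTERACTION = -0.004  # log(comments):moderator count

def log_odds_per_moderator(comments):
    """Change in log-odds from one extra moderator at a given monthly comment count."""
    return B_MODS + B_INTERACTION * math.log(comments)

def probability_with_extra_mods(baseline, comments, extra_mods):
    """Shift a baseline probability by the effect of `extra_mods` additional moderators."""
    log_odds = math.log(baseline / (1 - baseline))
    log_odds += extra_mods * log_odds_per_moderator(comments)
    return 1 / (1 + math.exp(-log_odds))

# Going from 3 to 8 moderators in a sub with 1,000 comments per month.
p8 = probability_with_extra_mods(0.054, comments=1000, extra_mods=5)
print(f"8 moderators: {p8:.1%}")
```

Under these assumptions the sketch gives about 6% for eight moderators, in line with the figure above. Because the interaction term is negative, the per-moderator effect shrinks as comment volume grows, and with these rounded coefficients it would turn negative only for extremely active subs.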

Hypothesis 6: Subs with admins as mods were more (or less) likely to join the blackout

I heard several theories about admins. During the blackout, some redditors claimed that admins were preventing subs from going private. In interviews, moderators tended to voice the opposite opinion. They argued that subs with admin contact were joining the blackout in order to send a message to the company, urging it to pay more attention to employees who advocated for moderator interests. Moderators at smaller subs said, “we felt 100% independent from admin assistance so it really wasn’t our fight.”

None of my hypothesis tests showed any statistically significant relationship between current or past admin roles as moderators and participation in the blackout, either way. For that reason, I omit it from my final model.

Hypothesis 7: Subs with moderators who moderated other subs were more likely to join the blackout

I’ve been wondering if moderators with multiple mod roles elsewhere on reddit would be more likely to join the blackout, perhaps because they had greater “solidarity” with other subreddits, or because they were more likely to find out about the blackout.

In the final model, the reverse is supported. Subs that shared moderators with other subs were actually less likely to join the blackout, a relationship that is mediated by the number of moderators who also modded blackout subs. Holding blackout sub participation constant, a sub of 1,000 comments per month and 3 moderator roles shared with other subs had a 5.7% chance of joining the blackout, while a more connected sub with 6 shared moderator roles (in the 4th quantile) had a 4.2% chance of joining the blackout, on average in the population of subs, holding all else constant.

Hypothesis 8: Subreddits with mods who also moderate other blackout subs were more likely to join the blackout.

This hypothesis is also a carry-over from my previous analysis, where I found a statistically-significant relationship. Note that making sense of this kind of network relationship is a hard problem in network science, and that we can’t use this to test “influence.”

In the final model, subreddits whose mods had roles in other blackout subs were more likely to join the blackout, a relationship on a log scale that is mediated by the number of moderator roles shared with other subs more generally. 19% of subs in the sample share at least one moderator with a blackout sub, after removing moderator bots. A sub with 1,000 comments per month that didn’t have any overlapping moderators with blackout subs had a 3.2% chance of joining the blackout, while a sub with one overlapping moderator had an 11.1% chance to join, and a sub with 2 overlapping moderators had a 21.1% chance to join. A sub with 6 moderators overlapping with blackout subs had a 57.2% chance of joining the blackout.

I tend to see the network of co-moderation as a control variable. We can expect that moderators who joined the blackout would be likely to support it across the many subs they moderate. By accounting for this in the model, we get a clearer picture on the other factors that were important.

Hypothesis 9: Subs with moderators who participate in metareddits were more likely to join the blackout

In interviews, several moderators described learning about the blackout from “meta-reddits” which cover major events on the site, and which mostly stayed up during the blackout. Just like we might expect more isolated subs to stay out of the blackout, we might expect moderators who get involved in reddit-wide meta-discussion to join the blackout. I took my list of metareddits from this TheoryOfReddit wiki post.

In the final model, subs with moderators who participate in metareddits were more likely to join the blackout, on a logarithmic scale. Most moderators on the site do not participate in metareddits. A sub of 1,000 comments per month with no metareddit participation by its moderators had a 5.3% chance of joining the blackout, while a similar sub whose moderators made 5 comments on any metareddit per month had a 6.3% chance of joining the blackout.

Hypothesis 10: Subs with mods participating in moderator-focused subs were more likely to join the blackout

Although key moderator subs like /r/defaultmods and /r/modtalk are private and inaccessible to me, I could still test a “solidarity” theory. Perhaps moderators who participate in mod-specific subs, who have helped and been helped by other mods, would be more likely to join the blackout?

Although this predictor is significant in a single-covariate model, when you account for all of the other factors, mod participation in moderator-focused subs is not a significant predictor of participation in the blackout.

This surprises me. I wonder: since moderator-specific subs tend to have low volume, one month of comments may just not be enough to get a good sense of which moderators participate in those subs. Also, this dataset doesn’t include IRC discussions (nor will it ever), where moderators seem mostly to hang out with and help each other. But from the evidence I have, it looks like help from moderator-focused subs played no part in swaying moderators to join the blackout.

So, how DID solidarity develop in the blackout?

The question is still open, but from these statistical models, it seems clear that factors beyond moderator workload had a big role to play, even when controlling for mods of multiple subs that joined the blackout.

In further analysis in the next week, I’m hoping to include:

  • Activity by mods in each sub (comments, deletions)
  • Comment karma, as another measure of activity (still making sense of the numbers to see if they are useful here)
  • The complexity of the subreddit, as measured by things in the sidebar (possibly)

Building Statistical Models of Online Behavior through Qualitative Research

The process of collaborating with redditors on my statistical models has been wonderful. As I continue this work, I’m starting to think more and more about the idea of participatory hypothesis testing, in parallel with work we do at MIT around Freire-inflected practices of “popular data”. If you’ve seen other examples of this kind of thing, do send them my way!

Moderator Solidarity on Reddit: Predicting Participation in the Blackout of July 2015

For the last 40 years or more, online platforms have relied on people to facilitate and support our online communities. In the early 70s, they were the librarians and shopkeepers of Community Memory. In the 80s, they were the WELL’s “conference hosts.” In the 90s, they were AOL’s “community leaders.” In 2015, they are Wikipedia’s “administrators,” Facebook’s “admins,” Slashdot’s “moderators,” or XBOX’s “enforcement united.” And on platforms like Twitter without moderators, we find the need to invent them. These moderators are the founders, designers, promoters, facilitators, recruiters, legislators, responders, and enforcers of our online social interactions.

This summer, I’ve been doing qualitative research on ways that Reddit moderators develop common interests as they face the company, as they face their subscribers, and as they relate to other moderators. Just in the top 20,000 subreddits by subscribers, Reddit has 50,790 moderators. This July, moderators of 2,278 subreddits joined a “blackout,” demanding better communication and improved moderator tools. The blackout is one moment in the wider research I’m doing, a moment where tensions and common cause rose to the surface. Blacked-out subreddits constituted 60% of the top 10 subreddits, 29% of the top 100, and 5% of the top 20,000 subreddits, representing a total of 134.8 million combined subscriptions.

Since I can only get so far by reading Reddit threads, I’m now interviewing Reddit moderators to learn more about your experience as a moderator and your experience of the blackout. If you are interested to talk, please message me on Reddit at /u/natematias.

Work In Progress: Charting the Reddit Blackout of July 2015

Since I’m also a software engineer and quantitative researcher, I’ve been complementing my qualitative work with data analysis on what I was able to collect from the public API, combined with /u/GoldTesting’s dataset of blackout participation. Mostly, I’ve used that data to decide where to look and who to reach out to.

When moderators discussed the blackout with their subscribers, many debated the idea of “solidarity,” wondering if they were too small to have common cause with larger subs or if they were too small to make a difference. Others expressed strong opinions that joining the blackout meant standing with other moderators or standing for Reddit users as a whole.

The conversations I found led me to think about several hypotheses I could test statistically:

H1: Larger subreddits were more likely to join the blackout, maybe because their moderators were part of ModTalk, where much of the blackout was discussed, or because they felt a blackout would make a difference, or because they felt common cause with other mods of large subs.

H2: Subreddits with more moderators were more likely to join the blackout, perhaps because mods in these subs would have greater solidarity with others.

H3: Subreddits with mods who also moderate other subreddits that participated in the blackout were more likely to join the blackout

To illustrate the data used for my statistical tests, here are two network graphs of shared moderators between subreddits. The first graph includes the top 20,000 subreddits in terms of subscribers (as of mid-June 2015). The second graph includes only subreddits with more than 10,128 subscribers. In the network graphs, subreddits that did not black out are tinted blue, while yellow-tinted subreddits joined the blackout.

Reddit Blackout July 2015: Top 20,000 Subreddits

Reddit Blackout July 2015: Subreddits with >10,000 Subscribers

The charts are laid out using the ForceAtlas2 layout on Gephi, which has separated out some of the more prominent subreddit networks, including the ImaginaryNetwork, the “SFW Porn” Network, and toward the center, the ShitRedditSays “fempire”. These networks are notable because some of them made network-wide decisions about their participation in the blackout.
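
For readers curious how a shared-moderator network like this can be assembled, here is a minimal sketch. It is not the code behind the charts: the sample pairs are invented, and the function name is hypothetical. The idea is to link two subreddits whenever the same account moderates both, and to weight the link by the number of shared moderators.

```python
from collections import defaultdict
from itertools import combinations

# HYPOTHETICAL sample data: (subreddit, moderator) pairs, invented for illustration.
mod_roles = [
    ("sub_a", "mod_1"), ("sub_a", "mod_2"),
    ("sub_b", "mod_2"), ("sub_b", "mod_3"),
    ("sub_c", "mod_3"), ("sub_c", "mod_2"),
    ("sub_d", "mod_4"),
]


def shared_moderator_edges(roles):
    """Return {(sub_x, sub_y): number of shared moderators} for every linked pair."""
    subs_by_mod = defaultdict(set)
    for sub, mod in roles:
        subs_by_mod[mod].add(sub)

    edges = defaultdict(int)
    for subs in subs_by_mod.values():
        # Every pair of subs moderated by the same account gains one shared mod.
        for pair in combinations(sorted(subs), 2):
            edges[pair] += 1
    return dict(edges)


edges = shared_moderator_edges(mod_roles)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

A weighted edge list of this kind is the sort of input a layout tool can work from. Subreddits that share no moderators, like `sub_d` in the sample, simply have no edges.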

Using this dataset, I conducted a logistic regression testing the above hypotheses.

Predicting Participation in the Reddit Blackout, July 2015

H1: Larger subreddits were more likely to join the blackout. This hypothesis is supported. On average in the population of top 20k subreddits, there is a large positive relationship between the log-transformed subscriber count and a subreddit’s probability of joining the blackout, holding all else constant.

H2: Subreddits with more moderators were more likely to join the blackout. This hypothesis is supported, very very weakly. I wouldn’t make much of this.

H3: Subreddits with mods who also moderate other subreddits that participated in the blackout were more likely to join the blackout. This is supported. On average in the top 20,000 subreddits, there is a positive relationship between the log of moderator roles in other blackout subs and a subreddit’s probability of joining the blackout, a relationship that is mediated by the overall number of moderators shared with other subs, holding all else constant.

So, is there evidence of moderator “solidarity”? If we consider H1 to be a test of solidarity associated with similar subscriber numbers, and H2 to be a test of solidarity related to the number of moderators one works together with, then yes, we see support for the solidarity hypothesis. However, my qualitative research shows that many subreddits voted on this issue, indicating that subscribers also matter to this picture. Furthermore, many mods of smaller subs also expressed solidarity, even if smaller subs were less likely to participate. So more work needs to be done.

CAVEATS: This is just a preliminary statistical test. I have much more work to do before publication:

  • I need to define better hypotheses that can answer theoretically-meaningful questions
  • I need to do much more work to confirm the validity of my data collection, data processing, and models
  • I need better definitions of “solidarity”
  • This needs to be peer reviewed

In particular, I plan to spend more time with network scientists to understand the best way to set up my dataset for statistical analysis. There are many ways to project a complex network onto a single table for statistical tests, and I may need to try a different approach. Note also that this model does not include time as a factor, and that I use the term “predict” to refer to statistical inference rather than some ability to predict participation in the blackout before it occurred.
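
As one concrete picture of what projecting a network onto a single table can mean, the sketch below builds one row per subreddit with an outcome and the three covariates behind H1 to H3. It is an illustration under stated assumptions, not the pipeline used for these results: all input values are invented, and every name is hypothetical.

```python
import math

# HYPOTHETICAL inputs, invented for illustration.
subscribers = {"sub_a": 50000, "sub_b": 1200, "sub_c": 300}
moderators = {
    "sub_a": {"mod_1", "mod_2"},
    "sub_b": {"mod_2", "mod_3"},
    "sub_c": {"mod_4"},
}
blacked_out = {"sub_a"}  # subs that joined the blackout


def roles_in_other_blackout_subs(sub):
    """Count moderator roles that this sub's mods hold in *other* blackout subs."""
    count = 0
    for other in blacked_out:
        if other != sub:
            count += len(moderators[sub] & moderators[other])
    return count


def build_rows():
    """One row per subreddit: the outcome plus covariates for H1, H2, and H3."""
    rows = []
    for sub in sorted(subscribers):
        rows.append({
            "subreddit": sub,
            "joined_blackout": int(sub in blacked_out),
            "log_subscribers": math.log(subscribers[sub]),
            "moderator_count": len(moderators[sub]),
            "blackout_roles_elsewhere": roles_in_other_blackout_subs(sub),
        })
    return rows


rows = build_rows()
for row in rows:
    print(row)
```

Excluding a subreddit’s own blackout status from its network covariate is a deliberate choice here, so that the outcome does not leak into the predictor. Other projections are possible and may be better.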

I’m sharing these preliminary results because I hope they’ll attract interest from Reddit moderators, and hopefully lead me to more interviews and data while I still have time to talk to people and enrich my understanding of what happened. If you are a Reddit moderator and want to talk with me, please message me at /u/natematias.

What Just Happened on Reddit? Understanding The Moderator Blackout

Last Thursday and Friday, moderators of many of Reddit’s most popular discussion groups “blacked out” their subreddits, preventing access to parts of the site by Reddit subscribers and cutting off some of the company’s advertising revenue for half a day. What may not have started as a protest quickly became one, with many moderators complaining that the company needed to offer better communication and better tools to its volunteer moderators. Reddit’s management responded within hours, apparently after substantive negotiations with moderators, and promised to meet those demands.

This story was covered widely in the press last weekend, with the MediaCloud project tracking 92 articles in the mainstream media and 51 in its “tech blogs” dataset.

As a PhD candidate spending my summer researching the work of moderators on Reddit, I’ve been asked repeatedly by journalists to share my results. I’ve resisted commenting, because we often want easy answers in the heat of the moment: will Reddit survive, what do I think about Reddit CEO Ellen Pao, are moderators exploited labor, a “product being sold” to advertisers? In my research this summer, I’m trying to go beyond these important, near-term questions to understand the work that Reddit moderators do and how they see it.

Although it’s too early to share my results, I *can* share some of what I’ve found. I hope this post is useful to journalists writing about the Reddit blackout, and I hope that Reddit moderators read this too, so you can tell me what I am getting right, what I’m misunderstanding, and what conversations I’m missing.

  1. Why Does This Matter?
  2. What Is a Subreddit?
  3. What Do Reddit Moderators Do?
  4. How Do You Become a Moderator on Reddit?
  5. How Many Moderators Are There?
  6. What Does It Mean to “Go Dark,” “Go Private,” or “Black Out” and is This A New Thing?
  7. How Did Moderators Decide to Take Subreddits Private?
  8. What Were the Consequences of Taking Subreddits Private?
  9. Who’s In The Majority? What Do “Reddit Users” Think?
  10. Final Thoughts and Next Steps

How I’m Doing My Research

In this post, I avoid linking or directly mentioning specific Reddit users or subreddits for research ethics reasons. They’ve had a hard enough week without me sending more attention their way. Read more about my methods, ethics, and promises to Reddit users here.

Even before this weekend’s controversy, I had analyzed 50 interviews with groups of subreddit moderators, constructed a timeline of the history of the idea of subreddits and the role of moderators, followed hundreds of job board postings where moderators apply and accept moderating roles, and watched videos about the work of moderators. I also collected summary statistics from the Reddit API to understand how many moderators there are. Finally, I have personal experience facilitating and moderating a high profile online community, The Atlantic’s Twitter book club, which I moderated from 2012-2014.

Since the blackout started, I have spent most of my waking hours archiving and reading material about the controversy, including:

  • Over 500 links that appeared in the Reddit Live feed on the blackout, a feed maintained by Reddit users
  • Data on which subreddits went private, and by implication, which did not
  • Over a hundred messages stating why “this subreddit is private” during the blackout
  • Hundreds of discussions in subreddits debating if they should go private
  • Around 50 discussions in subreddits that chose not to go private (I’m still adding to this)
  • “We’re back” discussions, where moderators justified, defended, or apologized for the decision to go private
  • Notable discussions in “meta-subreddits” where users from across the site reflected and responded to the issue
  • Historical records of other times that moderators made subreddits private
  • Limitation: I do not currently have access to the private subreddits where moderators of top subs discussed their decisions and goals, nor the conversations between mods and the company. Even though I can offer assurances of privacy, anonymity, and security in my archival of those conversations, moderators of the largest subreddits have not at this point trusted me to participate, a choice that I can understand.
  • Limitation: I haven’t archived conversations off-reddit where people claiming to be moderators have discussed the issue, because I have no way of confirming that they are actually moderators. The one exception is journalistic interviews or op-eds that name the moderators.

It’s too soon for me to draw conclusions from such a wide-ranging dataset, but I mention it in case there are important conversations I’m missing. If you’re a reddit moderator, or if you mod a sub where moderators discuss these issues, please contact me at u/natematias.

1. Why Does This Matter?

Reddit is one of the world’s most popular social/content platforms, with roughly half of Twitter’s monthly active visitors and 2/5 the monthly unique visitors of Wikipedia. YouTube, Facebook, and many news sites review content through flagging systems, with large numbers of paid staff reviewing content. For example, the Huffington Post pre-moderates 450,000 comments per day, paying between $0.005 and $0.25 for every comment. When The Verge recently turned off comments, worried that “sometimes it gets too intense,” they may also be saving a *lot* of money. Reddit mostly relies on its volunteer moderators to support and maintain conversations on the site, and the company has traditionally offered them substantial autonomy in return.

My research on the Reddit moderators isn’t just about Reddit. Anyone who cares about a fair, free, and meaningful social web should be paying close attention to sites like Reddit, Meetup, Craigslist, and Wikipedia that rely mostly on user initiative. If volunteer moderators, upvoting systems, and other community-driven approaches to supporting large-scale collective projects ultimately fail (and there are many ways to fail), it will be hard to justify anything but a fully-commercial web. At the same time, platforms are also creating new categories of work that defy the boundaries and expectations of mid-20th century labor, new categories that also create new problems.

Whatever the wider issues, the blackout also matters deeply to millions of subreddit moderators and subscribers. Content, conversations and relationships on Reddit are fully a part of many people’s lives. In addition to books, jokes, porn, deals, advice, inspiration, debates, and news, people also sometimes go to Reddit to ask for feedback on intimate questions they would never dare ask anywhere else, including help with thoughts of suicide or responses to their religious and political doubts. Sometimes, when pseudonymity is not enough, users create throwaway accounts to ask especially sensitive questions.

Moderators on Reddit have a great responsibility of care for those who participate in their groups. They also have a great deal of pressure and scrutiny from their subscribers. When discussing the decision to go private, many moderators described the difficulty of weighing the cost that this choice entailed. I hope I do their work justice in this post.

2. What Is a Subreddit?

Subreddits, conversation groups on Reddit, are often compared to forums, mailing lists, and earlier bulletin board systems. Contributions can usually be up or down voted, and are then algorithmically sorted on each subreddit’s front page. Unlike earlier discussion platforms, users on Reddit can move between public subreddits without having to create new user accounts, and contributions will sometimes surface on other parts of the site based on how popular they are.

Each subreddit has its own volunteer moderation team, who have wide ranging influence over the visual style, rules, and operation of that subreddit. Importantly, many of the popular subreddits are configured so that moderators don’t pre-approve participants; instead, they tend to take a reactive approach to behavior on their “subs.”

The ease of finding, joining, and participating in a new subreddit might be one reason that many users talk about “Reddit” culture. Many moderators describe their own communities as nicer, more welcoming and supportive than the “rest of Reddit.” This impression is at least partly shaped by the flow of newcomers who arrive when a sub becomes momentarily prominent due to highly upvoted content, a special event like a live Q&A (called AMAs), or “drama” among subscribers.

The commingling and collision of different conversations on Reddit is similar to what danah boyd came to call “context collapse” in her early 2000s research on Friendster. On Friendster, boyd observed Burning Man attendees, gay men and geeks responding to the discovery that they were conversing on the same platform. Reddit is designed to facilitate context collapse at speed and scale, supported by popularity algorithms that tend to draw attention to upvoted content and “drama” alike.

Reddit’s algorithms were the reason Reddit created the very first subreddit in January of 2006, its “NSFW” section. Trying to use popularity and voting systems to curate the “Front Page of the Internet,” Reddit’s creators noticed that porn and other complicated material was being promoted to the top of the page. By creating an “NSFW” section (the name “subreddit” came a month later) and excluding it from the front page, the company could decide which conversations to promote without interfering with the autonomy of user voting.

Over the next two years, the company started dozens of new subreddits, mostly to separate conversations happening in different languages. Then in Jan 2008, a year and a half after its acquisition by Condé Nast, and 10 months after introducing ads, the company launched “user-controlled subreddits.” Before then, users could join official company subreddits, reporting spam and abuse directly to the company. Now they could create their own public and private subreddits, taking action themselves to “remove posts and ban users.” Although subreddits have evolved since then, the basic structure has remained much the same.

Subscribing to a subreddit does not always imply an idea of “membership” in a “community.” Many users treat subreddits as newsfeeds. The default view for logged-in users uses a news feed algorithm to create “your front page” from “hot” posts across all of your subscriptions. As with the Facebook newsfeed, users subscribing to subreddits this way will only see a few of the most prominent posts.

3. What Do Reddit Moderators Do?

Recent press coverage has focused on the work of moderators to filter the content and conversations that are posted to the site. Moderator teams do much more. They are:

  • founders, entrepreneurially creating new subreddits and growing their subscriber base.
  • designers, creating unique styles for their subreddits, designing ads to attract other users to their sub, writing copy for the sub’s public-facing materials as well as its wiki. Moderators also design and customize the bots that help them do their work and participate in the sub’s conversations.
  • facilitators, maintaining the structure of conversation on their sub, whether through AMAs, weekly discussions, contests, or votes. Moderators also participate in discussions.
  • recruiters and promoters, promoting the subreddit to subscribers, soliciting contributions, and recruiting other moderators.
  • legislators and judges, discussing and defining the rules on the subreddit’s sidebar and wiki, as well as taking actions to enforce what they think the conversation ought to be.
  • responders, taking actions to respond to internal “drama” and external sources of influence, which may be welcome or unwelcome.

Much of this work is made possible through special features that Reddit makes available to moderators, alongside custom software that non-employees have created, from bots to browser plugins.

Moderators are not the only people to do this work. Subscribers are often very active in these areas too, as Brian Butler observed of mailing lists in the late 90s. Moderators’ actual behavior is also not always so neatly defined or benevolent as this list implies, and they vary widely in the effort and attention they give to subs.

My own understanding of moderators’ work is evolving as I continue to read and observe their work across the site.

4. How Do You Become a Moderator on Reddit?

The simplest way to become a moderator is to start your own subreddit. Most moderators of more popular subreddits are added by other moderators, through a variety of processes:

  • A friend outside Reddit asks you to do it as a favor
  • You see a call for help from a moderator on a subreddit you subscribe to
  • You follow the job board where moderators post moderating opportunities
  • After you become known for your capability at some aspect of moderating (CSS, bots, diplomacy), you are approached by moderators to join the mod team
  • Hoping to build your reputation and connections to moderators, you do an internship in one of the subreddits

Just as other moderators can add you as a moderator to a subreddit, they also have the power to remove you from the sub.

5. How Many Moderators Are There?

There are roughly as many moderator accounts as subreddits. In a random sample of 100,615 subreddits (roughly 1/6 of all public subreddits), I found 91,563 unique moderator accounts. A similar proportion of moderator accounts supports Reddit’s top conversations. A sample of the 9,880 subreddits with the greatest number of subscribers had around 9,900 moderators, with an average of 5 moderators per subreddit, after taking out bots.

Some moderator accounts are likely throwaway accounts, where a single moderator uses multiple personas to support different subreddits. Bots have their own moderation accounts. I’ve also seen numerous cases where the moderators use a single account to distinguish when they are speaking for the entire mod team and when they’re speaking in a personal capacity.

Finally, because some moderators specialize on things like bots or CSS, some users are moderators of very large numbers of subreddits.

6. What Does It Mean to “Go Dark,” “Go Private,” or “Black Out” and is This A New Thing?

Moderators have the power to make their subreddits private, which prevents anyone who is not explicitly approved from accessing or contributing to the subreddit. In a large public subreddit, this action has the effect of preventing almost everyone on Reddit, including most subscribers, from accessing or posting to the subreddit. All of the content of the subreddit also disappears from the public web, and given enough time, may also disappear from search results.

Reddit may lose advertising on private subreddits, since the content is not public. However, it’s also possible that the controversy on Reddit could have attracted even more attention and revenue to the site. There is some evidence from subscription bots that subreddits that stayed up received unusually high numbers of new subscribers during the blackout. (An economist would find this question fascinating, if Reddit ever chose to share its advertising data.)

Moderators have taken subreddits private before, and while I’m still studying the history of this tactic, I’ve seen it used mostly to deal with internal or external drama.

External drama: Moderators sometimes take a subreddit private to protect it from large waves of attention from elsewhere on the site. This can happen when a subreddit becomes unexpectedly promoted by algorithms to the site’s front page, when an internal controversy gets onto the “drama” subreddits, or when other subreddits try to “brigade” a group by influencing the votes of its comments. In these situations, it can be hard for moderators to deal with comments from people who don’t care or don’t yet understand the norms of their group. Moderators do have other ways to prevent or deal with this problem, like removing their subreddit from Reddit’s main feed or default listings. Making their sub private is a last line of defense.

Internal drama: Other moderators make their subreddit private to show their displeasure with subscribers.

I know of one case where making a subreddit private was used to put pressure on a company. In this case, a moderator of a gaming-related subreddit was unhappy with that company’s handling of a beta program. To pressure the company to change its policy, this moderator blacked out the fan conversation on Reddit.

Blacking out a subreddit can make its subscribers angry. On this gaming subreddit, some subscribers retaliated by “doxxing” the moderator, finding and posting the moderator’s sensitive personal information. In response to this internal drama, the moderator temporarily took the subreddit private again as a defense against their attacks.

This week, two moderators of the IAmA subreddit claimed in the New York Times that they weren’t intending to start a protest by setting their subreddit private. That’s not unimaginable. If you’re worried about a huge influx of controversy into your subreddit due to a surprise HR decision by the company, blacking out is one of the things a moderator can do to gain the breathing space to respond, even if it is probably the most extreme response short of deleting the group entirely.

What made last weekend so unique was that moderators of so many subreddits blacked out on the same day, many of them expressing support for a set of demands for which they could at least find solidarity. That appears to be new.

Some subreddits are now adding timers to their sidebar, promising to black out again if Reddit doesn’t make satisfactory changes.

7. How Did Moderators Decide to Take Subreddits Private?

Over the last week, I’ve archived hundreds of conversations in subreddits as they decided if they should join or not. While many subreddits showed no evidence that moderators ever discussed the idea with their subscribers, many of them discussed it or put it to a vote.

Because the controversy and blackout happened so quickly, many moderators missed it completely. In some cases, moderators asked subscribers if they should join, only to be told that the subreddit had already blacked out and already concluded their participation. In some cases, moderators made unilateral decisions that were later reversed by other moderators, sometimes leading the original actor to lose their position.

In many other cases, moderators did often say that they had discussed the idea among themselves, often talking about their actions as a group rather than as individuals. Other moderators refer to deliberations with the company and other subreddits’ moderators, conversations that I don’t have access to.

When staying open, moderators sometimes justified their choice by describing the harm that could result, especially among subreddits that offer direct support to people with urgent needs. In several of those cases, moderators took heavy criticism from their subscribers for declining to join the protest.

8. What Were the Consequences of Taking Subreddits Private?

Although the press has focused on the pressure that Reddit is under from its moderators, those moderators have also been under great pressure from Reddit users, whose social lives they abruptly disrupted. To study these pressures, I have collected an archive of “We’re back” conversations where moderators justified and defended their blackout decisions.

In their complaints, many subscribers drew parallels between Reddit’s treatment of moderators and some moderators’ lack of communication with their own subreddits. When IAmA moderators Lynch and Swearingen wrote in the New York Times, “Our goal is not to cripple Reddit or hinder the community. We are all the community,” they echoed language that many other moderators used to win over their worried and upset subscribers.

At the same time, declining to go private also risked moderators’ legitimacy with subscribers. Many Reddit users supported the blackout, pressuring moderators to join in. Some of those supporters were opposed to Reddit’s staff and CEO in general– the Change.org petition calling for her dismissal was originally created weeks ago by subscribers who wanted the company to reinstate fat-shaming groups. Other subscribers expressed support for Victoria Taylor, the employee whose abrupt termination sparked the blackout. Moderators who declined to go private likely found their leadership questioned.

9. Who’s In The Majority? What Do “Reddit Users” Think?

I don’t think this is the right question. There is a huge variation in how different groups of moderators and subscribers handled this issue, and I’m still reading through it all. So far, my research is based on the public conversations that moderators had with their groups, but if there are other conversations that I should know about before putting it into the scholarly record, please contact me.

10. Final Thoughts and Next Steps

I’m still growing my sense of what happened, why it matters, and what this episode can reveal about the more enduring questions of what it means to do volunteer work in online communities. I hope this post helps answer basic questions about subreddits, what moderators do, and the history of going private.

I also hope it helps Redditors understand more about the state of my research as I continue to ask questions. If you’re a reddit moderator who thinks I’m missing something, or if you mod a sub where moderators have been discussing these issues, please contact me at u/natematias.

Imagining a Sustainable and Inclusive Approach to Child Safety Online

What would be a sustainable and inclusive approach to child safety online? Today at the Berkman Center, Mitali Thakor presented her research on human trafficking and moderated a discussion of how we see and respond to issues of child safety.

Mitali Thakor is a PhD student at MIT’s history and science of technology program, who studies sex work, sex trafficking, technology, and digital forensics. She uses Feminist STS and critical race studies to explore the ways in which activists, computer scientists, lawyers, and law enforcement officials negotiate their relationships to anti-trafficking via emergent technologies and discourses of carceral control.

“What is human trafficking?” asks Mitali. In this growing “industry” of activism, there are so-called abolitionist networks: alliances in which evangelical Christian organizations committed to fighting prostitution and sex work align with feminist organizations who fight sex work, which they see as sexual exploitation. Mitali shows us campaigns by feminist organizations and Christian organizations working together. In her research, she’s interested in the peculiar alliances and valences of this particular anti-trafficking network.

This network uses metaphors of slavery, and idealized ideas of what freedom is about. Men are often imagined as “defenders against slavery.” One organization has a campaign called “The Defenders USA,” where you get your own shield and sword to be a defender against prostitution.

What happens when evangelicals and feminist activists work together? How does that affect our trafficking policies? Mitali says that in 2001, a UN protocol on trafficking began to inform how most countries approach a wide variety of issues from trafficked labor to pornography and sex work. In the US, responses tend to be focused on sexual exploitation rather than wider labor exploitation. Although the agriculture industry dominates US trafficking, the focus on sexual exploitation is associated with a “rescue industry” and heavy involvement of law enforcement. This approach, called “carceral feminism” by some feminist scholars, often involves NGOs and the state working together.

Mitali tells us the story of Monica Jones, a black trans woman social worker in Phoenix, who was arrested by the police in collaboration with anti-trafficking organizations. The ACLU has called this “arrested for walking while trans.” A court has judged her trial unfair and opened it up for retrial. Mitali says that this is one example where carceral feminism involves the policing of sexuality and the incarceration of marginalized groups.

As a PhD student at HASTS, Mitali does extensive fieldwork with computer scientists, law enforcement, and the bureaucrats/government officials who are making decisions about child safety.

Mitali calls this collaboration between NGOs and police “para-judicial policing.” In her fieldwork with a Dutch organization, Mitali is studying these collaborations. She shows us a video by the NGO Terre des Hommes, who go undercover to manipulate a computer-generated model of a girl to identify “webcam sex tourists” and hand them over to the police. Mitali has spent time with this organization and the partners they have in Southeast Asia.

Sweetie, this generated avatar of a girl, was created by a gaming company for Terre des Hommes. Sweetie can do 14 different movements, including moving her arms. She does not undress on camera and does not perform any kind of sexualized act; she is just sitting, and is able to talk and move her arms. The campaign worked out of a warehouse (they were worried they would be found by the people they were chatting to). They went onto webcam chats, and then used the Sweetie image in a minority of cases. They brought the conversation to the point where it seemed like the man wanted something, took whatever identifying information they could, printed a physical packet of papers, and walked the list of names to Interpol and Europol. Many law enforcement officers find this abhorrent and stupid. This is the work of the police, they say. NGOs describe this as a new and cutting-edge model for the future of addressing these issues. Terre des Hommes calls this “pro-active policing.”

When TDH submits these names, who’s actually arrested? The number is under 20, only people who had previous cases open. The sting operation can’t directly lead to an arrest.

Why a Filipino child? After testing a variety of avatars, the company settled on her. The image is an amalgam of over 100 children that the organization works with. The organization has been working in the Philippines for a long time.

Who is the organization trying to catch? Whenever there’s a non-law-enforcement effort, there’s already a pre-determined predator they’re trying to catch. The number one chatters of Sweetie were from the UK and US, but number 3 was India, and women also chatted with it. This was an unexpected outcome; they were expecting to catch European men.

Mitali also researches other visualization and imaging techniques for identifying and detecting “missing children.” She’s also interested in the “gamification of surveillance” and the use of this surveillance (whether photo tagging and image recognition or avatars) to carry out these “policing” activities.

Citing questions raised by her fieldwork, Mitali says, “I’m interested in feminist technologies, and interested in design and ending exploitation. What is at stake in these issues? Do young people have rights? Do they have sexual rights? What does it mean to talk about young people and sexual rights? Are young people’s sexual rights protected under the UN Convention on the Rights of the Child, and do law enforcement think about that? How do we think about risk, and do we see online spaces as spaces of opportunity? What is a problematic versus a dangerous situation? And finally, I’m thinking about governance and design: law enforcement, NGOs, computer scientists, and companies working together. What do these partnerships mean, who’s not at the table, and what might it mean to actually have young people involved in exploitation campaigns?” Mitali asks us to imagine speculative possibilities for liberation and ending exploitation that still uphold children’s rights.

Question: Sweetie was an amalgam of many real children. Did those children or their parents consent to this use of their images? Mitali: many NGOs face this issue. Terre des Hommes works with many young people who don’t have parents or guardians. Their images were used without their consent, and the Philippines government complained, having felt targeted by this campaign. The idea of “webcam sex tourism,” which this organization coined, combines many complex ideas, and was a publicity campaign.

Question: Why did they generate computer generated children in pornographic situations? Mitali: child pornography is illegal in the US and legal in Japan, and are often met by challenges by the ACLU. In the US, we use the phrase “child pornography,” but in the EU, “child abuse images” and “child exploitation images” are the more common terms. The US has moved from a rehabilitative model to one that sets out to incarcerate people for life. As older crimes like public indecency are now tried under trafficking laws, these new laws are changing penalties for longer-standing issues.

Mary Gray: Many of these campaigns see the Internet as a “stranger-danger threat” when we know that most abuse comes from family and friends. Mitali: campaigns to address sexual exploitation tend to turn into censorship for all sexual information. What might it take to support young people to negotiate risks that they experience?

Question: You have people responding to an image that is false. How might this be considered a form of entrapment? What if people say, “I know this wasn’t a child; it didn’t look very real”?

David Larochelle: You mentioned that this was a publicity stunt. What was the organization hoping to accomplish aside from catching perpetrators? Was it trying to scare people? Raise money? Mitali: Definitely to raise money; that’s always the goal of any NGO. I think it’s more than a publicity campaign, however. They wanted to “wake up the police who weren’t doing anything.” The police said, “of course we’re always doing investigations, you just don’t hear about them.” This NGO and many organizations are reshaping themselves around this trafficking frame. Two years ago, they changed their tagline from “saving the children of the world” to “stop exploitation.” This is why I make the link to human trafficking and the anti-trafficking industry, where this is becoming their goal. Now, police are working closely with these organizations on Sweetie 2.0. The Dutch police are the number 1 employer in the Netherlands, were nationalized several years ago, and hired computer scientists and psychologists to work on their team for exploitation issues. The police psychologists responded, “if you want to believe that [Sweetie] is a real child to you, it will be real enough.” What does “real enough” mean for policies around “implied,” “artificial,” and “CGI” forms of pornography?

Question: What about the effects of international organized crime? There are groups who are making a lot of money doing this, and police departments are involved because they get kickbacks. The speaker mentioned that when people tried to do work to end human trafficking, they received threats. Mitali: I don’t know too much about organized crime around trafficking, but this was a major concern of the NGO. They didn’t want this design process to get out, and they did their work from an undisclosed location. They now say, “I don’t know why we were so paranoid.” The traditional police’s fear about “proactive policing” is that

Mitali notes that Anonymous has done a lot of anti-trafficking work themselves. Operation “PedoChat” claimed to have outed a large number of people chatting with children and seeking sex online. Mitali notes that “I’m uncomfortable when we have so many entities involved in many kinds of policing. It’s this classic fear of ubiquitous surveillance. What are our fears about young people, and what happens”

Question by me: Having shared more complex cases, what directions do you find most promising in the sex trafficking space? Mitali tells us about an organization called “End child prostitution and trafficking,” which is interested in doing research on sexy selfies. For an NGO to be doing that kind of research is radical, and may reflect thinking about inclusive design. To have organizations thinking about “child” and “sexuality” next to each other is a radical move.

David Larochelle: how does gender play into these debates? Mitali: with trafficking cases, it’s hard to get numbers and specific data, but some researchers in specific places have documented trafficking of boys and men, especially for sexual exploitation. When you use imagery that only shows women and children, you do a disservice by ignoring very real exploitation of men and boys. Furthermore, the number one form of exploitation is of adult men and women in meat packing plants and farm work, but it’s easier and safer in the current US political context for organizations to focus on sex trafficking and women. When I talk about “carceral feminism,” we’re seeing “heavy policing” and life imprisonment as strong responses to these issues, with incentives like the “war on drugs.”

Question: Viscerally, these crimes of forcing someone to do something against their will feel pretty abhorrent. Where do you see law enforcement fitting in? Mitali: one way would be a child-centered approach rather than “finding the bad guys.” Instead, we might focus on supporting the people who are missing. When children are “rescued” by these campaigns, what happens to them? People who are not citizens of the country where they are rescued are often deported. A child-centered and rehabilitative approach would focus on finding exploited children and caring for them long term.

Question: How much research have you done into the conditions of the children who were doing webcam chats in the Philippines? A serious discussion of their digital rights has to be understood in the context of their access. For example, Sonia Livingstone is arguing that any discussion of digital rights for children must include analysis of access; it’s easy for people in the Global North to assume similar access for children in the Global South. Mitali: I’ve done some research in Cebu, which is where this NGO works. Internet cafes are common physical social spaces for people to play games and also cam. Terry Senft has done research with camgirls, and we need more work with children.

Question by me: I know you’ve published whitepapers and other reports together with Microsoft; how do you think about the role your work plays in these issues? Mitali: I turn the lens on people in positions of power, doing ethnography of the police and methods of policing. Other researchers have looked at children, and cultural spaces of children’s sexuality. I hope that this work can help people think about the people in positions of power, something that STS is designed to do.

Mary Gray: How do the police feel about this being a “drain” on their focus versus other kinds of policing? Mitali: it depends on the funding of policing. When you have child exploitation centers in the police, it’s not a burden. But when NGOs get involved, they tend to feel like they have to clean up other organizations’ messes. They also can be concerned when other organizations press against what they see as their borders.

Mitali: As I write a report on “child safety,” I’m trying to find links to people who involve young people in design processes. Nathan refers to Roger Hart’s work on Children’s Participation. Mary Gray refers to Hasinoff’s book Sexting Panic.

Readers with further ideas and suggestions can reach Mitali on Twitter at @mitalithakor.

The Quantified Self; Newsfeed: Created by you?; Holding Crowds Accountable To The Public; EVE Online and World of Darkness

2015 MSR PhD Interns Speaking at the Berkman Center

Today at the Berkman Center, our summer PhD Interns gave a series of short talks describing our research and asking for feedback from the Berkman community. This liveblog summarizes the talks and the Q&A (special thanks to Willow Brugh for collaborating on this post).

Mary Gray, senior researcher at Microsoft Research, opened up the conversation by sharing more about the PhD internship. “We need folks who can do bridge work, who can work between university and industry settings.” Each student’s project takes a tack that is less common; there’s mostly a social-critical approach. Our group is particularly focused on showing the value of methodologies that are less familiar in industry settings. It’s a twelve-week program, and it doesn’t always happen in the summer. “We’re always interested in people who want to take a more critical/qualitative approach. We have labs all over the world, and each lab accepts up to 100 PhD students to do this kind of work,” Mary says.

Microsoft Research is (sadly) unique in that everything a student does is open for public consumption, says Mary. PhD students are encouraged to do work that feeds academic conversations while also potentially connecting with product groups that could benefit from that insight.

Quantified Self: The Hidden Costs of Knowledge

What are the privacy ramifications of our voracious appetite for data, what are the challenges of interpreting it, and how might data be employed to widen inequality? Ifeoma Ajunwa is a 5th year PhD candidate in Sociology at Columbia University. Recurring themes in her research include inequality, data discrimination and emerging bioethics debates arising from the exploitation of Big Data.

“Almost everything we do generates data,” Ifeoma quotes Gary Wolf’s WIRED Magazine article on quantified self. And yet this kind of data collection can be a form of surveillance; companies can also often crawl this data from the Internet and use it to feed algorithms that influence our lives. In this backdrop, people are also collecting data about themselves through the Quantified Self movement– data that could also be captured by these companies and used for purposes beyond our consent.

How can our data be used against us? Kate Crawford noted in a recent Atlantic article that this data has been used in courtrooms. Ifeoma also expresses worries that this data could be used against people as companies use it to limit their own risk. The “quantified self” has a dual meaning. On one hand, it refers to the self knowledge that comes from that data. On the other, this idea could turn against people as institutions set policies based on that data that widen inequality.

Ifeoma describes criminal records as a “modern day scarlet letter” that ensures that people are omitted from opportunities. In genetics, Ifeoma describes the idea of “genetic coercion.”

In her summer research with Kate Crawford at MSR, Ifeoma is looking at the quantification of work. Unlike Taylorism, where the focus was on breaking down the job task itself, the focus now is on “the individual worker’s body” and “inducing the worker to master their own body for the benefit for the corporation.” In this “surveillance-innovation complex,” companies try to evade regulation by seeking protections for innovation. They’re looking specifically at workplace health programs that include health, diet, exercise. These programs track weight, spending habits, etc. Ifeoma is looking at what companies track and how the interpretation of this data can impact the workers it’s generated from.

She concludes by asking us, “How can we make technology work for us, rather than against us? Could we harness large and small data without it increasing divides and discrimination?”

News Feed: Created By You?

“How do people enact privacy?” is the question Stacy Blasiola usually asks in her research. When you’re posting something online while at a bar, are you thinking about who sees it? This focus misses out on the role that platforms play in this work, a focus she’s taking on at MSR.

Stacy Blasiola is a PhD candidate at the University of Illinois at Chicago and a National Science Foundation IGERT Fellow in Electronic Security and Privacy.

This summer, Stacy will be looking at the Facebook NewsFeed algorithm. She talks about the Facebook Tips page, where Facebook provides information on how to find out who your friends are and how the NewsFeed works. Stacy shows us several videos that they’ve posted under “NewsFeed: created by you”. These videos were promoted by Facebook across their users, and they received millions of video views.

Tim: “I made my News Feed about wellness, nutrition and living my best.” Create a News Feed that inspires you.

Posted by Facebook Tips on Tuesday, December 2, 2014

Stacy has been looking at the relationship between the videos and the comments… “Surrounding myself with… knowledge and expertise. I want to know what you know.” “I look forward to seeing my best self every day.”

According to Facebook, Tim is solely responsible for what he sees in his feed. Stacy has been looking at the discourse used by users and platforms to ask, “how do the platforms matter to the users?” When users commented on these videos, Facebook often posted official comments.

One user said: “This leads me to believe I have control over my own feed. I don’t. FB is constantly making things disappear and rearranging the timeline.”

The company’s response changes depending on the type of questions asked. For example, “Why do I keep getting old posts? Well, people are posting a lot on it, so it resurfaces”… Facebook uses linguistic gymnastics to not say “we’re doing this.”

Stacy is at the very beginning stages of this project, and hopes to carry out the following kinds of analysis:

  • How do users discuss the news feed algorithm?
  • How does Facebook position the news feed to these users? Especially, where do they place responsibility?
  • How do users talk about the news feed to each other?

What Does It Mean to Hold Crowds Accountable To The Public?

Nathan Matias is a PhD Candidate at the MIT Center for Civic Media/MIT Media Lab, and a Berkman fellow, where he designs and researches civic technologies for cooperation and expression.

I was onstage at this point, but here’s a basic summary of the talk. After posing the question “How do we hold crowds accountable to the public?” I described common mechanisms that we imagine as forms of accountability: pressure campaigns, boycotts, elections, legislation, etc. I then described three cases where these mechanisms seemed unable to address forms of collective power we see online:

In the case of Peer Production, people sometimes petition Jimmy Wales, somehow believing that he has the power to change things. Other times, op ed writers make public appeals to “Wikipedia” or “Wikipedians” to address some systematic problem. I described my work with Sophie Diehl on Passing On, a system that uses infographics to appeal to public disappointment and then channels that disappointment into productive change on Wikipedia (more in this article).

In the case of Social Networks, we sometimes criticize companies for things that are also partly attributable to who we accept as friends or what we personally choose. This debate is especially strong in discussions over information diversity. I shared an example from Facebook’s recent study on exposure to diverse information, outlining their attempt to differentiate between available media sources, friend recommendations, personal choices, and the NewsFeed algorithm. I also described my work with Sarah Szalavitz on FollowBias, a system for measuring and addressing these more social influences.

Finally, I described work on distributed decisionmaking, such as the decisions of digital laborers who do the work of content moderation online. I described my recent collaboration on a research project describing the process of reviewing, reporting, and responding to harassment online. I also described upcoming work to study the work of moderators on Reddit.

How Do Gaming Communities Make Sense of Their Personal Place and Role in Massive Worlds?

What is it like to be an EVE Online player? Aleena opens up by showing us videos of massive online space battles in this massively multiplayer online game.

Aleena Chia is a Ph.D. Candidate in Communication and Culture at Indiana University currently interning at Microsoft Research, where she investigates the affective politics and moral economics of participatory culture, in the context of digital and live-action game worlds.

Aleena’s research on consumer culture tends to focus on gaming activities. How do players make sense of their role in these massive worlds? Her argument is that users make sense of their experience through the alignment of spectacle, alcohol, and experience at brand fests and the gameplay experience itself. They feel that they’re truly a part of something larger than themselves.

How do they make sense of their hours spent on this? They spend hours and hours each week building up empires. This experience is made sensible through reward and reputation systems, sometimes designed by the companies, and sometimes by the communities themselves. How do players make sense of the time they’ve invested into their identities as gamers in Eve but also beyond? They make sense of this through conversations about work-life balance, as well as the recognition by others that their work has cultural, economic, and social value.

At the heart of this are compensatory drives: using things to add up and even out, so that people get what’s coming to them. (Re)compense is connected to an idea of balance, a moral equilibrium. These “compensatory forces” give people a connection to the intangible world, to have a sense of fairness and justice, and a sense of aesthetic, economic, and social legitimacy.

EVE Online is a hypercapitalist world with no governments — warfare, murder, and theft are sanctioned if you can get away with it. But there is also a democratically elected player council, consultants to the game company, who talk to developers and the company. The savage world is managed through civilized processes.

Can player representatives effectively consult with the company? Players have very micro concerns, while developers often have macro-level concerns for all the players. Within these systems, there are some mechanisms of accountability — if they don’t do well, they won’t get elected. Players often complain to them on forums, email, and at meet-ups. But they also don’t have much power.

To understand this, Aleena will be looking at minutes from meetings between the council members and the company, as well as between the council members and the player base. She’ll also be looking at town hall meeting logs, election campaign materials and responses. She’ll be asking, how do they see their roles in relationship to each other? She’ll also look at how players learn to be council members over their terms of office by examining meeting minutes. Finally, Aleena will be mapping feedback channels, mechanisms, directions, and ruptures, both formal and informal. Feedback doesn’t just run up the chain from players to developers through the consultants; it also runs down. Consultants have a job (either implicit or explicit) to advocate for the company and “spread goodwill to the masses.”

The election of player councils is one example of a democratic process between audiences and brands (perhaps related to reality TV shows with audience feedback; now we have tribunals). Is this market populism (neoliberalism at work, a replacement of authentic democratic engagement)? Might it instead be consumer co-creation (customer relations, commoditized into a pleasurable and branded experience), which is not just about making the experience better, but about your experience as a consumer? Lastly, designers often say that users don’t know what they want, which discounts popular will.

Finally Aleena is asking, “how are these democratic mechanisms changing the means and meanings of consumption?”

Questions, Answers, Comments

Ethan: How do crowds develop, or how are they simply different from users or patients or…? Wikipedia has a crowd, but how do you distinguish it from other groups?

  • Nathan: You’ve likely thought of this in your dispute resolution research. We might think of individuals or institutions. Using “crowd” as a placeholder for something we don’t quite know where to apply the lever to change things. Cumulative effect of the social choices and friendships we have in a network. Or it might be more identifiable.

Rebecca: What is your model of genuine civic engagement which neoliberalism has supplanted?

  • Aleena: My utopia is a participatory democracy. But my own is not official political systems, but how can the media open up space for the public to participate? Engagement with the media via certain mechanisms creates real decision making power in the system and the content.

Tarleton: How do people think about their role in relation to community? But the world is meant to be something: Eve is clear about this, as is Wikipedia, and Facebook is sort of getting there. Not just governance problems, but the narrative claim of the institution masking or distorting the style of engagement?

  • Stacy: My project stacked against Nathan’s shows two different aspects of the same problem. “Algorithm” as a single thing to be tweaked to fix everything. But that is not something I know. Transparency is seen as something severely lacking. How does Facebook present in order to shape the reality. “We” in publicity, “you” in user interactions. Depends on the audiences they’re speaking to.
  • Nathan: I draw inspiration from Hochschild’s research on airline attendants. There was a clear corporate brand identity concern influencing how airline attendants were trained to respond to tough situations. Training wasn’t just about what to do, but how to be. Like Hochschild, I’m also looking at the process of learning to be a worker. There are job boards on Reddit where people apply and chat. Reddit has basic rules overall, but each /r also has special rules. I’m looking at how moderators look at their roles in their /r as well as at Reddit.com.
  • Aleena: The word between corporate and users — classic customer feedback, filter them, see what makes sense, incorporate. But we also want to persuade users that we’re on the same page, no “us vs them.” It’s not just about the bottom line — want there to be engagement. Eve doesn’t just want you to be happy. They want you to strive and have troubles.
  • Ifeoma: Is the governance of wellness programs actually voluntary? People aren’t voting about the shape the wellness program will take, only that it will exist. It’s about shifting the responsibility onto the individual worker. No real discussion if the work infrastructure can be shaped to achieve the same thing. Corporation abdicating its responsibility for a healthier worker, putting it on the worker. We’re worried about what that means, in the structural constraints inhibiting them. What will the new workplace discrimination be? It’s perfectly ok for your employer to fire you if you’re a smoker outside the workplace. Level of coercion. Up to 30% or 50% of the program covered for smoking cessation.

Mary Gray: Across these talks, there has been an implicit appeal that a social need or desire work outside of market demands. Want to keep players playing, keep newsfeed functioning in a certain way, etc. Market demands the corporations do something external to what the players/users/etc want. Why are we seeking corporate good? What sends us to the corporation to fix these things, seeing them as the path of recourse?

  • Stacy: Inherent expectations from users that Facebook be “truthful.” Christian Sandvig did work that shows users feel confused and lied to: “I thought that person just didn’t like me.” Users rely on FB to maintain social connections. When those assumptions aren’t met, there is anger. Comments of “why are you doing this to me?” Other people say “I have to be friends with someone because of business or whatever, but I don’t want to see their posts.” Gatekeeping is not new, but we don’t know how they’re doing this process.
  • Aleena: Players look to Eve for social justice because the company thinks it makes good sense. Difference between votes at face value and adapting to mass player will. Do have to come up with something new, even if it’s not the thing that was asked for.
  • Nathan: Wikipedia and Reddit are sorts of counterexamples to Eve or Facebook in that participants and active contributors may feel that it is a public good. Wikimedia, as a nonprofit, is funded by donations, has elected board members, and can be thought of as accountable to its participants. But when people who are not contributors are affected by its power, they may take traditional routes (such as petitioning Jimmy Wales). Reddit is more complex. When Reddit started banning /r for specific behavior they saw instead of its traditional hands-off model, this question of interests started to crack. Advertising and Reddit Gold are perhaps competing income models — when a fundraising goal is met, they buy a new server. But Reddit the company also is starting to take more top-down responsibility for what its users do, which makes them look more like other corporations.
  • Ifeoma: Relinquishing the rights of social good to corporations has to do with complex problems and simple solutions. Wellness programs try to address American lifestyles being unhealthy (sitting, eating, smoking, don’t work out) — which is complex both in lifestyle and in infrastructure. Trying to fix with something as simple as a wellness program won’t have the intended results. AND has unintended results (discrimination especially). Laws don’t cover obesity or smoking, which are stigmatized, and encroach on the rights of the worker.

Nick Seaver: What is happening to audiences, citizens, workforces, etc. as different publics? Are you helping to define what each of those things means? My own work has been undermined because I didn’t define that.

  • Aleena: What is the value of comparing it to things which have come in the past? Something IS different in connected space. Not just how democracy and society are changing, but how are the meanings of consumption changing? Video games even… you can do so many more things — you’re supposed to buy things and make friends and etc. “What kind of player are you?” Identities are tied to this. How does this jigsaw piece connect to the rest of it?
  • Ifeoma: Historical context is really important, especially in defining what a worker is. A defining thing is technology being available. Workplace surveillance isn’t new. What is new are the advances in technology, letting us surveil and track the worker in a way which wasn’t already available. Collapses the line between work and no work. Woman fired (and sued) for deleting an app on her phone, which couldn’t be turned off and was tracking her (how fast she was driving, where she was over the weekend, etc). Unforeseen issues: need to redefine what it means to be a worker.
  • Nathan: Kevin Driscoll and I have this debate about BBSs. We talk about them with a set goal and solid definitions. People do the same for Twitter and Reddit etc. And yet, in moderation work, there are common experiences, expectations, and tools. These moderators have to figure out how to work at that intersection between what a company with <100 employees is defining as the space of their work. Postigo explored this somewhat in his research on AOL community leaders — the more AOL did to control, track, standardize community leaders’ work, the more those folk thought of themselves as a collective and like unpaid workers. So I’m still looking for the language and theories to describe this.
  • Stacy: I’m interested in the idea of When Technologies Were New — how people interacted with tech when things were new. The more we do with research, the more we realize nothing is new. In that sense, my dissertation isn’t just about Facebook, but about algorithms at large. Society from a larger view — how do we understand the mediations happening?

Recognizing the Work of Reddit’s Moderators: Summer Research Project

What does it take to keep online communities going? With over 550,000 public subreddits, many of which are active, the communities on the site rely on ongoing effort by a large number of volunteer moderators. In my research, I’ve made the case that caring for the communities we’re part of is an important kind of digital citizenship. For that reason, I’m excited to learn more from redditors about how they see the work of moderation, why they do it, and what is/isn’t their job.

This spring, I’ve been reading extensively about digital labor and citizenship online, including the story of over 30,000 AOL community leaders who facilitated online communities in the 90s. With Reddit pushing for profitability and promising new policies on online harassment, I thought that potential tensions arising this summer might offer an important lens into the work of moderators, at a time when listening to mods and recognizing their work would be especially important. “The summer is likely to include substantial discussion and introspection on the nature and boundaries of moderation work on Reddit,” I wrote in my proposal mid-May.

Although I expected something, I didn’t expect that Reddit would ban a set of subreddits and mods in their attempt to carry out their new policies, or that some redditors would vigorously oppose this move. (Update July 6: I also didn’t anticipate that reddit moderators would take their subs private to advocate for changes in how they are treated). These controversies have convinced me that this research could be especially valuable right now. Press coverage is likely to focus primarily on the controversy, while I can carry out a summer-long project, in conversation with a wider sample of redditors than just those associated with this controversy.


In this post (which I will be sharing with redditors when I ask permission to speak with them) I outline my research to understand how Reddit’s moderators see and define what they do. This blog post includes details of the research, the promises I make to redditors, and the wider reasons for this project.

About This Research Project

I’m a PhD student at the MIT Media Lab / Center for Civic Media and a fellow at Harvard’s Berkman Center for Internet and Society where I research civic life online. As a PhD intern at Microsoft Research, I get to be supported this summer by amazing researchers including Tarleton Gillespie, Mary Gray, and Nancy Baym, who are advising this project. To learn more about my work you can read my MIT blog or check out my portfolio.

Over the next few months, I’ll be:

  • Hanging out in moderator subreddits like needamod, modhelp, and others to learn more about how mods find opportunities, learn the ropes, and discuss their work
  • Posting questions to some subreddits, after seeking permission from the mods, asking questions or getting feedback on my working understandings
  • Collecting basic summary statistics across Reddit, from public information, to understand, on average, how many mods there are (like the above chart) and what kinds of rules different subreddits have.
  • (potentially) interviewing reddit mods
  • (potentially) trying my hand as a moderator

Ethics: Who’s This For, What am I Recording, and What am I Sharing?

My summer project is being done at Microsoft Research’s Social Media Collective, where I am a PhD intern. At MSR, I have the intellectual freedom to ask questions that are widely important to society and scholarship. I also expect to make my research widely accessible. Microsoft open-sourced my code when I was an intern in 2013, and Microsoft Research has an open access policy for its research.

Although I am a fellow at the DERP Institute and can, in theory, start a conversation with Reddit employees, I have not discussed this project with Reddit at all, have never received compensation from Reddit, nor am I working for them in any way. While it is possible that I may be asked in the future to share my results with the company, I will not share any of my notes or data with Reddit beyond the findings that I publish in research papers, public talks, blog posts, or open source materials.

This isn’t the first time I’ve done research about the work of moderation from outside a powerful company. Last month, my colleagues and I published a report on Reporting, Reviewing, and Responding to Harassment on Twitter, including a section on the work of moderating alleged harassment. In that study, we treated everyone in our study with respect, including alleged harassers. Our research team did not share data with the company, we were writing independently of Twitter, and we had full editorial control over our report, even from the commissioning organization WAM!. Likewise, in my 2013 summer research at Microsoft on local community blogging, we either summarized or anonymized/modified all quotes and photos before publishing our results.

In this project, I promise that:

  • Anyone can opt out of this research at any time by contacting me at /user/natematias. If you opt out, I will avoid quoting or mentioning you in any way in the published results.
  • By default, I will anonymize any information I collect before publishing
  • If a user requests that I use their username to give them appropriate credit for their work, I’ll weigh the risk/benefits and try to do right by the user
  • I will keep all my notes and data secured, with secure backups that I access through encrypted connections.

Why Do Research with Redditors?

Reddit is one of the few major public platforms on the English-language web that allows/expects its users to establish and maintain their own communities, without thousands of paid content moderators and algorithms behind the scenes deciding what to keep or delete. In contrast, the Huffington Post pre-moderates 450,000 comments per day, paying between $0.005 and $0.25 for every comment that comes in. Yet Reddit mods do so much more than just delete spam. They do a huge amount of important work to create new communities, recruit participants, post content, manage subreddit settings & style, recruit new moderators, set rules for their subreddit, and monitor/manage submissions and comments. Moderators also tend to play a large role in debating and establishing wider community norms like Reddiquette.

Last week, I used the Reddit API to collect data on the number of moderators who keep subreddits’ conversations going. A random sample of 100,615 subreddits (roughly 1/6 of all public subreddits) had 91,563 user accounts as moderators. While not all of these subreddits are active, each of them represents a moment of interest to try on the role. Among the 46% of subreddits with more than one subscriber, 30% of these subreddits have two or more moderators.
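For readers curious how summary figures like these can be derived, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the code used for this project: the `SubredditRecord` structure, the `summarize` function, and the four-row sample are all hypothetical, and the step of fetching records from the Reddit API is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SubredditRecord:
    """One sampled subreddit (hypothetical structure, not the Reddit API's)."""
    name: str
    subscribers: int
    moderators: frozenset = field(default_factory=frozenset)


def summarize(records):
    """Compute the kinds of summary statistics described above."""
    total = len(records)

    # Every distinct account that moderates at least one sampled subreddit.
    unique_mods = set()
    for record in records:
        unique_mods.update(record.moderators)

    # Subreddits with more than one subscriber.
    with_subs = [r for r in records if r.subscribers > 1]

    # Of those, the ones with two or more moderators.
    multi_mod = [r for r in with_subs if len(r.moderators) >= 2]

    return {
        "subreddits": total,
        "unique_moderators": len(unique_mods),
        "share_more_than_one_subscriber":
            len(with_subs) / total if total else 0.0,
        "share_multi_mod_among_those":
            len(multi_mod) / len(with_subs) if with_subs else 0.0,
    }


# Made-up sample data, for illustration only.
sample = [
    SubredditRecord("a", 1, frozenset({"u1"})),
    SubredditRecord("b", 10, frozenset({"u1", "u2"})),
    SubredditRecord("c", 5, frozenset({"u3"})),
    SubredditRecord("d", 0, frozenset()),
]

stats = summarize(sample)
print(stats)
```

Counting moderators as a set of account names matters here, because one account can moderate many subreddits and would otherwise be counted more than once.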


Communities of redditors and mods shaped some of my earliest impressions of the site six years ago, when a work colleague invited me to join Reddit London meetups, telling me stories about their weekend and after-work gatherings. It was clear that participation meant more to many redditors than just links and comments. Later on, when I took two years to facilitate @1book140, The Atlantic’s Twitter book club, with around 140,000 subscribers, I came to learn how challenging and rewarding it can be to support a large discussion group online.

How Am I Going to Go About This Research?

Computer scientists, economists, and designers often want to ask if offering the right upvoting system or the right set of badges will filter content effectively or motivate people to contribute the greatest amount of appropriate effort to a web platform. This focus on productivity often interprets the activity of users in the language of company priorities rather than community ones. Stuart Geiger and I discussed this idea of productivity last fall at HCOMP Citizen-X, arguing that we need to understand users’ values beyond just the “productivity” of a group.

Although I often explore questions through design and data analysis, I’m taking a different approach this summer, to better understand how redditors see their own participation. My first semester at MIT taught me how important it can be to participate in and observe a community rather than just measure it. Rather than spend the whole summer data-mining the Reddit API, I’m participating in subreddits and speaking to redditors. In “The Logic and Aims of Qualitative Research,” a chapter in a larger collection on communications research methods, Christians and Carey say that when researchers ask questions about human life, “we are examining a creative process whereby people produce and maintain forms of life and society and systems of meaning and value.” They argue that qualitative research sets out to “better understand the meanings that people use to guide their activities” (358-9).

As a student in MIT’s Technologies for Creative Learning class, I was curious about how young people learning to code thought about “bugs” in the stories, art, and games they made with Scratch. In a corporate environment, where there’s a goal for everyone’s work, it’s possible to define software errors. But does the same language apply to a ten-year-old child who’s creating a story after school? Most scholarly discussion of “bugs” applied this corporate term to young people, defining strict goals for students and measuring “errors” when they diverged from pre-defined projects. When I visited schools, observed student projects, and talked to students, I saw that diverging from the teacher’s plan could be a highly creative act. Far from an error, a “glitch” could prompt new creative directions, and an “unexpected surprise” often opened learners to new understandings about code.


Code and artwork from one of my first projects on Scratch.

If I had relied entirely on the definitions and data coming from teachers or the Scratch platform, I might have been able to test statistical hypotheses about “bugs,” and I might even have developed ways to limit the number of errors per student. I would never have noticed how important these unexpected surprises were to young people’s creativity, and at worst, I might even have reduced the chances of students to experience them. By participating with students and spending time in their learning environment, I was able to find new language, like “glitch,” that might move conversations beyond “errors” or “bugs.”

For my Reddit study this summer, I want to hear directly from mods about how they see their work, a question that goes well beyond what can be measured. Many thanks in advance to those who welcome me into your subreddits this summer and take time to talk with me.

Update July 6, 2015. I ran into a Reddit employee at a conference last week and sent them this link, so the company is now aware of this project. I am still not working directly with Reddit in any way.