Moderator Solidarity on Reddit: Predicting Participation in the Blackout of July 2015
For the last 40 years or more, online platforms have relied on people to facilitate and support our online communities. In the early 70s, they were the librarians and shopkeepers of Community Memory. In the 80s, they were the WELL’s “conference hosts.” In the 90s, they were AOL’s “community leaders.” In 2015, they are Wikipedia’s “administrators,” Facebook’s “admins,” Slashdot’s “moderators,” or XBOX’s “enforcement united.” And on platforms like Twitter without moderators, we find the need to invent them. These moderators are the founders, designers, promoters, facilitators, recruiters, legislators, responders, and enforcers of our online social interactions.
This summer, I’ve been doing qualitative research on ways that Reddit moderators develop common interests as they face the company, as they face their subscribers, and as they relate to other moderators. Just in the top 20,000 subreddits by subscribers, Reddit has 50,790 moderators. This July, moderators of 2,278 subreddits joined a “blackout,” demanding better communication and improved moderator tools. The blackout is one moment in the wider research I’m doing, a moment where tensions and common cause rose to the surface. Blacked-out subreddits constituted 60% of the top 10 subreddits, 29% of the top 100, and 5% of the top 20,000 subreddits, representing a total of 134.8 million combined subscriptions.
Since I can only get so far by reading Reddit threads, I’m now interviewing Reddit moderators to learn more about their experience as moderators and their experience of the blackout. If you are a moderator and are interested in talking, please message me on Reddit at /u/natematias.
Work In Progress: Charting the Reddit Blackout of July 2015
Since I’m also a software engineer and quantitative researcher, I’ve been complementing my qualitative work with data analysis on what I was able to collect from the public API, combined with /u/GoldTesting’s dataset of blackout participation. Mostly, I’ve used that data to decide where to look and who to reach out to.
When moderators discussed the blackout with their subscribers, many debated the idea of “solidarity,” wondering if they were too small to have common cause with larger subs or if they were too small to make a difference. Others expressed strong opinions that joining the blackout meant standing with other moderators or standing for Reddit users as a whole.
The conversations I found led me to think about several hypotheses I could test statistically:
H1: Larger subreddits were more likely to join the blackout, maybe because their moderators were part of ModTalk, where much of the blackout was discussed, or because they felt a blackout would make a difference, or because they felt common cause with other mods of large subs.
H2: Subreddits with more moderators were more likely to join the blackout, perhaps because mods in these subs would have greater solidarity with others.
H3: Subreddits with mods who also moderate other subreddits that participated in the blackout were more likely to join the blackout.
To illustrate the data used for my statistical tests, here are two network graphs of shared moderators between subreddits. The first graph includes the top 20,000 subreddits in terms of subscribers (as of mid-June 2015). The second graph includes only subreddits with more than 10,128 subscribers. In the network graphs, subreddits that did not black out are tinted blue, while yellow-tinted subreddits joined the blackout.
The charts are laid out using the ForceAtlas2 layout on Gephi, which has separated out some of the more prominent subreddit networks, including the ImaginaryNetwork, the “SFW Porn” Network, and toward the center, the ShitRedditSays “fempire”. These networks are notable because some of them made network-wide decisions about their participation in the blackout.
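As a rough illustration of what a graph of "shared moderators between subreddits" means, here is a minimal Python sketch that turns a list of moderators per subreddit into weighted edges between subreddits. This is not the code used to produce the graphs above. The subreddit names, moderator names, and the `shared_moderator_edges` helper are all hypothetical, and weighting each edge by the number of shared moderators is an assumption.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: subreddit -> set of moderator usernames.
moderators = {
    "sub_a": {"alice", "bob"},
    "sub_b": {"bob", "carol"},
    "sub_c": {"alice", "bob", "carol"},
    "sub_d": {"dave"},
}

def shared_moderator_edges(mods_by_sub):
    """Return {(sub1, sub2): count of shared moderators} for every pair
    of subreddits that has at least one moderator in common."""
    # Invert the mapping: moderator -> set of subreddits they moderate.
    subs_by_mod = defaultdict(set)
    for sub, mods in mods_by_sub.items():
        for mod in mods:
            subs_by_mod[mod].add(sub)

    # Each moderator contributes one unit of weight to every pair of
    # subreddits they moderate.
    edges = defaultdict(int)
    for subs in subs_by_mod.values():
        for pair in combinations(sorted(subs), 2):
            edges[pair] += 1
    return dict(edges)

edges = shared_moderator_edges(moderators)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

A subreddit whose moderators work nowhere else (here `sub_d`) gets no edges. An edge list like this one could then be exported to a tool such as Gephi for layout.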
Using this dataset, I conducted a logistic regression testing the above hypotheses.
H1: Larger subreddits were more likely to join the blackout. This hypothesis is supported. On average in the population of top 20k subreddits, there is a large positive relationship between the log-transformed subscriber count and a subreddit’s probability of joining the blackout, holding all else constant.
H2: Subreddits with more moderators were more likely to join the blackout. This hypothesis is supported, but only very weakly. I wouldn’t make much of this.
H3: Subreddits with mods who also moderate other subreddits that participated in the blackout were more likely to join the blackout. This is supported. On average in the top 20,000 subreddits, there is a positive relationship between the log of moderator roles in other blackout subs and a subreddit’s probability of joining the blackout, a relationship that is mediated by the overall number of moderators shared with other subs, holding all else constant.
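For readers unfamiliar with logistic regression, the sketch below shows the general shape of such a model on a tiny, hand-built dataset. It is not the model or the data behind the results above. The sixteen rows are invented, the two predictors are simplified stand-ins (a centered log subscriber count and a 0/1 flag for sharing a moderator with a blackout subreddit), and the fitting routine is a plain gradient-ascent implementation written for illustration. A real analysis would normally use a statistics package that also reports standard errors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=5000):
    """Fit a logistic regression by batch gradient ascent on the
    log-likelihood. Returns [intercept, coef_1, coef_2, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        # Step along the average gradient.
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy data, one tuple per group of subreddits:
# (centered log subscribers, shares a mod with a blackout sub,
#  number that joined the blackout, number of subreddits in the group)
cells = [
    (-1.0, 0.0, 1, 4),
    (-1.0, 1.0, 2, 4),
    (1.0, 0.0, 2, 4),
    (1.0, 1.0, 3, 4),
]

X, y = [], []
for log_subs, shares_mod, joined, total in cells:
    for i in range(total):
        X.append([log_subs, shares_mod])
        y.append(1 if i < joined else 0)

weights = fit_logistic(X, y)
print("intercept, log-subscriber coef, shared-mod coef:", weights)
```

Both fitted coefficients are positive on this toy data, which is what "a positive relationship, holding all else constant" means here: each coefficient describes the change in the log-odds of joining associated with one predictor while the other is held fixed.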
So, is there evidence of moderator “solidarity”? If we consider H1 to be a test of solidarity associated with similar subscriber numbers, and if we consider H2 to be a test of solidarity related to the number of moderators one works together with, then yes, we see support for the solidarity hypothesis. However, my qualitative research shows that many subreddits voted on this issue, indicating that subscribers also matter to this picture. Furthermore, many mods of smaller subs also expressed solidarity, even if smaller subs were less likely to participate. So more work needs to be done.
CAVEATS: This is just a preliminary statistical test. I have much more work to do before publication:
- I need to define better hypotheses that can answer theoretically-meaningful questions
- I need to do much more work to confirm the validity of my data collection, data processing, and models
- I need better definitions of “solidarity”
- This needs to be peer reviewed
In particular, I plan to spend more time with network scientists to understand the best way to set up my dataset for statistical analysis. There are many ways to project a complex network onto a single table for statistical tests, and I may need to try a different approach. Note also that this model does not include time as a factor, and that I use the term “predict” to refer to statistical inference rather than some ability to predict participation in the blackout before it occurred.
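To make "projecting a network onto a single table" concrete, here is one possible projection sketched in Python: one row per subreddit, with the kinds of columns the hypotheses above refer to. The records and column names are hypothetical, and this is only one of the many possible setups mentioned above, not necessarily the one used for the preliminary results. Using log(1 + x) for the count of moderator roles in other blackout subreddits is an assumption made so that zero counts are defined.

```python
import math

# Hypothetical toy records: subscribers, moderators, blackout status.
subreddits = {
    "sub_a": {"subscribers": 50000, "mods": {"alice", "bob"}, "blackout": True},
    "sub_b": {"subscribers": 1200, "mods": {"bob", "carol"}, "blackout": False},
    "sub_c": {"subscribers": 900000, "mods": {"alice", "bob", "carol"}, "blackout": True},
    "sub_d": {"subscribers": 300, "mods": {"dave"}, "blackout": False},
}

def feature_rows(subs):
    """Flatten the moderator network into one row per subreddit."""
    rows = []
    for name, info in sorted(subs.items()):
        # Count (moderator, other blackout subreddit) pairs: each of this
        # subreddit's mods adds one per other blackout sub they moderate.
        roles = sum(
            len(info["mods"] & other["mods"])
            for other_name, other in subs.items()
            if other_name != name and other["blackout"]
        )
        rows.append({
            "subreddit": name,
            "log_subscribers": math.log(info["subscribers"]),
            "moderator_count": len(info["mods"]),
            "blackout_roles_elsewhere": roles,
            "log_blackout_roles": math.log1p(roles),  # assumption: log(1 + x)
            "blackout": int(info["blackout"]),
        })
    return rows

rows = feature_rows(subreddits)
for row in rows:
    print(row)
```

A table like this discards most of the network's structure, for example which specific subreddits are connected, which is one reason a different projection might lead to different conclusions.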
I’m sharing these preliminary results because I hope they’ll attract interest from Reddit moderators and lead me to more interviews and data while I still have time to talk to people and enrich my understanding of what happened. If you are a Reddit moderator and want to talk with me, please message me at /u/natematias.