Analysis of the Stack Overflow moderators problem

So, the recent Stack Exchange podcast mentioned that the lion’s share of the moderation of Stack Overflow is being done by one person. That got me thinking about some of the things which have been said on past podcasts and how the site is really representative of an incredibly active (and minuscule) minority serving a significantly more passive majority.

Realistically, the structure of Stack Overflow is a pyramid.

At the very top are Joel, Jeff, and the people working for the Stack Exchange (SE) company. They have the power of life and death over this project. Based on the interactions on the site where all of the feature decisions about all of the SE sites are made, it is obvious that Jeff maintains creative control of what features do and do not go into the site. Frankly (and more importantly), if they wanted to pack the whole thing up and go home tomorrow, they could. It is almost an Aristotelian thought: ultimate power over a thing comes in the ability to destroy the thing[1].

The next level in the power hierarchy is the network of moderators. The chief power these individuals have is their ability to quickly and easily remove people and data from the site. That might not seem like much, but it is really the only thing that they can do. They do not have some magical “super-voting” ability which gives people 5-10 times the points. They do not even have substantially more ability to lock or delete posts. They can just do it faster.

The only truly unique abilities these people have are the ability to censure users and the freedom to act without consensus. (Yes, I am aware that they can see more statistics about users, but that really isn’t quite on the same level.) This, I think, is a point worth noting: the only thing which is solely in the purview of the moderators is user management.

This is followed by the fifteen tiers of user (seen at right). The basic idea is that in order to do more on the site you either need to get a large number of up-votes or be elected moderator (or create an evil robot clone of Jeff Atwood or Joel Spolsky and secretly control it via remote). In addition to the feedback provided by the “game” of getting up-votes, these pieces of reputation serve as little prizes along the way.

This system works well enough. As can be seen by the current statistics (left), there are 58,917 ranked users[2] and while that might seem like a lot it really only amounts to 7% of the registered users on the site. If compared to the vast numbers who land on SO after an answer comes up on Google, this number starts getting much closer to ≤ 1%[3].
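As a rough check on those proportions, the quoted figures imply a total registered population and a lower bound on the number of visitors. A minimal sketch in Python, assuming the 7% and 1% figures are exact (they are rounded in the text, so the results are only approximate; the function name is my own):

```python
RANKED_USERS = 58_917                # ranked users quoted above
RANKED_SHARE_OF_REGISTERED = 0.07    # "7% of the registered users"
RANKED_SHARE_OF_VISITORS = 0.01      # upper bound from the "≤ 1%" estimate


def implied_population(count: int, share: float) -> int:
    """Return the population size for which `count` makes up `share` of it."""
    return round(count / share)


registered = implied_population(RANKED_USERS, RANKED_SHARE_OF_REGISTERED)
visitors_at_least = implied_population(RANKED_USERS, RANKED_SHARE_OF_VISITORS)

print(f"implied registered users: about {registered:,}")
print(f"implied visitors: at least about {visitors_at_least:,}")
```

In other words, roughly 840,000 registered accounts and at least 5.9 million visitors stand behind those 58,917 ranked users.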

There has been a lot done over the years to try to address this imbalance, and to pull people in from the void. A number of features have been added to the site to make it so that even the anonymous users can do things like ask questions and suggest edits. Instead of relying on an outside service to authenticate a user, SE can now serve as your OpenID provider.

And the site continues to grow. You might notice that the numbers to the right on the chart are all positive (and green), meaning that the number of people involved is increasing (in this case, the numbers reflect the past year).

So here comes the flaw. There are not enough people handling the moderator flags, so the bulk of the work is being done by one person and the number of flagged items can, at times, get huge (at the time of the podcast there were over two hundred items which needed moderator attention). This isn’t to say that there are insufficient moderators, but rather that the work which was expected to be done by many is being done by one.

There are a few things which have been done in an effort to minimize this problem. First, there was the creation of the flag weight “game”. The idea here is that if you produce really helpful flags, then not only do your flags receive priority in the queue, you also get a nifty little “flag weight” metric on your profile page which shows that, in fact, you have been helpful while flagging. More recently (as in the last few months), the statistics on which moderators are doing what were published, which makes it much easier to see that one person is doing half of the work.

But the big question is: why does SO have this problem when the other SE sites do not? Here are some possible outlooks and solutions.

Part of the reason that this problem exists, at least by my guess, is that as the number of users increases, the amount of work increases faster than linearly. This has to do with issues similar to those raised in Qiaochu Yuan’s power-laws post (source of the graph to the left). (I could be completely off the mark here, but my experience with similar systems is that you’re talking about O(n log n) or O(n²) types of growth rates.)
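To get a feel for what those growth rates would mean in practice, here is a small Python sketch. The workload models are purely illustrative assumptions on my part, not measurements of SO; the sketch only compares how much the work multiplies when the user count grows tenfold.

```python
import math


def workload(n_users: int, model: str) -> float:
    """Hypothetical moderation workload for n_users under a growth model."""
    if model == "linear":
        return float(n_users)
    if model == "nlogn":
        return n_users * math.log2(n_users)
    if model == "quadratic":
        return float(n_users) ** 2
    raise ValueError(f"unknown model: {model}")


def growth_factor(before: int, after: int, model: str) -> float:
    """How many times the workload multiplies when users grow from before to after."""
    return workload(after, model) / workload(before, model)


for model in ("linear", "nlogn", "quadratic"):
    factor = growth_factor(10_000, 100_000, model)
    print(f"{model:>9}: 10x the users means {factor:.1f}x the work")
```

Under these assumptions, going from 10,000 to 100,000 users multiplies the work by 10, 12.5, or 100 respectively. With a moderator pool that stays roughly fixed, anything above linear means each moderator’s share keeps growing.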

A major concern should be the bystander effect: the probability of action varies inversely with the number of people involved. In a situation where there are fewer moderators, each one is more likely to act as there is less of a feeling of “someone else will do this.” If this is the case, then the answer might actually be fewer moderators, as strange as that sounds.

There is also the problem of work division. A task which is viewed as large or difficult will generally engender more procrastination or, if the task is optional, outright refusal and non-compliance. On the other hand, if a task is seen as trivially simple, it will often be picked off quickly and easily. A step which might ease the imbalance is work distribution: alert each moderator to only a handful of flags at a given time, but alert different moderators to different flags. Assigning specific flags to specific individuals will also work to minimize the bystander effect.
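The distribution idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name, the moderator names, and the cap of a few flags per moderator are all hypothetical, and nothing here reflects how SE actually routes flags.

```python
from collections import defaultdict
from itertools import cycle


def assign_flags(flag_ids, moderators, max_per_moderator=5):
    """Deal flags out round-robin so each moderator sees a small, distinct batch.

    Returns (assignments, backlog): assignments maps moderator -> list of flag
    ids, and backlog holds the flags left over once everyone is at the cap.
    """
    if not moderators:
        raise ValueError("need at least one moderator")
    assignments = defaultdict(list)
    backlog = []
    capacity = max_per_moderator * len(moderators)
    rotation = cycle(moderators)
    for index, flag_id in enumerate(flag_ids):
        if index < capacity:
            # Each flag goes to exactly one moderator, so nobody can assume
            # that someone else will handle it.
            assignments[next(rotation)].append(flag_id)
        else:
            # Everyone is at the cap; hold the rest for the next round.
            backlog.append(flag_id)
    return dict(assignments), backlog


assignments, backlog = assign_flags(
    range(1, 13), ["mod_a", "mod_b", "mod_c"], max_per_moderator=3
)
for moderator, flags in assignments.items():
    print(moderator, flags)
print("held back:", backlog)
```

Each moderator ends up with a short list that belongs to nobody else, which addresses both the “this is too big” and the “someone else will do it” reactions. The held-back flags would be dealt out in a later round as batches are cleared.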

It was proposed that additional information might prove useful as a tool to help make the job of the moderator easier. This could actually mean two things, both of which are beneficial. First, it could mean training moderators up front, thereby increasing the amount that moderators have invested and making them more interested in seeing the job done well. Second, it could mean that users would be required to explain why they raised a flag. This would shift the signal-to-noise ratio further in favor of the moderators.

I, personally, suggested that it might be a good idea to consider giving the non-moderator users who have sufficient privilege the authority to act more like moderators. The benefit to this would be that it would mean that flags could be resolved by anyone with the reputation and inclination. I admit, I am not familiar with how difficult this might be to implement.

I think it goes without saying that neither Joel nor Jeff will support the idea of bringing in paid moderators. While it is inevitable that a large enough sociological ecosystem will eventually need to have dedicated maintainers (professional firemen/paramedics will generally replace amateurs and volunteers as the serviced area increases), hiring a paid monitor to work on a non-meta site could be seen as a major blow to community involvement.

Gamification, although it is SE’s bread and butter[4], does not seem like it would be a viable solution here. I mean, can you think of what the game might be? First one to resolve one hundred flags correctly wins! Talk about unintended consequences.

Games are not fun because they’re games, but when they are well-designed[5]

All told, I think that this is a minor issue for the moment. In a worst-case scenario, I firmly believe that if our angel moderator were to disappear tomorrow, there would be sufficient interest in SO to make sure that it was maintained.

That is, to be honest, the most important, and most comforting, point here. The community is invested. Those of us who have gotten to the highest tiers really want the site to succeed, and we really believe in it, to the point where if the entire moderator team disappeared a good number of us would be willing to step in and fix our little corner of the ‘net. This is another case where Jeff and Joel are seeking out where the next problem might be and moving to solve it before it starts to hinder SE at large. It isn’t “the squeaky wheel gets the grease”, it is, “I think this wheel is going to squeak. How can we stop that?” It will be interesting to see the solution to this one.

[1]Actually, I believe Aristotle’s formulation was the exact opposite: ultimate power lay in the ability to create — cf. Ethics IX

[2]Users below 200 reputation are not counted as a part of the leagues. 200 is also the threshold where the ads disappear. This does have some implications as to how one separates the different classes of user; it might be better to say that there are ten levels of user and discount everything below 200, but that is not the point of the article.

[3]Occupy Wall Street comments not welcome at this time.

[4]And you know what they say about people with hammers…

[5] Sebastian Deterding
