Edit 2: Oh God Oh God 4chan has Robot9000. sup /r9k/. Have fun with the bot and do one last barrel roll for me.
Edit: As expected, with the huge flood of new traffic after this post went up, the channel is full of new folks coming in and playing with the bot. This is unavoidable and expected for these first few days, and ROBOT9000 is actually controlling the noise pretty well. Still, #xkcd-signal is a social channel — if you just want to play games with the moderator/concept, please use #moderator-sandbox. Thanks!
#xkcd has had about 250 chatters these days. Large communities suck. This problem is hard to solve, but we’ve come up with a fun attack on it — enforced originality (in a very narrow sense). My friend zigdon and I have put together an auto-moderation system in an experimental channel, #xkcd-signal, and it seems to work well, so we invite you all to take part.
When social communities grow past a certain point (Dunbar’s Number?), they start to suck. Be they sororities or IRC channels, there’s a point where they get big enough that nobody knows everybody anymore. The community becomes overwhelmed with noise from various small cliques and floods of obnoxious people and the signal-to-noise ratio eventually drops to near-zero — no signal, just noise. This has happened to every channel I’ve been on that started small and slowly got big.
There are a couple of standard ways to deal with this, and each one has problems. Here’s an outline of the major approaches (skip down if you just want to read about ROBOT9000):
- Strict entry requirements: This is the secret club/sorority approach. You can vet every new person before they’re allowed to speak. This sucks. It reminds me of Feynman’s comment on resigning from the National Academy of Sciences: he said that he saw no point in belonging to an organization that spent most of its time deciding who to let in. The problems are apparent during sorority rush week on college campuses. Not only is the question of who does the vetting (and how) difficult, but the drama reaches horrifying levels as bitter counter-cliques rise up and do battle.
- Moderators: This is the approach IRC channels and forums usually take. You designate a few ‘good’ people who can deal with noise as it happens, by muting, kicking, banning, or editing content as need be. There are a couple of problems here: the circle of moderators has to grow with the community. It eventually becomes fairly large, with complicated dynamics of its own, and the process of choosing moderators leads to sorority/NAS-esque drama and general obnoxiousness. I don’t like the elitism that inevitably develops, and prefer more egalitarian systems.
- Running peer-moderation: When it’s possible, this is a good approach. It’s used to great effect on comment threads, with Slashdot pioneering the whole thing and sites like reddit stripping it down to an effective core. But it doesn’t work very well for live time-dependent things like IRC channels.
- Splinter communities: This has happened on most IRC channels I’ve been on — small invite-only side channels sprout up with particular focuses. Often, the older core members of the community go off to create their own high-signal channel, which is generally kept quiet. But this is limited — it lacks the open mixing of the internet that often makes online communities work.
I was trying to decide what made a channel consistently enjoyable. A common factor in my favorite hangouts seemed to be a focus on original and unpredictable content on each line. It didn’t necessarily need to be useful, just interesting. I started trying to think of ways to encourage this.
And then I had an idea — what if you were only allowed to say sentences that had never been said before, ever? A bot with access to the full channel logs could kick you out when you repeated something that had already been said. There would be no “all your base are belong to us”, no “lol”, no “asl”, no “there are no girls on the internet”. No “I know rite”, no “hi everyone”, no “morning sucks.” Just thoughtful, full sentences.
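For the curious, the core check can be sketched in a few lines of Python. This is a rough illustration of the idea and not zig’s actual Perl code: the class name `OriginalityFilter` and its `check` method are made up for this example, it matches lines exactly, and it keeps everything in an in-memory set where a real bot would more likely use a database.

```python
class OriginalityFilter:
    """Hypothetical sketch: remember every line ever said, reject repeats."""

    def __init__(self, past_lines=()):
        # Seed the filter with the existing channel logs (assumed to be
        # available as an iterable of strings).
        self.seen = set(past_lines)

    def check(self, line):
        """Return True if the line is new (and record it), False if it is a repeat."""
        if line in self.seen:
            return False
        self.seen.add(line)
        return True


# Example: seed with a few lines from the "logs", then test new messages.
robot = OriginalityFilter(["lol", "hi everyone", "morning sucks"])
first_try = robot.check("lol")                                    # already in the logs
second_try = robot.check("I slept through two alarms today")      # never said before
third_try = robot.check("I slept through two alarms today")       # now it has been said
```

A `False` result is where the bot would step in. What it does at that point (kick, mute, warn) is a separate policy decision.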
There are a few obvious questions/objections, and I think each of them has been answered by experiment:
Q: Can’t you just tack a random set of letters on the end to ensure your line is unique (or misspell things, add in gibberish, etc)?
A: Of course. The moderator has plenty of holes if you’re acting in bad faith. But if you’re doing that, why are you in the channel at all? Folks who persist in doing this anyway earn (like any spammers) a prompt manual ban.
Q: Won’t it get harder and harder to chat as lines get “used up”?
A: You underestimate the number of possible sentences. We’ve been working off two years (2 million lines) of logs, and it’s not very hard at all. I expect the channel will be able to run for at least a decade before it becomes a problem, and probably long past that.
Q: What about common parts of conversation, like “yeah” and the like?
A: Surprisingly, it doesn’t seem to be a huge problem. In some cases, they can be done without entirely, and in others, you’re just forced to elaborate a little bit on what you’re agreeing with and why.
I talked it over with zigdon, a Perl guru, and he coded it up. We called the project ROBOT9000 (the most generic, unoriginal name for a bot that we could think of). Then we started a sister channel to #xkcd and put the bot in it. #xkcd-signal has been running for the last couple weeks (using the last two years of #xkcd logs) with about 60 reasonably active chatters, and it’s working beautifully — good, solid chat between relative strangers, with very little noise. (We’ll see how it handles the influx of people as we announce the experiment to the wider net.)
In zig’s implementation, the moderator bot mutes (-v) chatters for a period after every violation. The mute time starts at two seconds and quadruples with each subsequent violation, so you have five or six tries to get the hang of it. Your mute-time decays by half every six hours (we’re still tweaking the parameters). When looking for matches, the bot ignores punctuation, case, and nicks.
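Those rules can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the published Perl bot. The names `normalize` and `MuteTracker` are invented here, the decay is applied in whole six-hour steps (the real bot may decay differently), the mute never drops below the two-second starting value, and nicks are dropped by simple token matching against a supplied list.

```python
import string

BASE_MUTE_SECONDS = 2            # first violation: two seconds
GROWTH_FACTOR = 4                # quadruples with each violation
HALF_LIFE_SECONDS = 6 * 60 * 60  # halves every six hours

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(line, nicks=()):
    """Lowercase, strip ASCII punctuation, and drop tokens that are known nicks."""
    nick_set = {n.lower().translate(_PUNCT_TABLE) for n in nicks}
    words = []
    for token in line.lower().split():
        stripped = token.translate(_PUNCT_TABLE)
        if stripped and stripped not in nick_set:
            words.append(stripped)
    return " ".join(words)


class MuteTracker:
    """Hypothetical per-nick penalty tracker; times are plain seconds."""

    def __init__(self):
        self._state = {}  # nick -> (next mute length, time of last violation)

    def violation(self, nick, now):
        """Record a violation at time `now`; return the mute length in seconds."""
        mute, last = self._state.get(nick, (BASE_MUTE_SECONDS, now))
        # Assumption: decay in whole six-hour steps, floored at the base value.
        halvings = int((now - last) // HALF_LIFE_SECONDS)
        mute = max(BASE_MUTE_SECONDS, mute / (2 ** halvings))
        self._state[nick] = (mute * GROWTH_FACTOR, now)
        return mute


tracker = MuteTracker()
mutes = [tracker.violation("somenick", t) for t in range(6)]
```

With these parameters, six violations in quick succession give mutes of 2, 8, 32, 128, 512, and 2048 seconds, which is why five or six tries is about the practical limit before the mute runs past half an hour.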
The big problem we ran into, actually, was meta-discussion overwhelming the channel. Every new person wanted to speculate about the rules and their effect, and every violation was followed by a long postmortem. At first, we had a scoreboard showing who was the best at talking without violation, but this quickly turned into a competition, destroying actual chat. When we took down the scoreboard and banished meta-discussion of the channel to #meta-discussion, everything worked out nicely. (And, of course, for discussion of the concept of #meta-discussion people had to go to #meta-meta-discussion, and for chat about how silly that whole idea was, we created #meta-meta-meta-discussion …)
You’re welcome to come hang out with us. The moderator bot is running in #xkcd-signal on Foonetic (irc.foonetic.net or irc.xkcd.com). But again, it’s a social channel; take discussion of the concept to #meta-discussion.
If you’d like to run this bot in your own channel, zig has published an initial version of the code here:
http://irc.peeron.com/xkcd/ROBOT9000.html (Perl bot, SQL skeleton, Changelog)