ROBOT9000 and #xkcd-signal: Attacking Noise in Chat

Edit 2: Oh God Oh God 4chan has Robot9000. sup /r9k/. Have fun with the bot and do one last barrel roll for me.

Edit: As expected, with the huge flood of new traffic after this post went up, the channel is full of new folks coming in and playing with the bot. This is unavoidable and expected for these first few days, and ROBOT9000 is actually controlling the noise pretty well. Still, #xkcd-signal is a social channel — if you just want to play games with the moderator/concept, please use #moderator-sandbox. Thanks!

#xkcd has about 250 chatters these days. Large communities suck. This problem is hard to solve, but we’ve come up with a fun attack on it: enforced originality (in a very narrow sense). My friend zigdon and I have put together an auto-moderation system in an experimental channel, #xkcd-signal, and it seems to work well, so we invite you all to take part.

When social communities grow past a certain point (Dunbar’s Number?), they start to suck. Be they sororities or IRC channels, there’s a point where they get big enough that nobody knows everybody anymore. The community becomes overwhelmed with noise from various small cliques and floods of obnoxious people and the signal-to-noise ratio eventually drops to near-zero — no signal, just noise. This has happened to every channel I’ve been on that started small and slowly got big.

There are a couple of standard ways to deal with this, and each one has problems. Here’s an outline of the major approaches (skip down if you just want to read about ROBOT9000):

  • Strict entry requirements: This is the secret club/sorority approach. You can vet every new person before they’re allowed to speak. This sucks. It reminds me of Feynman’s comment on resigning from the National Academy of Sciences: he said that he saw no point in belonging to an organization that spent most of its time deciding who to let in. The problems are apparent during sorority rush week on college campuses. Not only is the question of who does the vetting (and how) difficult, but the drama reaches horrifying levels as bitter counter-cliques rise up and do battle.
  • Moderators: This is the approach IRC channels and forums usually take. You designate a few ‘good’ people who can deal with noise as it happens, by muting, kicking, banning, or editing content as need be. There are a couple of problems here: the circle of moderators has to grow with the community. It eventually becomes fairly large, with complicated dynamics of its own, and the process of choosing moderators leads to sorority/NAS-esque drama and general obnoxiousness. I don’t like the elitism that inevitably develops, and prefer more egalitarian systems.
  • Running peer-moderation: When it’s possible, this is a good approach. It’s used to great effect on comment threads, with Slashdot pioneering the whole thing and sites like reddit stripping it down to an effective core. But it doesn’t work very well for live time-dependent things like IRC channels.
  • Splinter communities: This has happened on most IRC channels I’ve been on — small invite-only side channels sprout up with particular focuses. Often, the older core members of the community go off to create their own high-signal channel, which is generally kept quiet. But this is limited — it lacks the open mixing of the internet that often makes online communities work.

I was trying to decide what made a channel consistently enjoyable. A common factor in my favorite hangouts seemed to be a focus on original and unpredictable content on each line. It didn’t necessarily need to be useful, just interesting. I started trying to think of ways to encourage this.

And then I had an idea — what if you were only allowed to say sentences that had never been said before, ever? A bot with access to the full channel logs could kick you out when you repeated something that had already been said. There would be no “all your base are belong to us”, no “lol”, no “asl”, no “there are no girls on the internet”. No “I know rite”, no “hi everyone”, no “morning sucks.” Just thoughtful, full sentences.
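For the programmers in the audience, the core check can be sketched in a few lines of Python. This is only an illustration, assuming the whole log fits in memory as a set of strings; the class and method names are made up here, and it is not zigdon's actual code.

```python
class SeenLines:
    """Remembers every line ever said and flags repeats (illustrative only)."""

    def __init__(self, log_lines=()):
        # Seed the memory with the existing channel logs, one string per line.
        self.seen = set(log_lines)

    def is_original(self, line):
        """Return True and record the line if it is new, otherwise False."""
        if line in self.seen:
            return False
        self.seen.add(line)
        return True


history = SeenLines(["lol", "hi everyone"])
print(history.is_original("lol"))                            # False
print(history.is_original("I just taught my cat to fetch"))  # True
print(history.is_original("I just taught my cat to fetch"))  # False
```

A set makes each lookup roughly constant-time, so the check stays cheap even with millions of logged lines. A real bot would presumably keep the lines in a database rather than in memory.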

There are a few obvious questions/objections, and I think each of them has been answered by experiment:

Q: Can’t you just tack a random set of letters on the end to ensure your line is unique (or misspell things, add in gibberish, etc)?

A: Of course. The moderator has plenty of holes if you’re acting in bad faith. But if you’re doing that, why are you in the channel at all? Folks who persist in doing this anyway earn (like any spammers) a prompt manual ban.

Q: Won’t it get harder and harder to chat as lines get “used up”?

A: You underestimate the number of possible sentences. We’ve been working off two years (2 million lines) of logs, and it’s not very hard at all. I expect the channel will be able to run for at least a decade before it becomes a problem, and probably long past that.

Q: What about common parts of conversation, like “yeah” and the like?

A: Surprisingly, it doesn’t seem to be a huge problem. In some cases, they can be done without entirely, and in others, you’re just forced to elaborate a little bit on what you’re agreeing with and why.

I talked it over with zigdon, a Perl guru, and he coded it up. We called the project ROBOT9000 (the most generic, unoriginal name for a bot that we could think of). Then we started a sister channel to #xkcd and put the bot in it. #xkcd-signal has been running for the last couple weeks (using the last two years of #xkcd logs) with about 60 reasonably active chatters, and it’s working beautifully — good, solid chat between relative strangers, with very little noise. (We’ll see how it handles the influx of people as we announce the experiment to the wider net.)

In zig’s implementation, the moderator bot mutes (-v) chatters for a period after every violation. The mute time starts at two seconds and quadruples with each subsequent violation, so you have five or six tries to get the hang of it. Your mute-time decays by half every six hours (we’re still tweaking the parameters). When looking for matches, the bot ignores punctuation, case, and nicks.
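Here is a rough Python sketch of those two rules, the matching and the mute timing. The two seconds, the quadrupling, and the six-hour halving come from the description above; everything else is an assumption for illustration: the names normalize and MuteTracker, the choice to apply the decay in whole six-hour steps counted from the last violation, and the way nicks are recognized. It is not the Perl bot's real logic.

```python
import string

BASE_MUTE = 2.0        # seconds (from the post)
GROWTH = 4             # mute quadruples per violation (from the post)
HALF_LIFE = 6 * 3600   # mute halves every six hours (from the post)

# Translation table that deletes ASCII punctuation only (an assumption;
# curly quotes and other Unicode punctuation are left alone).
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(line, nicks=()):
    """Lowercase the line, drop nicks and punctuation, collapse whitespace."""
    nick_set = {n.lower() for n in nicks}
    kept = []
    for token in line.lower().split():
        if token.rstrip(":,") in nick_set:
            continue  # skip addressed nicks such as "zigdon:"
        token = token.translate(_STRIP_PUNCT)
        if token:
            kept.append(token)
    return " ".join(kept)


class MuteTracker:
    """Tracks per-nick mute lengths (hypothetical helper, not the real bot)."""

    def __init__(self):
        self._state = {}  # nick -> (last mute length, time of last violation)

    def violation(self, nick, now):
        """Return the mute length in seconds for a violation at time `now`."""
        last_mute, last_time = self._state.get(nick, (0.0, now))
        # Assumed decay model: one halving per full six hours elapsed.
        halvings = int((now - last_time) // HALF_LIFE)
        decayed = last_mute * 0.5 ** halvings
        mute = max(BASE_MUTE, decayed * GROWTH)
        self._state[nick] = (mute, now)
        return mute


print(normalize("zigdon: I know, RIGHT?!", nicks=["zigdon"]))  # i know right
tracker = MuteTracker()
print([tracker.violation("newbie", now=0) for _ in range(6)])
# [2.0, 8.0, 32.0, 128.0, 512.0, 2048.0]
```

Six violations in a row already cost about 34 minutes of silence on the last one, which is where the "five or six tries" comes from.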

The big problem we ran into, actually, was meta-discussion overwhelming the channel. Every new person wanted to speculate about the rules and their effect, and every violation was followed by a long postmortem. At first, we had a scoreboard showing who was the best at talking without violation, but this quickly turned into a competition, destroying actual chat. When we took down the scoreboard and banished meta-discussion of the channel to #meta-discussion, everything worked out nicely. (And, of course, for discussion of the concept of #meta-discussion people had to go to #meta-meta-discussion, and for chat about how silly that whole idea was, we created #meta-meta-meta-discussion …)

You’re welcome to come hang out with us. The moderator bot is running in #xkcd-signal on Foonetic (irc.foonetic.net or irc.xkcd.com). But again, it’s a social channel; take discussion of the concept to #meta-discussion.

If you’d like to run this bot in your own channel, zig has published an initial version of the code here:

http://irc.peeron.com/xkcd/ROBOT9000.html (Perl bot, SQL skeleton, Changelog)

1,275 thoughts on “ROBOT9000 and #xkcd-signal: Attacking Noise in Chat”

  1. In reference to xkcd.com/37:
    “That was one helluva lame-ass party!”
    =
    “That was one helluva lame ass-party!”

  2. Where’d the source go? I’m interested in implementing a similar idea for my instant messenger (pidgin) and would like to have a good idea where to start.

  4. /r9k/ really stinks.
    not because of the concept, but because it attracted the wrong element.
    please ask moot to remove it.

  5. Pingback: grubbN blog » Blog Archive » K9-bot, forced originality

  6. Hah! R9K sounds like a fantastically fun concept, though I think you might want to set a decay time on the logs so that some previous messages will be allowed again every now and then?

  7. I’m a little bit obsessed with reading xkcd.
    It’s quite alarming actually; it has crept up to the top of my ‘fun-things-to-do-when-procrastinating-doing-schoolwork-to-a-dangerous-extent’ list.

    Incidentally, I should probably get back to my assignment. :/

    But I just thought Randall ought to know that it’s gotten to that stage where he is the same to me as Nathan Fillion/Summer Glau were to that guy in comics #577 – 581.

    (Well, not quite to that extent. That was sort of creepy. But you get the idea.)

    Oh, and Firefly is TRULY awesome. :)

  8. I’m about to implement Robot9000 on the forum I’m working on, although a “nicer” version.

  9. This is a really interesting concept, because I am a linguistics student studying Internet slang. A great deal of my research is based on the fact that such slang – really, such an entire way of speaking – develops because of conversational interactivity. Part of the reason we can all say ‘asl’, of course, is because however obnoxious, we all know what it means and we can repeat it on command and riff off of it. But if we are barred from saying things that have already been said, then the kind of slang that is characteristic of the Internet – slinging catchphrases and chopping up sentences into smaller phrases sent over multiple messages – becomes impossible. Zounds!

    It also offers an interesting challenge to the concept of recursion and the generation of infinite unique utterances. Good job, Randall.

  12. What a fun idea!
    I know that I noticed this a long time after publishing, but I couldn’t resist.
    In general, this is practical social psychology.
    Give a social channel a set of rules, and they’ll:
    a) Leave, if they don’t like the rules.
    b) Play with the rules, to see what’s allowed.
    c) Find ways to go around the rules.
    d) Keep silent.
    And I’m sure there are many others.

  13. Pingback: What are the most innovative ways of handling comments on the web? - Quora

  14. /r9k/ has now been removed from 4chan. I guess it works better as an IRC filter, rather than an imageboard/forum filter.

  15. Thank you for the article. Can I share that article on my website? If you allow, please contact with me via my email address.

  16. Randall, I thought you might like to know… I attended a dissertation defense (biology) yesterday wherein the candidate used the xkcd herpetology comic as a slide. Moreover, her master’s work involved her shooting porpoises from a speedboat with a crossbow in order to collect tissue samples. Yeah, biologists have a lot of fun.

  17. I cannot understand what this article is all about but I really do like to put my comments over here. I really like interacting with other people.

    Thanks guys.

  18. My blog has opened so many doors for me and has helped me land quite a few jobs. By being immersed in writing and showing initiative, having a blog has been a great platform and portfolio. And I would recommend to anyone looking to start or build a portfolio to have a blog.
