ROBOT9000 and #xkcd-signal: Attacking Noise in Chat

Edit 2: Oh God Oh God 4chan has Robot9000. soup /r9k/. Have fun with the bot and do one last barrel roll for me.

Edit: As expected, with the huge flood of new traffic after this post went up, the channel is full of new folks coming in and playing with the bot. This is unavoidable and expected for these first few days, and ROBOT9000 is actually controlling the noise pretty well. Still, #xkcd-signal is a social channel — if you just want to play games with the moderator/concept, please use #moderator-sandbox. Thanks!

#xkcd has had about 250 chatters these days. Large communities suck. This problem is hard to solve, but we’ve come up with a fun attack on it — enforced originality (in a very narrow sense). My friend zigdon and I have put together an auto-moderation system in an experimental channel, #xkcd-signal, and it seems to work well, so we invite you all to take part.

When social communities grow past a certain point (Dunbar’s Number?), they start to suck. Be they sororities or IRC channels, there’s a point where they get big enough that nobody knows everybody anymore. The community becomes overwhelmed with noise from various small cliques and floods of obnoxious people and the signal-to-noise ratio eventually drops to near-zero — no signal, just noise. This has happened to every channel I’ve been on that started small and slowly got big.

There are a couple of standard ways to deal with this, and each one has problems. Here’s an outline of the major approaches (skip down if you just want to read about ROBOT9000):

  • Strict entry requirements: This is the secret club/sorority approach. You can vet every new person before they’re allowed to speak. This sucks. It reminds me of Feynman’s comment on resigning from the National Academy of Sciences: he said that he saw no point in belonging to an organization that spent most of its time deciding who to let in. The problems are apparent during sorority rush week on college campuses. Not only is the question of who does the vetting (and how) difficult, but the drama reaches horrifying levels as bitter counter-cliques rise up and do battle.
  • Moderators: This is the approach IRC channels and forums usually take. You designate a few ‘good’ people who can deal with noise as it happens, by muting, kicking, banning, or editing content as need be. There are a couple of problems here: the circle of moderators has to grow with the community. It eventually becomes fairly large, with complicated dynamics of its own, and the process of choosing moderators leads to sorority/NAS-esque drama and general obnoxiousness. I don’t like the elitism that inevitably develops, and prefer more egalitarian systems.
  • Running peer-moderation: When it’s possible, this is a good approach. It’s used to great effect on comment threads, with Slashdot pioneering the whole thing and sites like reddit stripping it down to an effective core. But it doesn’t work very well for live time-dependent things like IRC channels.
  • Splinter communities: This has happened on most IRC channels I’ve been on — small invite-only side channels sprout up with particular focuses. Often, the older core members of the community go off to create their own high-signal channel, which is generally kept quiet. But this is limited — it lacks the open mixing of the internet that often makes online communities work.

I was trying to decide what made a channel consistently enjoyable. A common factor in my favorite hangouts seemed to be a focus on original and unpredictable content on each line. It didn’t necessarily need to be useful, just interesting. I started trying to think of ways to encourage this.

And then I had an idea — what if you were only allowed to say sentences that had never been said before, ever? A bot with access to the full channel logs could kick you out when you repeated something that had already been said. There would be no “all your base are belong to us”, no “lol”, no “asl”, no “there are no girls on the internet”. No “I know rite”, no “hi everyone”, no “morning sucks.” Just thoughtful, full sentences.
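
To make the idea concrete, here is a minimal sketch in Python of the core check. It is not zigdon's code (his bot is written in Perl); the class name OriginalityFilter and its methods are made up for illustration, and it only does exact matching against lines it has already seen.

```python
class OriginalityFilter:
    """Hypothetical sketch: remember every line ever said and flag repeats."""

    def __init__(self, past_lines=()):
        # Seed the filter with whatever channel logs already exist.
        self.seen = set(past_lines)

    def is_original(self, line):
        """Return True (and record the line) only if it has never been said."""
        if line in self.seen:
            return False
        self.seen.add(line)
        return True


bot = OriginalityFilter(past_lines=["lol", "hi everyone"])
print(bot.is_original("lol"))                                # already in the logs
print(bot.is_original("the new comic made me miss my bus"))  # new, so allowed
print(bot.is_original("the new comic made me miss my bus"))  # now used up
```

A set keeps the lookup cheap even with millions of logged lines, which is the only reason this sketch uses one; a real bot would more likely keep the lines in a database.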

There are a few obvious questions/objections, and I think each of them has been answered by experiment:

Q: Can’t you just tack a random set of letters on the end to ensure your line is unique (or misspell things, add in gibberish, etc)?

A: Of course. The moderator has plenty of holes if you’re acting in bad faith. But if you’re doing that, why are you in the channel at all? Folks who persist in doing this anyway earn (like any spammers) a prompt manual ban.

Q: Won’t it get harder and harder to chat as lines get “used up”?

A: You underestimate the number of possible sentences. We’ve been working off two years (2 million lines) of logs, and it’s not very hard at all. I expect the channel will be able to run for at least a decade before it becomes a problem, and probably long past that.

Q: What about common parts of conversation, like “yeah” and the like?

A: Surprisingly, it doesn’t seem to be a huge problem. In some cases, they can be done without entirely, and in others, you’re just forced to elaborate a little bit on what you’re agreeing with and why.

I talked it over with zigdon, a Perl guru, and he coded it up. We called the project ROBOT9000 (the most generic, unoriginal name for a bot that we could think of). Then we started a sister channel to #xkcd and put the bot in it. #xkcd-signal has been running for the last couple weeks (using the last two years of #xkcd logs) with about 60 reasonably active chatters, and it’s working beautifully — good, solid chat between relative strangers, with very little noise. (We’ll see how it handles the influx of people as we announce the experiment to the wider net.)

In zig’s implementation, the moderator bot mutes (-v) chatters for a period after every violation. The mute time starts at two seconds and quadruples with each subsequent violation, so you have five or six tries to get the hang of it. Your mute-time decays by half every six hours (we’re still tweaking the parameters). When looking for matches, the bot ignores punctuation, case, and nicks.
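
Here is a rough sketch in Python of those two rules, the matching and the mute schedule. It is an illustration under assumptions, not zig's Perl code: the function names are invented, nicks are assumed to appear as whole words, and the decay is assumed to happen in discrete six-hour steps (the post only says the mute time halves every six hours).

```python
import string

BASE_MUTE = 2.0        # seconds for a first violation (from the post)
GROWTH = 4             # mute time quadruples per violation (from the post)
HALF_LIFE = 6 * 3600   # mute time halves every six hours (from the post)

_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(line, nicks=()):
    """Lowercase the line, strip punctuation, and drop words that are nicks."""
    nick_set = {n.lower().translate(_PUNCT) for n in nicks}
    words = line.lower().translate(_PUNCT).split()
    return " ".join(w for w in words if w not in nick_set)


def next_mute(previous_mute, elapsed):
    """Mute length for a new violation.

    previous_mute: length of the last mute in seconds, or None if there was none.
    elapsed: seconds since that last violation.
    Assumption: the stored value halves once per full six-hour period.
    """
    if previous_mute is None:
        return BASE_MUTE
    decayed = previous_mute / (2 ** int(elapsed // HALF_LIFE))
    return max(BASE_MUTE, decayed * GROWTH)


# Two lines that differ only in case, punctuation, and nick count as a match.
print(normalize("zigdon: Hello, World!", nicks=["zigdon"]))
print(normalize("hello world"))

# Six violations in a row with no decay in between.
mute = None
for _ in range(6):
    mute = next_mute(mute, elapsed=0)
    print(mute)
```

With these numbers the sixth back-to-back violation costs 2048 seconds, a bit over half an hour, which fits the "five or six tries" in the post.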

The big problem we ran into, actually, was meta-discussion overwhelming the channel. Every new person wanted to speculate about the rules and their effect, and every violation was followed by a long postmortem. At first, we had a scoreboard showing who was the best at talking without violation, but this quickly turned into a competition, destroying actual chat. When we took down the scoreboard and banished meta-discussion of the channel to #meta-discussion, everything worked out nicely. (And, of course, for discussion of the concept of #meta-discussion people had to go to #meta-meta-discussion, and for chat about how silly that whole idea was, we created #meta-meta-meta-discussion …)

You’re welcome to come hang out with us. The moderator bot is running in #xkcd-signal on Foonetic. But again, it’s a social channel; take discussion of the concept to #meta-discussion.

If you’d like to run this bot in your own channel, zig has published an initial version of the code here: (Perl bot, SQL skeleton, Changelog)

644 thoughts on “ROBOT9000 and #xkcd-signal: Attacking Noise in Chat”

  1. Hmm..Randall, I thought you might like to know… I attended a dissertation defense (biology) yesterday wherein the candidate used the xkcd herpetology comic as a slide. Moreover, her master’s work involved her shooting porpoises from a speedboat with a crossbow in order to collect tissue samples. Yeah, biologists have a lot of fun.

  4. I just wanted to thank Randall for all of the XKCD comic strips; they are currently tied with Calvin and Hobbes in my mind. (This should be taken as a compliment, on the off chance you don’t love Calvin and Hobbes.)

  7. When i see to the unblocked games i am very wonderful .me and my family likes the unblocked games websites. I hartly agree with this site.
    children very interest the games.Some children are not tolerable to play games at their residence by their father and mother. As a why, they try to access unblocked games at their school when their teacher is giving lectures. There are some advantages of surfing the online worlds. This is very friendly websites.

  8. Would not want to tie Calvin and Hobbes anyplace, especially in someone’s mind.
    Hoping this is ok place to comment on xkcd what if.

    re: train loop Just make shorter train with wheels nearer the ends and 200meter radius works out ok, with something like 5 or 6 g’s. Put the passengers in appropriately shaped water tanks and you can do loops until the fuel runs out.
    Re getting Voyager back:
    -What about turning and/or braking using interstellar magnetic field and built up charge (maybe using that ion drive to do it?)
    -What about Orion project type spacecraft? You’re willing to do all sorts of other crazy things that are not politically feasible, after all. Significant fractions of C attainable.
    -A simpler method: A one ton (more or less) steel hatch was observed to be accelerated to something like 6X escape velocity. If one could design a spacecraft stronger than steel, within, perhaps, a uranium shell (or some other high temp material, but we know how to do it with uranium, unfortunately), you might get the acceleration done all at once. Maybe.
    Really liked the one about the longest sunset. I would not have been patient enough to chase that one down.
    BTW, I failed on my first trial with the Turing test.

  10. Dear xkcd.

    I am a fan of yours and I use your comics in a class I co-teach with Fred Goldhaber on quantum mechanics in popular culture. Last month Physics World, where I am a columnist, reprinted one of your cartoons, at my insistence, to accompany one of my columns.

    At present we are writing a book on the subject, to be published by Norton. 7500 copies in the first printing. We would like to use a few of your wonderful cartoons in various chapters. I know your web page says that we can use the cartoons freely for not-for-profit use, and this kind of book is sort of not for profit. Minuscule profit, perhaps. In any case, we wanted to check. Here’s what we want to use:

    “Science: it works, bitches” (54)
    “Pauli Sexclusion Principle” (658)
    “Prairie” (967)
    “Quantum Teleportation” (465)
    “Christmas Plans” (679)
    “Schrodinger” (45)

    Regarding 967, the book is alas in black and white, so the amber will be lost. Still hysterically funny to us. Can we use these?

    Very sincerely,

    Robert P. Crease
    Professor, Department of Philosophy
    Stony Brook University

  11. I was one of the originators of chat (an accident on a BB called The Well: one night we realized that posts on a thread could be immediately answered and the result would be a scrolling dialogue) and helped start Apple’s eWorld chat service (AOL chat was a stub of their game service at the time).

    Eventually worked on AOL chat – we developed a robust noise-suppression structure (thousands of hosts and “guides”).

    IMHO, is currently the intellectual apogee of internet commentary.

    Would be interested in discussing an HTML5 chat service with xkcd…

  12. Fuck this stupid ass bot, it’s ruining Twitch chat. You can type anything because of this cancerous aids nigger shit. It must’ve been made by dirty monkey niggers.

  13. Hi Randall. I’m Lynne. I know you don’t take ideas from this channel, however this one could possibly be considered for exception. A dear friend from San Antonio and I were musing about the scholarly phraseology embedded in the derivatives of “y’all.” For instance perhaps there could be a variation of UML called “Y’all-ML” where a many to many relationship can be represented by a phrase like “Y’all have y’all’s names.” I think I got a hernia laughing about the possibilities. If you like it, the idea is yours to abuse. Tot ziens en veel gluck!

  14. ░░░░░▄▄░░░░▄░░░░▄░░░ ░░░░░░█▄░░░█░░█▀░░░░ ░░░░░░░▀█▄░▀░░░░░░░░ ░░░░▄░░░▄▄███▄░░░▄▄░ ░░░▀▀░░▄█░░█░█▄░░░▀█ ░░░░░░░█░░░░░░█░░░░░ ░░░░░░░████████░░░░░ ░░░░░░░█▄▄░░░░█░░░░░ ░░░░░░░█░░░░▀▀█░░░░░ ░░░░░░░█▀▀▀░▄▄█░░░░░ ░░░░░░░█░░░░░░█▄░░░░ ▄▄▄▄██▀▀░░░░░░░▀██░░ ░▄█▀░▀░░░░▄░░░░░░█▄▄ ▀▀█▄▄▄░░░▄██░░░░▄█░░ ░█▀█▄▄▄▄█▀░██▄▄██▄▄░ ░░░░▀░░░▀░░░▀░░░░░░░

  15. Alright, I am not a programmer by any means, but I have found a major flaw in this reasoning. You say you want “thoughtful, full sentences.” Well I was on a stream, and I commented about analysis regarding an LCS team that made a huge turnaround in the middle of the season. It was eloquent and well-thought-out, because I’m not a douchebag and I don’t like to troll. And when I commented it, I was told that my comment “was not unique.” What is this crap? I worked pretty damn hard to analyze the causes of the upswing of that team, and it doesn’t show up because your robot thinks I’m repeating someone else. I was extremely upset, and it has since made me stop talking in that chat altogether in fear that everything I say will be “not unique” and all of my thoughts will be hidden by your bot. Which, I understand, is exactly what your program does NOT want to accomplish. Please take this into consideration when you think it’s a good idea to hide everything that has ever been said since the dawn of time. Maybe you should gather data about how many times some things are said, and hide the common and uninteresting ones. Don’t rush into battle without knowing your enemy.

  16. I only chat in Chinese with other Chinese-speaking people.
