A Problem

I think I have a Bash problem.  What follows is an actual command from my history.

cat /usr/share/dict/words | fgrep -v "'" | perl -ne 'chomp($_); @b=split(//,$_); print join("", sort(@b))." ".$_."\n";' | tee lookup.txt | perl -pe 's/^([^ ]+) .*/\1/g' | awk '{ print length, $0 }' | sort -n | awk '{$1=""; print $0}' | uniq -c | sort -nr | egrep "^[^0-9]+2 " | awk '{ print length, $0 }' | sort -n | awk '{$1=""; print $0}' | perl -pe 's/[ 0-9]//g' | xargs -i grep {} lookup.txt | perl -pe 's/[^ ]+ //g' | tail -n2

It’s just so hard to bite the bullet, admit that the problem has grown in scope, and move it to its own Perl/Python script.  (P.S. The Guinness Book is wrong.  “Conservationalists” is not a real word.)

Edit: to those who are competing in the comments to improve (shorten) the above command: when pasting code, use the <code> tag to override WordPress quote formatting.
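
For comparison, here is roughly what the standalone version could look like in Python. This is only a sketch: the function name `longest_anagram_pair` is made up for illustration, and it assumes the goal of the pipeline is the longest pair of words that are anagrams of each other, keeping the pipeline's restriction to letter sets shared by exactly two words.

```python
from collections import defaultdict


def longest_anagram_pair(words):
    """Return the longest two words that are anagrams of each other, or None."""
    groups = defaultdict(list)
    for word in words:
        if "'" in word:  # same filter as fgrep -v "'"
            continue
        key = "".join(sorted(word))  # same key as the perl sort(@b) step
        groups[key].append(word)
    # Like the egrep "^[^0-9]+2 " step, keep only keys shared by exactly two words.
    pairs = [ws for ws in groups.values() if len(ws) == 2]
    if not pairs:
        return None
    # Both words in a pair have the same length, so measuring the first is enough.
    return max(pairs, key=lambda ws: len(ws[0]))
```

Passing it the stripped lines of /usr/share/dict/words should give the same kind of answer as the pipeline, though the result depends on which word list is installed.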

Joey Comeau has a new book out based on Overqualified, which has long been one of my favorite things on the internet.  He writes cover letters to companies.  They each sound businesslike enough for the first paragraph or so, and then you gradually realize you are reading something that is in no way a normal cover letter.  An excerpt from one to Nintendo:

We need a new Mario game, where you rescue the princess in the first ten minutes, and for the rest of the game you try and push down that sick feeling in your stomach that she’s “damaged goods”, a concept detailed again and again in the profoundly sex negative instruction booklet, and when Luigi makes a crack about her and Bowser, you break his nose and immediately regret it. When Peach asks you, in the quiet of her mushroom castle bedroom “do you still love me?” you pretend to be asleep. You press the A button rhythmically, to control your breath, keep it even.

#2 (NeoPost), #28 (Phone surveys) and #58 (MySpace) are three of my favorites.

137 replies on “A Problem”

  1. I like the idea of that applicant conducting phone surveys to see what new content should be featured on myspace, if his surveys are anything like the ones he conducts in #28. Obviously, if he opens with “What’s a clitoris?” he’ll close with “Put lesbians on myspace.”

  2. You should really skip the cat and just give the filename straight to fgrep.

    Yes, that’s the first thing I thought upon reading your command. 😛

  3. My grep doesn’t like list entries starting with a “-”. I propose “xargs -i grep -- {} lookup.txt”.

  4. I see you learnt tee (and stripped out some egrep aliases) since I last saw that command. Excellent.

  5. This book concept reminds me a lot of “Letters from a Nut” by Ted L. Nancy. My brother and I used to pore over it when we were younger. Nancy would publish both the bizarre letters he sent to various companies and institutions (with strange hotel room requests or pitching a new product) and the replies he received. This turned into a bit of a series with a couple of books now out following the same format. Joey Comeau’s letters seem more biting, dark, and clever. Nancy’s could sometimes grind as too simple or formulaic. From the examples you’ve given, I don’t expect that will be a problem with “Overqualified”. Thanks for the heads-up…

  6. usually by about the third time i invoke perl in a single command i get the point and just turn the whole thing into a perl one-liner…

  7. Perhaps your problem is that you are trying to fit your task into just one pipeline instead of two:

    lookup.txt;
    join lookup.txt lookup.txt |
    awk '$2<$3 && $1=length($1)' |
    sort -n |
    sed 's/[0-9]* //' |
    tail -n2

  8. Another problem is that you should define what transformations your blog engine does to comment texts… let’s try escaping the redirection signs:

    < /usr/share/dict/words \
    perl -F -lane "next if /'/;"'print join("", sort @F)." $_";' |
    sort > lookup.txt;
    join lookup.txt lookup.txt |
    awk '$2<$3 && $1=length($1)' |
    sort -n |
    sed 's/[0-9]* //' |
    tail -n2

  9. I couldn’t resist golfing the command. Here’s what I came up with:

    fgrep -v \' /usr/share/dict/words | perl -nle 'print length, "\t$_\t", sort split ""' | sort -k 1nr -k 3 | uniq -f 2 -D | head -n 2 | cut -f 2

    Using something like a Guttman–Rosler transform it’s possible to avoid the external look-up file.

  10. If you’re golfing, you’re leaving way too much whitespace and using the long versions of a lot of idioms:

    perl -nle'!/47/&&print length,"\t$_\t",sort split""' /usr/share/dict/word|sort -k1nr -k3|uniq -f2 -d|head -n2|cut -f2

  11. Aside from the inexplicable use of that painful language awk, that looks a lot like command lines I come up with every day.

  12. Any reason to use ‘[^ ]’ rather than ‘\S’?

    And my workmate is laughing at me because that was my only comment on reading the line

  13. Conversationalists is a real word.

    Love,
    Your friendly neighborhood English teacher.

  14. Doh! Conversationalists is a real word, but Conservationalists is not. I am a dyslexic English teacher, it turns out.

  15. I just can’t resist the urge to share that “Bash problem” in Turkish means “quite a problem”

  16. Smylers: nice.

    Sweth: You forgot to cut out words with apostrophes.

    I’m sorry the blog is replacing quotes and such in your commands; try wrapping them with <code> tags (I’ll edit a couple comments to do that).

  17. xkcd: Thanks. I think the !/47/ in Sweth’s is supposed to be !/\047/, which does skip apostrophes (octal ASCII).

    Sweth: Yeah, I was using the term loosely, refactoring it to something that had fewer components but was still readable enough to be a plausible way of doing it. For properly golfing to the fewest characters, the shortest I’ve found is:

    perl -nle /\'/'||print y///c,"\t$_\t",sort split""' /usr/share/dict/words|sort -k1n -k3|uniq -Df2|tail -2|cut -f2

    Specifically that’s: using a literal apostrophe (taking advantage of slashes not needing quoting, so only starting the quoted string after them); avoiding the ! (and the need to quote it) by inverting the && to ||; using y///c as a shorter way of spelling length; avoiding the r in the sort ordering by using tail instead of head; swapping the order of uniq’s options so they can be combined; and omitting the n from tail (using the legacy syntax of specifying the number of lines as an option directly).

    But I think that’s now lost all plausibility for a recommended way of doing this!

  18. Wow, that looks great. Apparently John Campbell of Pictures for Sad Children is reading it too, so if all my favorite webcomic artists are ten I guess I should…

  19. Not knowing awk or perl, I usually end up piping back and forth between bash and sed, using almost only sed’s s command and, with the bash part, echo and for, with the occasional =.
    I think you’ve just expanded my toolbelt.

    CAPTCHA: more pompous

  20. I caught that QC reference in the comic today. Har-har-har.

    $ sudo rm -rf /bin

    (DO NOT WANT)

    Or on Windows:

    C:\> del C:\Windows\System32\HAL.DLL

  21. Also, my school network has no internal security, so my favorite command is this:

    C:\> pskill \\NETWORK_ADDRESS explorer.exe

    Completely destroys their GUI and they have no idea what’s going on.

  22. That book also sounds a LOT like Please Let Me Help by Zack Sternwalker which is actually the funniest zine ever.

  23. Actually, on OS X and BSD I get the words
    cholecystoduodenostomy / duodenocholecystostomy

    My version of the pipeline, keeping original ideas but throwing out useless sorts: (might almost be POSIX, not using any GNU extensions at least, afaict)


    grep -v \' /usr/share/dict/words | perl -ne 'chomp($_); @b=split(//,$_); print join("", sort(@b))." ".$_."\n";' | tee lookup.txt | awk '{ $2=""; print length, $0 }' | sort | uniq -c | awk '/^ *2 / { print length, $3 }' | sort -n | sed -n '$s/^[ 0-9]*//p' | xargs -I '{}' grep {} lookup.txt | cut -d ' ' -f2

    You should make a habit of filtering /before/ lookup (the $ in the sed), this and removing the sorts speeds things up slightly.
    However, I don’t like the idea of grep’ing over something written by an earlier tee, feels uncomfortable and makes me wonder if I might run into odd buffering issues.

  24. I simply had to post because it seems recaptcha is promoting some new, homoerotic S&M sitcom called 2 1/2 manacles

  25. Dude that comic instules everone on the austistc spectrum….
    Just that you may not have a specrum disorder that dosent mean outher linux user with highspeed wongt

    (Raports 4 evar)
    (ps just use a c program to query aspell with you bash issue)

  26. Hanners is my personal favorite 😀
    The Mario excerpt reminds me of the Ramayana, the Hindu myth. Rama fights an entire war to save his wife from an alleged demon-king and after winning the war, he dumps her, believing her to be damaged goods. This is the guy some of my people worship…

  27. is there some proto-sentient AI running amok in here or something?
    have you somehow connected Bucket to the blag script?

  28. Before you execute untrusted obfuscated code from comment threads, you should switch to an unprivileged user, i.e. one of

    sudo su nobody
    sudo -u nobody -s

  29. Joey Comeau is a talented and disturbed young man. His letters are great; there’s always the unspoken, but heavily implied, notion that the writer* grew up in an environment of constant passive-aggressive resentment, or shared an intimate moment with his uncle in the toolshed when he was seven.

    re: xkcd #575 – Do any music applications actually support tagging in the manner suggested by today’s comic? I’d love to be able to label songs as SFW or not, or describe the song’s mood with more than one cookie-cutter phrase, or tag a band with its influences. OS doesn’t really matter; I have Vista, Ubuntu, and Mandriva at my disposal, and maybe Leopard (if I broke the Mac, my wife would kick me out until I bought her a new computer, plus some new games, accessories, and a box of really good chocolates.)

    * I don’t mean Joey Comeau, the author of /Overqualified/, but “Joey Comeau,” the character who applies for each of these jobs.

    captcha: Sunday Cubicles. So yeah, Peter, if you just could go ahead and come in…

  30. #65 sums up everything I like most about Joey Comeau’s writing.

    (My CAPTCHA words are “keep feminist”!)

  31. Just thought I’d pitch in and add that I’m seeing a re-captcha of “marching grande”, which instantly reminded me of strip #389

  32. where do you get YOUR wordlist?

    xkcd’s version:
    real 0m21.943s
    user 0m13.716s
    sys 0m2.693s

    Smylers’ version:
    real 0m2.499s
    user 0m2.373s
    sys 0m0.013s

    Now I know what a GRT is; thanks Smylers! And nice UNIX-foo too.
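
For anyone else meeting the term for the first time: a Guttman–Rosler transform packs the sort keys into the front of each record as a plain string, sorts the records with the default string comparison, and then strips the keys off again. Smylers’ one-liner does something like this with tab-separated fields and `sort`; below is a rough Python sketch of the same idea. The function name `longest_anagrams` and the four-digit zero padding are assumptions made for illustration, not anything from the thread.

```python
from itertools import groupby


def longest_anagrams(words):
    """Return the longest group of mutual anagrams, or [] if there is none."""
    # Transform: prefix each word with its packed sort keys. Zero-padding the
    # length (assumes words shorter than 10000 characters) makes plain string
    # order agree with numeric order.
    records = [
        f"{len(w):04d}\t{''.join(sorted(w))}\t{w}"
        for w in words
        if "'" not in w  # skip words with apostrophes, as the thread does
    ]
    # Sort: default string comparison only, no key function or lookup file.
    records.sort(reverse=True)
    # Anagrams now sit next to each other; group on the prefix (length + letters).
    for _, run in groupby(records, key=lambda r: r.rsplit("\t", 1)[0]):
        # Strip the keys back off to recover the original words.
        found = [r.rsplit("\t", 1)[1] for r in run]
        if len(found) > 1:
            return found
    return []
```

Because the keys travel inside each record, there is nothing to look up afterwards, which is why no lookup.txt is needed.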

Comments are closed.
