Blogging…Now 99.6% Safer Than Surfing!

OK, this is an oldie but goodie from xkcd, but I saw it in a recent presentation and couldn’t resist sharing.

Of course, you’d never know that from the sensationalist press.

By Daniel Tunkelang

High-Class Consultant.

6 replies on “Blogging…Now 99.6% Safer Than Surfing!”

Of course, since that was first published there’s been more than a bit of self-referential measurement error introduced. The current count for “skydiving” results is 633, for “blogging” 652, and for “knitting” more than 10,700.

Whether that indicates high overlap in the sample sets or a deadly wave of recent yarn & needle fatalities, I cannot imagine.


I can’t speak to the rigor of journalists–or bloggers–in obtaining their statistics.

But, in all seriousness, the techniques that some people have discussed for learning from distributional similarity in documents and query logs are strikingly in line with that xkcd comic.

It does make you (or at least me) wonder how to game the log-mining approaches by spamming the query log, much as spammers already exploit distributional similarity to create documents that get past spam filters.


Seth Finkelstein–haven’t seen him since I was an undergrad at MIT! But I do agree with him and Dave Weinberger that Google’s counts are systematically skewed, though it’s not clear what causes that skew. Mr. Schmidt, tear down this black box!

