Rethinking the ESP Game

November 25th, 2008

Thanks to Amir Michail for a tweet that alerted me to a technical report by Ingmar Weber, Stephen Robertson, and Milan Vojnovic on “Rethinking the ESP Game”.

The ESP Game is a human-based computation game designed by Luis Von Ahn to tag images. Part of his motivation for developing the game was that image tagging was too hard for machines. Hence, he decided to make it fun for humans to volunteer their own labor to the cause.

But it turns out that, once humans have supplied some initial labels to the game, a machine can take over. Here is the abstract for the technical report:

The ESP Game was designed to harvest human intelligence to assign labels to images – a task which is still difficult for even the most advanced systems in image processing. However, the ESP Game as it is currently implemented encourages players to assign “obvious” labels, which are most likely to lead to an agreement with the partner. But these labels can often be deduced from the labels already present using an appropriate language model and such labels therefore add only little information to the system.

We present a language model which, given enough instances of labeled images as training data, can assign probabilities to the next label to be added. This model is then used in a program, which plays the ESP game without looking at the image. Even without any understanding of the actual image, the program manages to agree with the randomly assigned human partner on a label for 69% of all images, and for 81% of images which have at least one “off-limits” term assigned to them. We then show how, given any generative probabilistic model, the scoring system for the ESP game can be redesigned to encourage users to add less predictable labels, thereby leading to a collection of informative, high entropy tag sets. Finally, we discuss a number of other possible redesign options to improve the quality of the collected labels.
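To make the idea in the abstract concrete, here is a minimal sketch of how a program might guess labels without seeing the image, and how a score could reward less predictable labels. This is not the model from the report: it assumes a simple smoothed co-occurrence count as the "language model" and uses the information content of a label (-log2 p) as the score. The function names, the smoothing parameter `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(tag_sets):
    """Count label frequencies and pairwise co-occurrences across labeled images."""
    label_counts = Counter()
    pair_counts = defaultdict(Counter)
    for tags in tag_sets:
        unique = set(tags)
        label_counts.update(unique)
        for a in unique:
            for b in unique:
                if a != b:
                    pair_counts[a][b] += 1
    return label_counts, pair_counts


def next_label_probs(off_limits, label_counts, pair_counts, alpha=1.0):
    """Estimate P(next label | off-limits labels) with add-alpha smoothing.

    Assumption: a candidate's weight is its co-occurrence count with the
    off-limits labels; with no off-limits labels, overall frequency is used.
    """
    scores = {}
    for candidate in label_counts:
        if candidate in off_limits:
            continue  # off-limits labels cannot be played again
        if off_limits:
            weight = sum(pair_counts[t][candidate] for t in off_limits)
        else:
            weight = label_counts[candidate]
        scores[candidate] = weight + alpha
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def surprise_score(label, probs):
    """Hypothetical scoring rule: the label's information content in bits."""
    return -math.log2(probs[label])


# Toy training data: each inner list is the tag set of one image.
tag_sets = [
    ["sky", "cloud", "blue"],
    ["sky", "cloud", "sun"],
    ["sky", "bird"],
    ["car", "road"],
]
label_counts, pair_counts = train(tag_sets)
probs = next_label_probs({"sky"}, label_counts, pair_counts)

# The "blind" player guesses the most probable label given the off-limits terms.
best_guess = max(probs, key=probs.get)
print(best_guess)
# A predictable label earns fewer points than a surprising one.
print(surprise_score("cloud", probs), surprise_score("road", probs))
```

With "sky" off-limits, this toy player guesses "cloud" because it co-occurs with "sky" most often, and a scoring rule of this kind would pay less for "cloud" than for an unexpected label such as "road".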

Daniel Lemire commented recently that spammers may help make AI a reality. It’s interesting to see how yesterday’s Turing Test has become today’s CAPTCHA, and the competition between humans and machines is making both smarter.

  • 1 Jason Adams // Nov 25, 2008 at 10:15 am

    Encouraging diversity is the reason for the taboo words, of course, but adding a scoring model that reinforces it is a good idea. One of the goals of human computation was to create an effective arms race (before, the spammers were just invaders occupying our territory), and it’s good to see that that is extending beyond the CAPTCHA.

    I was musing yesterday that someday (soon?) we’ll have Google Implanted Search (beta) cybernetic implants in our brain. Your comment that human computation is making both humans and machines smarter puzzled me at first. On the one hand, perhaps you are referring to researchers getting smarter by finding new ways of improving AI thanks to human data. Or are you saying that the output of machines will lead to better information retrieval (or just software in general) that in turn will lead to increased efficiency in information acquisition for humans (or both)? I think in the second case, along with our Google implants, we may all one day be Renaissance men and women.

  • 2 Daniel Tunkelang // Nov 25, 2008 at 11:31 am

    It’s hard to resist a snarky comment on the safety of Silicon Valley implants. But what I meant by my comment is that the adversarial nature of the attention economy is, as Daniel Lemire pointed out, leading to real advancement in artificial intelligence. We are learning more–not just retrieving information more efficiently or effectively–and we are creating increasingly intelligent machines. I never thought I’d say this, but thanks be to spam!

