Thanks to Jon Elsas for calling my attention to a great post at Datawocky today on how Google measures search quality, written by Anand Rajaraman based on his conversation with Google Director of Research Peter Norvig.
The executive summary: rather than relying on click-through data to judge quality, Google employs armies of raters who manually rate the search results that different ranking algorithms return for randomly selected queries. These manual ratings drive the evaluation and evolution of Google’s ranking algorithms.
I’m intrigued that Google seems to wholeheartedly embrace the Cranfield paradigm. Of course, they don’t publicize their evaluation measures, so perhaps they’re optimizing something more interesting than mean average precision.
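For readers who haven't run into mean average precision, here is a minimal Python sketch of how it could be computed from binary relevance judgments. To be clear, this is an illustration of the textbook measure, not a description of what Google does: the function names, the two hypothetical rankers, and the ratings are all made up, and the sketch assumes every relevant document shows up somewhere in the rated list unless `num_relevant` is supplied.

```python
def average_precision(judgments, num_relevant=None):
    """Average precision for one query.

    judgments: 0/1 relevance ratings in rank order (1 = rater judged relevant).
    num_relevant: total relevant documents for the query; if omitted, we
    assume all of them appear in `judgments` (an assumption of this sketch).
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(judgments, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    denom = num_relevant if num_relevant is not None else hits
    return precision_sum / denom if denom else 0.0


def mean_average_precision(per_query_judgments):
    """Mean of the per-query average precision values."""
    if not per_query_judgments:
        return 0.0
    total = sum(average_precision(j) for j in per_query_judgments)
    return total / len(per_query_judgments)


# Made-up ratings for two hypothetical ranking algorithms on the same queries.
ratings_a = {"query 1": [1, 0, 1, 0], "query 2": [0, 1, 0, 0]}
ratings_b = {"query 1": [1, 1, 0, 0], "query 2": [1, 0, 0, 0]}

map_a = mean_average_precision(list(ratings_a.values()))
map_b = mean_average_precision(list(ratings_b.values()))
print(f"ranker A: {map_a:.3f}, ranker B: {map_b:.3f}")
```

In this toy data, ranker B puts the relevant results higher for both queries, so it scores the higher MAP. A measure built on graded rather than binary ratings would need a different formula.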
More questions for Amit. 🙂