Recommending Diversity

Another nice post from Daniel Lemire today, this time about a paper by Mi Zhang and Neil Hurley on “Avoiding monotony: improving the diversity of recommendation lists” (ACM Digital Library subscription required to see full text).

Here’s an abstract of the abstract:

Noting that the retrieval of a set of items matching a user query is a common problem across many applications of information retrieval, we model the competing goals of maximizing the diversity of the retrieved list while maintaining adequate similarity to the user query as a binary optimization problem.

It’s nice to see a similarity vs. diversity trade-off for recommendations analogous to the precision vs. recall trade-off for typical information retrieval evaluation.
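One common way to operationalize this kind of trade-off is greedy Maximal Marginal Relevance re-ranking: each pick balances relevance to the query against distance from the items already chosen. The sketch below is illustrative, not the paper’s binary-optimization formulation, and the scoring functions are stand-ins:

```python
def mmr_rerank(candidates, relevance, distance, k, lam=0.7):
    """Greedy Maximal Marginal Relevance re-ranking (illustrative sketch).

    relevance: dict mapping item -> relevance score to the query
    distance:  function(a, b) -> dissimilarity between two items
    lam:       trade-off weight (1.0 = pure relevance, 0.0 = pure diversity)
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            # Diversity term: distance to the closest item already chosen.
            div = min((distance(item, s) for s in selected), default=0.0)
            return lam * relevance[item] + (1 - lam) * div
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam = 1.0` this degenerates to ranking by relevance alone; lowering it lets a less relevant but more distinct item displace a near-duplicate.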

Our experience at Endeca is certainly that most of the approaches out there underemphasize diversity, which not only leads to the “monotony” problem but also breaks down when the query does not unambiguously express the user’s intent. Since our approach emphasizes interaction, we leverage the diversity of the options we present to maximize the opportunity for users to make progress in satisfying their information needs.

I would like to second Daniel Lemire’s suggestion to perform user studies to investigate the optimal balance between diversity and accuracy. They’d make for great papers. Just remember to send him (and me!) copies!

By Daniel Tunkelang

High-Class Consultant.

10 replies on “Recommending Diversity”

One nice thing about recall/precision, is that to compute them, you just need to know whether document x is relevant or not to query y. Measuring diversity requires that you also know the “distance” between document x1 and x2. It is not hard, but it makes up an extra layer of math.
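The commenter’s point — that diversity needs a pairwise distance, not just per-document relevance judgments — can be made concrete with the simplest such measure, average pairwise distance over the list (illustrative, not the paper’s measure):

```python
from itertools import combinations

def intra_list_diversity(items, distance):
    """Average pairwise distance over a result list.

    Precision/recall need only a relevance judgment per document;
    this additionally needs a distance(x1, x2) between documents,
    which is the "extra layer of math" the comment refers to.
    """
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0  # a list of 0 or 1 items has no pairs to compare
    return sum(distance(a, b) for a, b in pairs) / len(pairs)
```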


True. But the nice thing about diversity is that it’s an unsupervised measure. I’m a big fan of such unsupervised set measures, like the query-less version of query clarity that we use at Endeca. There really should be unsupervised analogs of the precision vs. recall trade-off that can be used in the absence of relevance assessments.


Diversity in results seems surprisingly underrepresented in both academia and industry. I saw a couple of posters at SIGIR07 on it, but they were largely calculating the difference between two pages, not reporting what made each unique! It seems all the more important in faceted search, for example, where every result is often equally related to the selections made by the user. There was a group that presented at SearchSolutions2008 that weights every attribute given to an object in faceted search. I’m not sure how they can put a weighting on things like the manufacturer, etc.


Indeed, it’s hard to do research on a problem that doesn’t have accepted measures for evaluation.

As for putting a weight on each attribute, I suppose you can turn anything into a vector space (e.g., by treating each manufacturer as a binary value). But distance measures can do funny things in high-dimensional spaces.
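Treating a categorical attribute like manufacturer as binary dimensions is just one-hot encoding; the manufacturer names below are made up for illustration:

```python
def one_hot(value, vocabulary):
    """Encode a categorical attribute as a binary vector over a vocabulary."""
    return [1.0 if v == value else 0.0 for v in vocabulary]

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

manufacturers = ["Sony", "Canon", "Nikon"]  # hypothetical vocabulary
# Any two distinct one-hot values sit at exactly the same distance
# (sqrt(2)), no matter which pair you pick -- one small example of
# how distances behave oddly once every attribute value becomes
# its own dimension.
```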


I’ve talked with insiders at Google, and they claim that they strive to automatically generate diversity in their rankings. I believe what I’ve heard, but Google is still such a black box that I have no idea how to evaluate or understand exactly how much diversity is being created.


And how does the story continue?

I was very surprised to see a light break over the face of my young relevance assessor:

“That is exactly the way I wanted it! Do you think that this diversity will have to have a great deal of click-throughs?”


“Because where I live everything must be relevant…”

“There will surely be enough click-throughs from it,” I said. “It is a very large diversity that I have given you.”

He bent his head over the SERP.

“Not so large that–Look! It has gone to sleep…”

Non? 😉


Busy letting the boa constrictors of navigational, known-item search swallow the elephants of relevance? 😉


Comments are closed.