The other day, I was talking with Paul Thompson about the challenges of evaluating interactive information retrieval (IIR) systems, and he mentioned a paper that came up in discussion at the SIGIR 2009 workshop on Understanding the User: “A probability ranking principle for interactive information retrieval” by Norbert Fuhr–an update to the decades-old probability ranking principle (PRP) for information retrieval.
I was embarrassed to admit that I’d never seen or even heard of this paper, despite Nick Belkin citing it in the ECIR 2008 keynote that inspired my first blog post! In my defense, the citation is a single sentence that offers more of a tease than an explanation of the paper’s thesis.
Have no fear; I’ll offer more than a sentence here! I’ll summarize the paper and then offer my personal reaction. Let’s start with a few lines from the abstract:
In this paper, a new theoretical framework for interactive retrieval is proposed: The basic idea is that during IIR, a user moves between situations. In each situation, the system presents to the user a list of choices, about which s/he has to decide, and the first positive decision moves the user to a new situation. Each choice is associated with a number of cost and probability parameters. Based on these parameters, an optimum ordering of the choices can the derived – the PRP for IIR.
The first two sections of the paper provide an introduction and motivation. I’ll skip these, other than excerpt the following:
Interactive retrieval consists of user actions of various types, and scanning through document lists for identifying the relevant entries is not the most crucial activity in interactive retrieval [Turpin & Hersh 01]. In contrast, other activities (like e.g. query reformulation) seem to be more ‘expensive’ from the user’s point of view.
It’s in the third section that Fuhr starts outlining his approach. He established three requirements that, in his view, a PRP for IIR must satisfy:
- Consider the complete interaction process.
- Allow for different costs and benefits of different activities.
- Allow for changes of the information need.
He then makes four simplifying assumptions:
- Focus on the functional level of interaction (i.e., ignore design, visualization).
- Decisions are the major interaction activity.
- Users evaluate choices in linear order.
- Only positive, correct decisions are of benefit for a user.
With these assumptions in hand, Fuhr establishes an interaction model comprised of a sequence of “situations”:
a situation reflects the system state of the interactive search a user is performing…a situation consists of a list of choices the user has to evaluate…the first positive decision by the user will move him to another situation (depending on the choice he selected positively)…assume that there is always a last choice that will move him to another situation…the user’s information need does not change during a situation, knowledge is added only when switching to another situation due to a positive decision
In the fourth section, Fuhr describes a cost model for IIR. The key concept is that each choice incurs an effort, yields an expected benefit, and incurs an additional cost if the user has to backtrack. His overall framework is generic, but he offers a concrete “illustrating example” in which:
- queries represent conjunctions of query terms
- choices represent narrowing query refinements that add individual terms to the current query
- the probability of the user choosing a given query term is proportional to that term’s frequency in the corpus
- the benefit of an accepted choice is the log of factor by which it reduces the result set
Given such a cost model, the goal of an IIR system is now to maximize the user’s overall expected benefit, which in turn requires making tradeoffs at each choice. Fuhr offers an example to illustrate these tradeoffs:
As a simple example, assume that the system proposes some terms for query expansion. As one possibility, only the terms themselves are listed. Alternatively, for each term, the system could show a few example term occurrences in their context, thus giving the user some information about the usage of the term. The user effort per choice is lower in the first case, but the decisions will also be more error-prone.
The fifth section explains optimum ranking of choices according to the PRP. It has the most math, and I’ll refer the interested reader to the paper. The section that intrigued me the most was the sixth, entitled “Towards application”, in which Fuhr considers the various components of his generic framework, and speculates how existing and future research may supply the concrete details needed to instantiate it. Finally, Fuhr wraps up by citing related work and offering a conclusion/ outlook.
That’s the summary–perhaps it will inspire some of you to read the whole paper. Personally, I’m intrigued by Fuhr’s model. I suspect I was trying to back into a similar approach in my post on “Ranked Set Retrieval” a few months ago. But, even if I’d try to formalize my proposal, I don’t believe I would have ever arrived at a framework as general as Fuhr’s.
That said, Fuhr’s article is like a fancy dinner that leaves me hungry at the end. It’s a solid paper: general framework, suggestive examples, and good writing generally. Nonetheless, it doesn’t give me what I really wanted: to feel closer to either a cost-effective evaluation methodology for IIR–or a tool that can be embedded into an IIR system to improve user experience. I know it’s a theory paper, and I should judge it on its own terms, but I was hoping to see more immediate applicability.
Given that Fuhr has published a fair amount of applied research, I’m hopeful about his future work in this area. I do appreciate the formal framework he has proposed, and I’d like to think that people are already looking at way to build on that framework. Perhaps I’ll be one of them.