As Nick Belkin pointed out in his recent ECIR 2008 keynote, a grand challenge for the IR community is to figure out how to bring the user into the evaluation process. A key aspect of this challenge is rethinking system evaluation in terms of sessions rather than queries.
Some recent work in the IR community is very encouraging:
– Work by Ryen White and colleagues at Microsoft Research that mines session data to guide users to popular web destinations. Their paper was awarded Best Paper at SIGIR 2007.
– Work by Nick Craswell and Martin Szummer (also at Microsoft Research, and also presented at SIGIR 2007) that performs random walks on the click graph, using click data as evidence to improve relevance ranking for web image search.
– Work by Kalervo Järvelin (at the University of Tampere in Finland) and colleagues on evaluating multiple-query IR sessions with discounted cumulated gain, which was awarded Best Paper at ECIR 2008.
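To make the session-level idea concrete, here is a minimal sketch of a session-based discounted cumulated gain. It assumes graded relevance gains for each ranked result, a rank discount of 1 + log_b(rank), and a further discount of 1 + log_bq(q) applied to the q-th query in the session. The function names and the default bases (b=2, bq=4) are illustrative assumptions, not the paper's reference implementation.

```python
import math


def dcg(gains, b=2.0):
    """Discounted cumulated gain of one ranked list.

    gains: graded relevance values in rank order (rank 1 first).
    The result at rank j is divided by 1 + log_b(j), so rank 1 is undiscounted.
    """
    return sum(g / (1.0 + math.log(j, b)) for j, g in enumerate(gains, start=1))


def session_dcg(session, b=2.0, bq=4.0):
    """Session-level score: per-query DCG, discounted by query position.

    session: list of gain lists, one per query, in the order the user issued them.
    The q-th query is divided by 1 + log_bq(q), so later reformulations count less.
    """
    return sum(
        dcg(gains, b) / (1.0 + math.log(q, bq))
        for q, gains in enumerate(session, start=1)
    )


# Hypothetical two-query session: a weak first query, then a better reformulation.
session = [[0, 1, 0], [3, 0, 2]]
print(session_dcg(session))
```

Under these assumptions, the same result list contributes less the later it appears in a session, which is one way to reward systems that get the user to good results in fewer reformulations.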
This recent work, and the prominence it has received in the IR community, is refreshing, especially in light of the relative lack of academic work on interactive IR and the demise of the short-lived TREC interactive track. These are first steps, but hopefully IR researchers and practitioners will pick up on them.
4 replies on “Multiple-Query Sessions”
A few years ago, James Allan and I somehow convinced a good-sized group of students to browse through a proxy, mark search session boundaries, and annotate relevant documents. We ran a bunch of session-based experiments to test user modeling and personalization with mixed results. We submitted a write-up to a couple of conferences but turned up goose eggs. Paging through the document with a tad more experienced eyes, I can see why. The somewhat painful write-up can be found here. For a more developed approach to this problem (with better results), I suggest Xuehua Shen’s work on adaptive IR.
Fernando, that’s an interesting study, but I can see why you found it painful. Personally, I’m most interested in the process of query reformulation. It seems that an effective system should support this process, at least in the fulfillment of a single information need across multiple queries. I’m a fan of guided reformulation (I do work for Endeca after all!), but even a purely ranked retrieval approach would support reformulation if it was predictable. As for the UCAIR work at UIUC, I tried playing with it last year, but didn’t see a perceptible change in experience. Have you tried it out yourself?
I’ll have to admit to basing my opinion only on Xuehua’s published/presented results. But if you didn’t see any perceptible effect on your experience, I’d give it a few months over a large user base.
[…] am sure that much of this work has already been done (as Daniel Tunkelang points out), but it would be useful, I think, to bring it together in a coherent way to inform […]