SIGIR 2010: Day 2 Technical Sessions

On the second day of the SIGIR 2010 conference, I started shuttling between sessions to attend particular talks.

In the morning session, I attended three talks. The first, “Geometric Representations for Multiple Documents” by Jangwon Seo and Bruce Croft, looked at the problem of representing combinations of documents in a query model. It provided both theoretical and experimental evidence that geometric means work better than arithmetic means for representing such combinations. The second, “Using Statistical Decision Theory and Relevance Models for Query-Performance Prediction” by Anna Shtok, Oren Kurland, and David Carmel, showed the efficacy of a utility estimation framework comprising relevance models, measures like query clarity to estimate the representativeness of relevance models, and similarity measures to estimate the similarity or correlation between two ranked lists. The authors demonstrated significant improvements from the framework over simply using the representativeness measures for performance prediction. The third paper, “Evaluating Verbose Query Processing Techniques” by Samuel Huston and Bruce Croft, showed that removing “stop structures”, a generalization of stop words, could significantly improve performance on long queries. Interestingly, the authors evaluated their approach on the “black box” commercial search engines Yahoo and Bing without knowledge of their retrieval models.
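To make the geometric-versus-arithmetic distinction from the first talk a bit more concrete, here is a minimal sketch of my own, not the authors' formulation. It assumes each document is a unigram term distribution stored as a dict; the toy documents, the function names, and the `eps` smoothing constant are all made up for illustration. The point it shows is that a term appearing in only one document survives an arithmetic mean but is nearly wiped out by a geometric mean.

```python
import math

def arithmetic_mean(models):
    """Average term probabilities across documents (missing terms count as 0)."""
    vocab = set().union(*models)
    n = len(models)
    return {t: sum(m.get(t, 0.0) for m in models) / n for t in vocab}

def geometric_mean(models, eps=1e-6):
    """Geometric mean of term probabilities, renormalized to sum to 1.

    eps is a hypothetical smoothing constant so that log(0) never occurs.
    """
    vocab = set().union(*models)
    n = len(models)
    raw = {
        t: math.exp(sum(math.log(m.get(t, 0.0) + eps) for m in models) / n)
        for t in vocab
    }
    total = sum(raw.values())
    return {t: v / total for t, v in raw.items()}

# Toy term distributions; "banquet" occurs in only one document.
doc_a = {"retrieval": 0.5, "query": 0.4, "banquet": 0.1}
doc_b = {"retrieval": 0.5, "query": 0.5}

am = arithmetic_mean([doc_a, doc_b])
gm = geometric_mean([doc_a, doc_b])

for term in sorted(am):
    print(f"{term:10s} arithmetic={am[term]:.4f} geometric={gm[term]:.4f}")
```

In this toy setup the geometric mean behaves like a soft AND over the documents, while the arithmetic mean behaves like a soft OR. Whether that matches the intuition in the paper is my reading, so treat it as an illustration only.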

In the session after lunch, I mostly attended talks from the session on user feedback and user models. The first, “Incorporating Post-Click Behaviors Into a Click Model” by Feimin Zhong, Dong Wang, Gang Wang, Weizhu Chen, Yuchen Zhang, Zheng Chen, and Haixun Wang, proposed and experimentally validated a click model to infer document relevance from post-click behavior like dwell time that can be derived from logs. The second, “Interactive Retrieval Based on Faceted Feedback” by Lanbo Zhang and Yi Zhang, described an approach using facet values for relevance and pseudo-relevance feedback. It’s interesting work, but I think the authors should look at the work my colleagues and I presented at HCIR 2008 on distinguishing whether facet values are useful for summarization or for refinement. The third, “Understanding Web Browsing Behaviors through Weibull Analysis of Dwell Time” by Chao Liu, Ryen White, and Susan Dumais, offered an elegant model of dwell time and used it to predict dwell time distribution from page-level features. Finally, I attended one talk from the session on retrieval models and ranking: “Finding Support Sentences for Entities” by Roi Blanco and Hugo Zaragoza. They presented a novel approach that generalizes snippets to interfaces that offer named entities (e.g., people) as supplements to the search results. I am excited to see research that could make richer interfaces more explainable to users.
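For readers unfamiliar with the Weibull distribution mentioned in the dwell-time talk, here is a small sketch of its standard survival and hazard functions. This is textbook math rather than anything taken from the paper, and the shape and scale values are hypothetical numbers I picked for illustration. In dwell-time terms, the survival function is the probability that a visitor is still on the page after t seconds, and the hazard is the instantaneous rate of leaving at time t.

```python
import math

def weibull_survival(t, shape, scale):
    """P(dwell time > t) under a Weibull(shape, scale) model."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """Instantaneous rate of leaving the page at time t (for t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# Hypothetical parameters: shape < 1 means the hazard falls over time,
# i.e. the longer a visitor has stayed, the less likely they are to leave soon.
shape, scale = 0.7, 30.0

for t in (5, 30, 120):
    s = weibull_survival(t, shape, scale)
    h = weibull_hazard(t, shape, scale)
    print(f"t={t:4d}s  still on page={s:.3f}  leave rate={h:.4f}/s")
```

With a shape of exactly 1 the hazard is constant and the model reduces to an exponential distribution, so the shape parameter is what lets the model say whether abandonment is front-loaded or back-loaded.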

I spent the last session of the day listening to a couple of talks about users and interactive IR. The first was “Studying Trailfinding Algorithms for Enhanced Web Search” by Adish Singla, Ryen White, and Jeff Huang. This work extends previous work that Ryen and colleagues have done on search trails and presents results for various trailfinding algorithms that outperform the trails users follow on their own. The second, “Context-Aware Ranking in Web Search” by Biao Xiang, Daxin Jiang, Jian Pei, Xiaohui Sun, Enhong Chen, and Hang Li, classifies requerying behavior as reformulation, specialization, generalization, or general association, and demonstrates that knowing or inferring which of these the user is doing significantly improves the ranking of the second query’s results.

The day wrapped up with a luxurious banquet at the Hotel Intercontinental, near the Nations Plaza. After sweating through conference sessions without air conditioning, it was a welcome surprise to enjoy great food in such an elegant setting.