I’ve always felt that parallel conference sessions are designed to optimize for anticipated regret, and SIGIR 2010 is no exception. I decided that I’d try to attend whole sessions rather than shuttle between them. I started by attending the descriptively titled “Applications I” session.
Jinyoung Kim of UMass presented joint work with Bruce Croft on “Ranking using Multiple Document Types in Desktop Search”, in which they showed that type prediction can significantly improve known-item search performance in simulated desktop settings. I like the approach and result, but I’d be very interested to see how well it would apply to more recall-oriented tasks.
Then came work by Googlers Enrique Alfonseca, Marius Pasca, and Enrique Robledo-Arnuncio on “Acquisition of Instance Attributes via Labeled and Related Instances”. Their method overcomes the data sparseness of open-domain attribute extraction by computing relatedness among instances and injecting that relatedness data into the instance-attribute graph, so that attributes can be propagated to instances for which little direct evidence exists. This is a nice enhancement to earlier work by Pasca and others on obtaining these instance-attribute graphs.
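To make the propagation idea concrete, here is a toy sketch of attributes flowing from a well-attested instance to a related, sparsely-attested one. The instances, attributes, relatedness weights, and the max-based combination rule are all my own illustrative assumptions, not details from the paper.

```python
# Toy sketch of attribute propagation over an instance-relatedness graph.
# All data and the combination rule are illustrative assumptions.

# Directly extracted instance -> attribute scores (sparse: some instances
# have no attributes of their own).
attributes = {
    "paris": {"population": 0.9, "mayor": 0.8},
    "lyon": {},  # nothing extracted directly
}

# Instance relatedness, e.g. derived from shared class labels.
related = {
    "lyon": [("paris", 0.7)],
}

def propagate(instance):
    """Combine an instance's directly extracted attributes with
    attributes inherited from related instances, discounted by
    the relatedness weight."""
    scores = dict(attributes.get(instance, {}))
    for neighbor, weight in related.get(instance, []):
        for attr, score in attributes.get(neighbor, {}).items():
            scores[attr] = max(scores.get(attr, 0.0), weight * score)
    return scores
```

Here `propagate("lyon")` inherits discounted `population` and `mayor` scores from `paris`, which is the sparseness fix in miniature.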
The session ended with an intriguing paper on “Relevance and Ranking in Online Dating Systems” by Yahoo researchers Fernando Diaz, Donald Metzler, and Sihem Amer-Yahia that formulated a two-way relevance model for matchmaking systems but unfortunately found that it did no better than query-independent ranking in the context of a production personals system. I would be very interested to see how the model applied to other matchmaking scenarios, such as matching job seekers to employers.
The next session started with a paper on “Social Media Recommendation Based on People and Tags” by IBM researchers Ido Guy, Naama Zwerdling, Inbal Ronen, David Carmel, and Erel Uziel. They analyzed item recommendation in an enterprise setting and found that a hybrid approach, combining algorithmic tag-based recommendations with people-based recommendations, delivers interesting recommendations better than either approach alone. I’m curious how well these results generalize outside of enterprise settings, or even how well they hold up across the large variation among enterprises.
Then came work by Nikolaos Nanas, Manolis Vavalis, and Anne De Roeck on “A Network-Based Model for High-Dimensional Information Filtering”. The authors propose to overcome the “curse of dimensionality” of vector space representations of profiles by instead modeling keyword dependencies in a directed graph and applying a non-iterative activation model to it. The presentation was excellent, but I’m not entirely convinced by the baseline they used for their comparisons.
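For intuition about what a single-pass (non-iterative) activation spread over a keyword graph might look like, here is a loose sketch. The graph, the weights, and the scoring are invented for illustration and are not the authors’ actual model.

```python
# Illustrative single-pass activation spread over a directed keyword graph.
# Graph structure, weights, and scoring are assumptions, not the paper's model.

# Directed keyword-dependency graph: term -> [(dependent term, weight)].
graph = {
    "information": [("retrieval", 0.8), ("filtering", 0.6)],
    "retrieval": [("ranking", 0.5)],
}

def activate(document_terms):
    """Score a document against a profile graph: activate the nodes
    that appear in the document, spread activation one step along
    outgoing edges, and sum the resulting energy."""
    energy = {term: 1.0 for term in document_terms}
    spread = dict(energy)
    for term in document_terms:
        for neighbor, weight in graph.get(term, []):
            spread[neighbor] = spread.get(neighbor, 0.0) + weight
    return sum(spread.values())
```

The appeal of a one-step spread like this is that it captures keyword dependencies without the iteration-to-convergence cost that high-dimensional vector-space profiles can incur.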
After that was a paper by Neal Lathia, Stephen Hailes, Licia Capra, and Xavier Amatriain on “Temporal Diversity in Recommender Systems”. They focused on the problem that users get bored and frustrated by recommender systems that keep recommending the same items over time. They provided evidence that users prefer temporal diversity in recommendations and suggested some methods to promote it. I like the research, but I still think that recommendation engines cry out for transparency, and that transparency can also help address the diversity problem: e.g., pick a random movie the user watched and propose recommendations explicitly based on that movie.
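The transparency idea above could be as simple as the following toy sketch, where the viewing history and the similarity table are entirely made up for illustration:

```python
import random

# Toy sketch of transparent, anchor-based recommendation: pick a random
# item from the user's history and recommend explicitly because of it.
# All titles and the similarity table below are invented examples.

watched = ["Alien", "Blade Runner", "Amelie"]

similar_items = {
    "Alien": ["Aliens", "The Thing"],
    "Blade Runner": ["Minority Report", "Gattaca"],
    "Amelie": ["Delicatessen", "A Very Long Engagement"],
}

def transparent_recommendations(history, rng=random):
    # A random anchor varies the recommendations over time (temporal
    # diversity) while the phrasing explains why they were chosen.
    anchor = rng.choice(history)
    recs = similar_items.get(anchor, [])
    return f"Because you watched {anchor}, you might like: {', '.join(recs)}"
```

Because the anchor changes from visit to visit, the user sees diverse recommendations and an explanation for each batch at the same time.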
Unfortunately I missed the last paper of the session, in which Noriaki Kawamae talked about “Serendipitous Recommendations via Innovators”.
Reminder: also check out the tweet stream with the hashtag #sigir2010.