The Noisy Channel


SIGIR 2010: Day 1 Technical Sessions

July 21st, 2010 · 3 Comments · General

I’ve always felt that parallel conference sessions are designed to optimize for anticipated regret, and SIGIR 2010 is no exception. I decided that I’d try to attend whole sessions rather than shuttle between them. I started by attending the descriptively titled “Applications I” session.

Jinyoung Kim of UMass presented joint work with Bruce Croft on “Ranking using Multiple Document Types in Desktop Search” in which they showed that type prediction can significantly improve known-item search performance in simulated desktop settings. I like the approach and result, but I’d be very interested to see how well it applied to more recall-oriented tasks.

Then came work by Googlers Enrique Alfonseca, Marius Pasca, and Enrique Robledo-Arnuncio on “Acquisition of Instance Attributes via Labeled and Related Instances” that overcomes the data sparseness of open-domain attribute extraction by computing relationships among instances and injecting this relatedness data into the instance-attribute graph so that attributes can be propagated to more instances. This is a nice enhancement to earlier work by Pasca and others on obtaining these instance-attribute graphs.

The session ended with an intriguing paper on “Relevance and Ranking in Online Dating Systems” by Yahoo researchers Fernando Diaz, Donald Metzler, and Sihem Amer-Yahia that formulated a two-way relevance model for matchmaking systems but unfortunately found that it did no better than query-independent ranking in the context of a production personals system. I would be very interested to see how the model applied to other matchmaking scenarios, such as matching job seekers to employers.
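
To make the "two-way" idea concrete, here is a minimal sketch of my own, not the model from the paper. It assumes each person has a `profile` and a set of stated `wants` (both hypothetical field names) and that a match only counts if it works in both directions, so I combine the two one-way scores with a geometric mean.

```python
from math import sqrt


def one_way_match(wants: dict, profile: dict) -> float:
    """Fraction of one person's stated wants that the other profile satisfies."""
    if not wants:
        return 1.0  # no stated preferences: anyone is acceptable
    hits = sum(1 for key, value in wants.items() if profile.get(key) == value)
    return hits / len(wants)


def two_way_score(seeker: dict, candidate: dict) -> float:
    """Geometric mean of both directions, so a zero on either side vetoes the match."""
    forward = one_way_match(seeker["wants"], candidate["profile"])
    backward = one_way_match(candidate["wants"], seeker["profile"])
    return sqrt(forward * backward)


# Illustrative data only (made up for this sketch).
seeker = {
    "profile": {"city": "Geneva", "smoker": False},
    "wants": {"city": "Geneva"},
}
candidate = {
    "profile": {"city": "Geneva", "smoker": True},
    "wants": {"city": "Geneva", "smoker": True},
}

score = two_way_score(seeker, candidate)
```

The same shape would carry over to job seekers and employers: swap the attributes and keep the requirement that both sides be satisfied.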

After a wonderful lunch hosted by Morgan & Claypool for authors, I attended a session on Filtering and Recommendation.

It started with a paper on “Social Media Recommendation Based on People and Tags” by IBM researchers Ido Guy, Naama Zwerdling, Inbal Ronen, David Carmel, and Erel Uziel. They analyzed item recommendation in an enterprise setting and found that a hybrid approach combining algorithmic tag-based recommendations with people-based recommendations delivers more interesting recommendations than either approach alone. I’m curious how well these results generalize outside of enterprise settings, or even how well they hold across the large variation among enterprises.
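
For readers wondering what a hybrid of this kind might look like, here is a rough sketch. It is not the combination the IBM authors used; I am simply assuming each recommender emits per-item scores in [0, 1] and blending them with a hypothetical `tag_weight` parameter.

```python
def hybrid_scores(tag_scores: dict, people_scores: dict, tag_weight: float = 0.5) -> dict:
    """Weighted sum of two score maps; an item missing from one map scores 0 there."""
    items = set(tag_scores) | set(people_scores)
    return {
        item: tag_weight * tag_scores.get(item, 0.0)
        + (1.0 - tag_weight) * people_scores.get(item, 0.0)
        for item in items
    }


def top_k(scores: dict, k: int) -> list:
    """Items ordered by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Illustrative scores only (made up for this sketch).
tag_scores = {"wiki-page": 0.9, "blog-post": 0.2}
people_scores = {"blog-post": 0.8, "bookmark": 0.6}

ranking = top_k(hybrid_scores(tag_scores, people_scores), k=2)
```

An item that both recommenders like moderately can outrank one that a single recommender likes a lot, which is one plausible reason a blend could beat either source alone.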

Then came work by Nikolaos Nanas, Manolis Vavalis, and Anne De Roeck on “A Network-Based Model for High-Dimensional Information Filtering”. The authors propose to overcome the “curse of dimensionality” of vector space representations of profiles by instead modeling keyword dependencies in a directed graph and applying a non-iterative activation model to it. The presentation was excellent, but I’m not entirely convinced by the baseline they used for their comparisons.

After that was a paper by Neal Lathia, Stephen Hailes, Licia Capra, and Xavier Amatriain on “Temporal Diversity in Recommender Systems”. They focused on the problem that users get bored and frustrated by recommender systems that keep recommending the same items over time. They provided evidence that users prefer temporal diversity of recommendations and suggested some methods to promote it. I like the research, but I still think that recommendation engines cry out for transparency, and that transparency can also help address the diversity problem: for example, pick a random movie the user watched and propose recommendations explicitly based on that movie.
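
Here is a small sketch of that last suggestion, under assumptions of my own: movies are described by hand-assigned tag sets, similarity is plain Jaccard overlap, and the function and variable names are hypothetical. The returned seed title is what the interface would show as the explanation ("because you watched ...").

```python
import random


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def explain_recommendations(watched: set, catalog: dict, k: int = 3, rng=None):
    """Pick one watched title at random and return it with its k nearest unwatched titles."""
    rng = rng or random.Random()
    seed_title = rng.choice(sorted(watched))  # sorted so a seeded rng is reproducible
    candidates = [title for title in catalog if title not in watched]
    ranked = sorted(
        candidates,
        key=lambda title: (-jaccard(catalog[seed_title], catalog[title]), title),
    )
    return seed_title, ranked[:k]


# Illustrative catalog only; the tags are made up for this sketch.
CATALOG = {
    "Lilo & Stitch": {"animation", "family", "aliens"},
    "Toy Story": {"animation", "family"},
    "Alien": {"aliens", "horror"},
    "Heat": {"crime"},
}

seed_title, recommendations = explain_recommendations({"Lilo & Stitch"}, CATALOG, k=2)
```

Because the seed changes from visit to visit once the user has watched more than one movie, the list changes too, and the user can see why.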

Unfortunately I missed the last paper of the session, in which Noriaki Kawamae talked about “Serendipitous Recommendations via Innovators”.

Reminder: also check out the tweet stream with hash tag #sigir2010.

3 responses so far ↓

  • 1 Dinesh Vadhia // Jul 22, 2010 at 5:10 am

    @ daniel, re: Recommendation engines cry out for transparency

    Re-read the original post on this and still not sure that I understand where you’re coming from. For example, if the last movie I watched (with the kids!) was “lilo and stitch” and we enter it into a movie recommendation engine I’d expect to find similar movies to that one.

    Are you saying that this is not what today’s recommendation engines do?

  • 2 Andy // Jul 22, 2010 at 9:58 am

    How did the IBM researchers combine and rank the results from the two recommender models (tag-based and people-based)?

  • 3 Daniel Tunkelang // Jul 25, 2010 at 9:25 am

    Dinesh, content-based recommendations of movies based on similarity to a single movie are probably intuitive enough for users not to require a reductionist explanation of the similarity factors. That’s much less true for recommendations based on a set or sequence of movies, especially when the recommendations aren’t just content-based (e.g., collaborative filtering).

    Andy, you can read the full paper here if you have access to the ACM Digital Library. I wish authors would be more consistent about posting their papers on their own web sites!
