
RecSys 2012: Beyond Five Stars

I spent the past week in Dublin attending the 6th ACM International Conference on Recommender Systems (RecSys 2012). This young conference has become the premier global forum for discussing the state of the art in recommender systems, and I’m thrilled to have had the opportunity to participate.

Sunday: Workshops

The conference began on Sunday with a day of parallel workshops.

I attended the Workshop on Recommender Systems and the Social Web, where I presented a keynote entitled “Content, Connections, and Context”. Major workshop themes included folksonomies, trust, and pinning down what we mean by “social” and “context”. The most interesting presentation was “Online Dating Recommender Systems: The Split-complex Number Approach”, in which Jérôme Kunegis modeled the dating recommendation problem (specifically, the interaction of “like” and “is-similar” relationships) using a variation of quaternions introduced in the 19th century! The full workshop program, including slides of all the presentations, is available here.

Unfortunately, I was not able to attend the other workshops that day, which focused on Human Decision Making in Recommender Systems, Context-Aware Recommender Systems (CARS), and Recommendation Utility Evaluation (RUE). But I did hear that Carlos Gomez-Uribe delivered an excellent keynote at the RUE workshop on the challenges of offline and online evaluation of Netflix’s recommender systems.

Monday: Experiments, Evaluations, and Pints All Around

Monday started with parallel tutorial sessions. I attended Bart Knijnenburg‘s tutorial on “Conducting User Experiments in Recommender Systems“. Bart is an outstanding lecturer, and he delivered an excellent overview of the evaluation landscape. My only complaint is that there was too much material for even a 90-minute session. Fortunately, his slides are online, and perhaps he’ll be persuaded to expand them into book form. Unfortunately, I missed Maria Augusta Nunes and Rong Hu‘s parallel tutorial on personality-based recommender systems.

Then came a rousing research keynote by Jure Leskovec on “How Users Evaluate Things and Each Other in Social Media“. I won’t try to summarize the keynote here — the slides of this and other presentations are available online. But the point Jure made that attracted the most interest was that voting is so predictable that results are determined mostly by turn-out. Aside from the immediate applications of this observation to the US presidential elections, there are many research and practical questions about how to obtain or incent a representative participant pool — a topic I’ve been passionate about for a long time.

The program continued with research presentations on multi-objective recommendation and social recommendations. I may be biased, but my favorite presentation was the work that my colleague Mario Rodriguez presented on multiple-objective optimization in LinkedIn’s recommendation systems. I’ll post the slides and paper here as soon as they are available.

Monday night, we went to the Guinness Storehouse for a tour that culminated with fresh pints of Guinness in the Gravity Bar overlooking the city. We’re all grateful to William Gosset, the chemist at the Guinness brewery who introduced the now-ubiquitous t-test in 1908 as a way to monitor the quality of the beer. A toast to statistics and to great beer!
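For anyone who wants to raise a more quantitative toast, here is a minimal sketch of the kind of check Gosset’s test enables, with made-up numbers and Python’s scipy standing in for his pencil-and-paper calculations:

```python
# Illustrative only: a one-sample t-test, the tool Gosset devised, used here
# to ask whether a batch's measurements are consistent with a target value.
from scipy import stats

target_gravity = 1.045  # hypothetical target for the batch
batch_samples = [1.046, 1.044, 1.047, 1.043, 1.045, 1.048]  # made-up measurements

t_stat, p_value = stats.ttest_1samp(batch_samples, popmean=target_gravity)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value would flag the batch as deviating from the target;
# a large one suggests it is within normal variation.
```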

Tuesday: Math, Posters, and Dancing

Tuesday started with another pair of parallel tutorial sessions. I attended Xavier Amatriain‘s tutorial on “Building Industrial-scale Real-world Recommender Systems” at Netflix. It was an excellent presentation, especially considering that Xavier had just come from a transatlantic flight! A major theme in his presentation was that Netflix is moving beyond the emphasis on user ratings to make the interaction with the user more transparent and conversational. Unfortunately, I had to miss the parallel tutorial on “The Challenge of Recommender Systems Challenges” by Alan Said, Domonkos Tikk, and Andreas Hotho.

Tuesday continued with research papers on implicit feedback and context-aware recommendations. One that drew particular interest was Daniel Kluver’s information-theoretic work to quantify the preference information contained in ratings and predictions, measured in preference bits per second (paper available here for ACM DL subscribers). And Gabor Takacs had the day’s best line with “if you don’t like math, leave the room.” He wasn’t kidding!

Then came the posters and demos — first a “slam” session where each author could make a 60-second pitch, and then two hours for everyone to interact with the authors while enjoying LinkedIn-sponsored drinks. There were lots of great posters, but my favorite was Michael Ekstrand‘s “When Recommenders Fail: Predicting Recommender Failure for Algorithm Selection and Combination“.

Tuesday night we had a delightful banquet capped by a performance of traditional Irish step dancing. The dancers, girls ranging from 4 to 18 years old, were extraordinary. I’m sorry I didn’t capture any of the performance on camera, and I’m hoping someone else did.

Wednesday: Industry Track and a Grand Finale

Wednesday morning we had the industry track. I’m biased as a co-organizer, but I heard resounding feedback that the industry track was the highlight of the conference. I was very impressed with the presentations by senior technologists at Facebook, Yahoo, StumbleUpon, LinkedIn, Microsoft, and Echo Nest. And Ronny Kohavi‘s keynote on “Online Controlled Experiments: Introduction, Learnings, and Humbling Statistics” was a masterpiece. I encourage you to look at the slides for all of these excellent presentations.

Afterward came the last two research sessions, which included the best-paper awardee “CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering“. I’ve been a fan of “less is more” ever since seeing Harr Chen present a paper with that title at SIGIR 2006, and I’m delighted to see these concepts making their way to the RecSys community. In fact, I saw some other ideas, like learning to rank, crossing over from IR to RecSys, and I believe this cross-pollination benefits both fields. Finally, I really enjoyed the last research presentation of the conference, in which Smriti Bhagat talked about inferring and obfuscating user demographics based on ratings. The technical and ethical facets of inferring private data are topics close to my heart.

Finally, next year’s hosts exhorted this year’s participants to come to Hong Kong for RecSys 2013, and we heard the final conference presentation: Neal Lathia’s 100-euro-winning entry in the RecSys Limerick Challenge.

Thursday: Flying Home

Sadly, I missed the last day of conference-related activities: the doctoral symposium, the RecSys Data Challenge, and additional workshops. I’m looking forward to seeing discussion of these online, as well as reviewing the very active #recsys2012 tweet stream.

All in all, it was an excellent conference. LinkedIn, Netflix, and other industry participants made up about a third of the attendees, and there was a strong conversation bridging the gap between academic research and industry practice. I appreciated the focus on the nuances of evaluation, particularly the challenges of combining offline evaluation with online testing and of ensuring that the participant pool is robust. The one topic where I would have liked to see more discussion is how to create robust incentives for people to participate in recommender systems. Maybe next year in Hong Kong?

Oh, and we’re hiring!

By Daniel Tunkelang

High-Class Consultant.

7 replies on “RecSys 2012: Beyond Five Stars”

“A major theme in his presentation was that Netflix is moving beyond the emphasis on user ratings to make the interaction with the user more transparent and conversational. ”

As we’ve discussed for years, I really believe that conversationalism is the way forward. But for just as many years, the IR and RecSys communities have been stuck in an implicit-feedback, non-conversational mode.

Do you see this starting to change? Or is it still just one or two talks here or there on explicit interaction, with the majority of the community still focused on implicit?

“In fact, I saw some other ideas, like learning to rank, crossing over from IR to RecSys, and I believe this cross-pollination benefits both fields. ”

Actually, let me be a bit provocative and claim that, when these systems move more into conversational and other explicit interaction approaches, there really isn’t a difference between IR and RecSys. I know, I know. The RecSys folks like to be very loud about how what they’re doing is not IR.

But think about what conversational, explicit IR is all about. You enter a query. You get results: both in the forms of documents and query suggestions. You give explicit (relevance) feedback on either/both. Which gives you more documents and more query suggestions, and so on.

What’s a conversational recommender system? You enter a document as a query. You get results: both in the form of documents and query suggestions. You give explicit (relevance) feedback on either/both. Which gives you more documents and more query suggestions, and so on. (Documents, of course, is just shorthand for objects of interest in the domain, e.g., books, music, movies, etc.)

The only difference is that RecSys started with “item as query” and IR started with “feature as query”. But once they got into the conversational loop, I do not see a difference between the two, conceptually. One is feature->item->feature->item->…. The other is item->feature->item->feature->item->…
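To make that concrete, here is a rough sketch of the shared loop (hypothetical names, not any real system’s API); the only difference between the two fields is the type of the initial seed:

```python
# Hypothetical sketch: one conversational loop covers both IR and RecSys.
# The seed is a feature/query (IR) or an item (RecSys); after the first turn
# the two are indistinguishable.
def conversational_loop(seed, retrieve, get_feedback, max_turns=5):
    query = seed
    items, suggestions = [], []
    for _ in range(max_turns):
        # Results come back as both objects of interest and query suggestions.
        items, suggestions = retrieve(query)
        # The user gives explicit (relevance) feedback on either or both.
        feedback = get_feedback(items, suggestions)
        if feedback is None:  # the user is satisfied (or gives up)
            break
        query = feedback      # the feedback drives the next turn
    return items
```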

So, what I’m trying to say is that it is great that RecSys is borrowing from IR. But, at the end of the day, if both RecSys and IR continue to become explicitly conversational, then RecSys is IR. And vice versa.

Non?


The only difference is that RecSys started with “item as query” and IR started with “feature as query”.

…and sometimes, the recommender doesn’t even start with item as query. For example, you can seed your Pandora radio station recommender by running a text query for an artist or song. Or you can start your Netflix movie recommendation by doing a text search for one movie that you like, and going from there.

RecSys is classic IR! It might not be NDCG web IR. But look back 15 years into what IR is/was all about: the relevance feedback, the interaction, etc. And then recognize that even RecSys often starts with an “ad hoc” query.


I don’t see a sharp line between RecSys and IR — in fact, I’m used to thinking of RecSys as a special case of IR, something I tried to avoid saying at the RecSys conference. 🙂

As you put it, the only question is the starting point. If the user explicitly provides a clear signal of intent, we tend to call it IR. If the user does not provide any explicit signal of intent, we tend to call it RecSys. Everything in between is a gray area.

