When I organized the SIGIR 2009 Industry Track last year, my goal was to meet the standard set by the CIKM 2008 Industry Event: a compelling set of presentations that would give researchers an opportunity to learn about the problems most relevant to industry practitioners, and offer practitioners an opportunity to deepen their understanding of the field in which they are working. I was mostly happy with the results last year, and the popularity of the industry track relative to the parallel technical sessions suggests that my assessment is not simply a product of personal bias.
But this year the SIGIR 2010 Industry Track broke new ground. The keynotes were from some of the most senior technologists at the world’s largest web search engines:
- William Chang, Chief Scientist at Baidu
- Yossi Matias, Head of Google’s Israel R&D Center
- Jan Pedersen, Chief Scientist for Core Search at Microsoft (Bing)
- Ilya Segalovich, CTO and Co-Founder of Yandex
I won’t attempt to provide much detail about these presentations, first because they have all been posted online and second because Jeff Dalton has already done an excellent job of posting live-blogged notes. Rather, I’ll offer a few reactions.
William’s presentation on “Future Search: From Information Retrieval to Information Enabled Commerce” unsurprisingly focused on the Chinese search-related market. While the topic of Google in China was an elephant in the room, it did not surface even obliquely in the presentation–and I commend William for taking the high road. As for Baidu itself, its most interesting innovation is Aladdin, an open search platform that allows participating webmasters to submit query-content pairs.
Yossi’s presentation on “Search Flavours at Google” was a tour de force of Google’s recent innovations in the search and data mining space. The search examples mostly focused on the challenges of incorporating context into query understanding–where context might involve geography, time, social network, etc. But some of the more impressive examples showed off using the power of data to predict the present. More than anything, his presentation made clear that Google is doing a lot more than returning the traditional ten blue links.
Jan talked about “Query Understanding at Bing”. I really hope he makes these slides available, since they do a very nice job of describing a machine-learning-based architecture for processing search queries. To get an idea of this topic, check out Nick Craswell’s presentation from last year’s SIGIR.
Finally, Ilya talked about “Machine Learning in Search Quality at Yandex”, the largest search engine in Russia. He described the main challenge in Russia as handling the local aspects of search: he gave as an example that, if you’re in a small town in Russia, then local results in Moscow may as well be on the moon. Local search is a topic close to my heart, not least because it is my day job! Ilya’s talk focused largely on Yandex’s MatrixNet implementation of learning to rank. What I’m surprised he didn’t mention is the challenge of data acquisition–in general, for domains beyond the web, obtaining high-quality data is often a much bigger challenge than filtering and ranking it.
All in all, the four keynotes collectively offered an excellent state-of-the-search-engine address. As with last year, the industry track talks were the most popular morning sessions, and the speakers delivered the goods.