Categories
General

SIGIR: Meet the Who’s Who of Search and Information Retrieval

Matt Cutts. danah boyd. Bruce Croft. Marti Hearst. What do these people have in common? If you’re thinking that they are some of the biggest names in the research and practice of search and information retrieval, then you have at least part of the answer. For full credit, the answer is that they are some of the people who will be presenting during the SIGIR Industry Track on Wednesday, July 22nd at the Sheraton Boston Hotel.

There have been some changes from the original program. As noted above, Bruce Croft and Marti Hearst are now participating. They will offer research counterpoints to the panels of industry practitioners (vendors and analysts). Autonomy bowed out of the vendor panel; instead, we’re including Raul Valdes-Perez, the executive chairman (and founder) of Vivisimo. We also received regrets from the Open Calais folks at Thomson Reuters; instead, we’ll hear from Evan Sandhaus, semantic technologist at the New York Times.

Here is the full list of participants, in order of appearance:

If you’re registered for the full SIGIR conference, then you are entitled to attend any or all of the Industry Track at no additional charge.

Otherwise, there’s also a one-day registration option for people only interested in attending the Industry Track. The cost for that one-day option is $350 (half of that for students). I should be able to get you the early registration rate of $300 if you contact me as soon as possible, and I can try to negotiate a group rate if a company wants to send at least four people.

If you live in the Boston area and are interested in search and information retrieval (or you know people who are), this is an incredible opportunity to see the worlds of research and practice come together. I’m excited to be organizing it, and looking forward to attending it. If you have not yet registered but are interested in attending, please let me know ASAP, and I’ll see what I can do.

Categories
General

Design For Interaction: My SIGMOD Slides

These are the slides I presented at SIGMOD a couple of weeks ago. The animation on a few of the slides doesn’t come through on SlideShare, but you can always download the PowerPoint if you are so inclined. The other talks in the invited session on Human-Computer Interaction with Information, by Ed Chi and Jeff Heer, were fantastic, as was the joint keynote on visualization by Fernanda Viégas and Martin Wattenberg.

Categories
Uncategorized

Faceted Search Book Is Shipping

Amazon and Barnes & Noble are both shipping the faceted search book, so hopefully all of our pre-orders are finally leaving the warehouses. My apologies for the delays, and my thanks to everyone who has been so patient. Apparently a few people had trouble using the publisher’s site; at this point I suggest using Amazon or BN, since both offer competitive prices.

Categories
General

Catching Up On Last Week’s News

I hope everyone had a great week! It looks like I missed some interesting / controversial stories in the tech news / blogosphere, the most notable being:

Quick reactions:

Regarding the anti-SQL movement, I would have thought the main complaint would be that SQL is too arcane a language for ordinary users to ever use it directly. Instead, the article discusses developers’ complaints about databases, and these are mostly about price, speed, and scale. Evidently even free, open-source databases like MySQL are losing favor relative to tools like Hadoop and Hypertable that don’t offer support for SQL. Of course, this picture comes from a meetup of 150 people that might not be entirely representative of information technology workers.

I know first-hand from my experience at Endeca that, to quote Michael Stonebraker, the “one size fits all” approach to databases is an idea whose time has come and gone. At Endeca, we have built our own special-purpose database to address information needs ill-served by the available OLTP and OLAP technologies. Still, I think it’s premature to declare the death of SQL or of relational databases. But why let that stand in the way of a good story?

On to the open-source search engine comparison. I won’t rehash the critique of the study, which you can find in the 80+ comments from folks like Jeff Dalton, Bob Carpenter, and Otis Gospodnetic. Perhaps the most salient point is that it’s not clear how much sense it makes to perform “out of the box” evaluations. In any case, my impression is that Lucene is by far the dominant player in the open-source search space; the study, if it has any effect, will only be to reinforce that dominance.

And finally, the big news from the big G: a Google Operating System. Even my mom (who couldn’t name an existing operating system) was asking me about it, so clearly this one has made it into the mainstream media. And yet I don’t see why this is such a big deal. We have netbooks, and we even have Linux-based netbooks. As far as I’ve heard, the latter are popular with geeks and cheapskates, but that’s about it–most people are willing to pony up the few extra dollars for Windows XP. Will Google launching a netbook-oriented OS significantly affect this market? I suspect the only route to success is if they meet non-technical users’ needs (browsing, email, media, light document editing) while minimizing their overhead (maintenance, security, compatibility). Will they be a better Ubuntu? Perhaps, much in the way that Chrome is trying to be a better Firefox. Why Google chooses to build its own free, open-source products rather than contribute to mature open-source projects is a mystery to me, but it’s their money and time to spend.

I think that covers the week’s big stories–or at least those that matter most to Noisy Channel readers. Somehow I didn’t manage to come up with an IR / HCIR angle on the Michael Jackson story, or perhaps it’s just that Danny Sullivan beat me to it.

Anyway, I’m back in the saddle, and should soon be back to my normal posting volume. Thank you all for being patient.

Categories
Uncategorized

Taking Time Off

I’ll be offline for about a week, returning on July 13th. No, I’m not going to Argentina or even hiking the Appalachian Trail, but I am going off the grid to spend quality time with my wife and daughter. See you all soon!

Categories
General

The Wild World of SIGMOD

I’m on my way home from SIGMOD 2009, my first experience attending a conference on databases. Actually, it was my first experience attending two conferences on databases, since SIGMOD was held in Providence concurrently with PODS.

Ed Chi, Jeff Heer, and I were invited to SIGMOD for a session in which we shared our perspectives with the database community on Human-Computer Interaction with Information. Yes, database people care about HCIR too! As the SIGMOD organizers correctly pointed out, people like us who are interested in HCI don’t often show up at database conferences, and I am both grateful and impressed that they took the initiative to remedy that. In a similar spirit, they invited Martin Wattenberg and Fernanda Viégas to deliver a joint keynote about visualization. Even for those of us who were already familiar with their Many Eyes work, it was a delightful presentation.

Of course, it was a great opportunity for me to learn what database people normally worry about. The conference opened with a keynote by Hasso Plattner, co-founder of software giant SAP. The main take-away of his presentation was that column stores and multi-core computation have improved the efficiency of databases by at least two orders of magnitude, opening a new world of possibilities in information access.

Column stores are pretty hot in this community. I didn’t make it to the research session devoted to them (and which included the paper that received the best-paper award), but I did get to attend the presentation that has attracted the most attention outside SIGMOD, “A Comparison of Approaches to Large-Scale Data Analysis”, a paper by seven authors that compares Hadoop (the open-source implementation of Google’s MapReduce approach), an unspecified commercial row-storage (i.e., conventional) relational database, and the Vertica column-store database. MIT Professor Sam Madden did the presentation, but the author most identified with this work is probably Michael Stonebraker. Indeed, Madden had a number of slides where he asked WWMSS? (“What would Mike Stonebraker say?”), with pithy quotes like “Hadoop is ‘go slow’ for OLAP.” Madden delivered an excellent presentation, but his analysis, which was less than favorable to Hadoop, did rile up some of the audience. Specifically, Berkeley professor Joe Hellerstein suggested that the comparisons were “using the wrong y-axis” by comparing the approaches based on processing time. It would indeed be interesting to compare the development time that was required to use each of the tools the authors compared.

Some other talks I attended and enjoyed:

I also saw two really nice demos:

All in all, I enjoyed three fun and intellectually stimulating days, complete with great food and a harbor cruise in Newport. I’m grateful to the SIGMOD organizers for the invitation to spend a few days in their world, and look forward to integrating what I learned here into my own work.