Google Search Appliance: Now Without HCIR!

In an earlier post, I speculated about why Google is holding back on faceted search. Of course, I was talking about their web search properties, not their enterprise offerings. I thought they’d have realized by now that faceted search–and HCIR in general–is especially important in the enterprise, where you can’t rely on PageRank, anchor text, and SEO–not to mention the large fraction of navigational and straight-to-Wikipedia queries.

But I was wrong. Don’t take it from me–watch the video below (or read this blog post) and listen to what Cyrus Mistry, the product manager for the Google Search Appliance, has to say. I might give him a pass on his dubious conflation of all features other than ranked retrieval with “advanced search”. But here’s a direct quote: “users care about one thing: the right result coming to the top”.

Sigh. I don’t dismiss the value of relevance ranking. Some search queries are easy and clearly point to single documents as answers–and any search engine should do well on them. But lots of queries in site search and enterprise search environments (more so than on the web) don’t have a single best answer. That’s why we have faceted search and interfaces that offer useful information scent to users.

I understand that Google is, on the whole, HCIR-averse. But I expect more from their enterprise division. To be clear, the “side by side” feature that Mistry touts is nice. It reminds me of Blind Search (built by a Microsoft employee in his spare time), and of a relevance ranking evaluator that Endeca customers have been using for years.

But there’s more to search results than ten blue links. Even the Google web folks seem to be slouching towards accepting the importance of interaction. Their enterprise team should be leading, not lagging.

By Daniel Tunkelang

High-Class Consultant.

14 replies on “Google Search Appliance: Now Without HCIR!”

Yes, but McDonald’s does claim that all people want is Big Macs. Mistry’s argument–and his pitch for the side-by-side tool–is that everything but the ranking of results is peripheral and thus shouldn’t play a significant role in how you evaluate an enterprise search solution. Perhaps he prefers that as the playing field for evaluating the GSA–that certainly seems to be the point of the video. But it’s a disservice to enterprises.

Regardless of the GSA’s strengths and weaknesses, there’s no reason for him to be dismissive of the broad range of mechanisms for supporting interaction and exploration. At best, it’s ignorant; at worst, it’s shamelessly partisan.

Well do you think that the reason Google is refraining from HCIR is because they suspect that it will drive away users who are used to 10 blue links? I am sure that such users are plenty and they don’t want to give up on 10 blue links, just like they are not giving up on IE6.

If I take Mistry at his word, he thinks none of that stuff matters, only the result ranking. Perhaps he is catering to a conservative user base. But at least in site search, faceted search has become the standard–as demonstrated by Amazon and eBay, not to mention countless Endeca customers. I know people at Google who acknowledge that. That’s why I’m mystified by Mistry’s dismissiveness.

No, Daniel T, you’re not gettin’ it. Google is taking the role of Voltaire’s Dr. Pangloss. Google is saying that ten ranked links are all for the best, in the best of all possible worlds. Who are we to question their optimism?

Aw, forget it. I’m gonna go cultivate my own garden. 😉

The side-by-side comparison tool was proposed by Thomas and Hawking at CIKM 2006 as an important mechanism for evaluating enterprise search engines.

They also have a position paper in the “Future of IR” workshop at SIGIR’09 about the new C_TEST framework for evaluating and tuning enterprise search engines.

Jeremy, I should know you’d be perfectly Candide with me.

Iadh, I don’t doubt the utility of the mechanism, nor do I fault Mistry for not citing previous work in a press release–though you reinforce that this approach is hardly innovative. Rather, my concern is that he treats the tool’s limitation as a virtue–rather than acknowledging the loss of fidelity in a tool that ignores everything but result ranking, he says that nothing else actually matters to users.

Google announced parametric/faceted search in their v5 release almost 2 years ago. At that point everybody assumed that the GSA supported it – including their current sales team 😉 – but now that v6 is out, it just completely disappeared from their feature list.

The reason, as I mentioned in a comment on your previous post, is that they never actually supported it. The press release was just based on an unsupported experimental feature that calculates facets in JavaScript out of the top 100 results….

No wonder they are now dissing the feature… that’s smart product management!
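
To make the limitation concrete, here is a minimal Python sketch of why facet counts computed from only the top 100 results can mislead. It is not the GSA's code (the comment above says that was JavaScript); the function name `facet_counts`, the `type` field, and the sample data are all hypothetical, chosen only for illustration.

```python
from collections import Counter

def facet_counts(results, field, top_k=None):
    """Count the values of `field` over the results.

    If top_k is given, only the first top_k ranked results are counted,
    mimicking a facet computed client-side from the first page(s) of hits.
    """
    considered = results if top_k is None else results[:top_k]
    return Counter(doc[field] for doc in considered if field in doc)

# Hypothetical ranked result list: 150 PDFs happen to rank above 850 wiki pages.
ranked = [{"type": "pdf"}] * 150 + [{"type": "wiki"}] * 850

top_100 = facet_counts(ranked, "type", top_k=100)  # facets from the top 100 only
full = facet_counts(ranked, "type")                # facets over every match
```

In this made-up data set, the top-100 facet never shows the "wiki" value at all, even though it covers most of the matching documents, and it undercounts the PDFs. Facets are only trustworthy for navigation when they are computed over the whole result set.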

The press release was just based on an unsupported experimental feature

Press releases for unsupported (non-existent?) features? Chrome OS announced 1.5 years ahead of time, while there is nowhere near an actual product?

Is Google turning into Microsoft, with this new vaporware strategy? Google ’09 = Microsoft ’95?

Jeremy, I should know you’d be perfectly Candide with me.

No, think about it! The prefix “pan-” is a combining form that means all or everything. That’s very similar to the motivation behind the name Google, which is so large that it might as well be everything. And they want to index all the world’s information, too.

And “gloss”.. what is something that often conjures thoughts of gloss, of bright and shiny surfaces? That’s right. Chrome. Chrome is glossy.

So there you have it. Pangloss = Google Chrome. It fits. Oh, and Google hires tons of PhDs. They make a big deal out of that. Dr. Pangloss.

And don’t forget Dr. Pangloss’s “optimistic” attitude. Everything is the way that it is, because what is, is the best of all possible outcomes in all possible worlds. In other words, the Googly way to do things is the best and only way to do things. You will take your ranked ten links and love it.

Now excuse me while I go “pangloss” some information on the web.

Maybe it’s a security-related decision.
I’ve seen their security model: it provides for call-outs to the content source system (SharePoint, filesystem, etc.) and checks the user’s authorisation on just the top results.
With no local cache of all user/document access permissions in the GSA, it is inefficient to call out to the source’s security system when aggregating facet totals for a query–that could be millions of document security checks.
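
The cost argument can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GSA's security model: `first_page`, `secure_facets`, the `source` field, and the rule that the user may see every other document are all hypothetical, and each call to `is_authorized` stands in for one call-out to the source system.

```python
from collections import Counter

def first_page(ranked_docs, is_authorized, page_size=10):
    """Fill one page of results, checking authorization only as far as needed."""
    page, checks = [], 0
    for doc in ranked_docs:
        checks += 1  # one (hypothetical) call-out to the source system
        if is_authorized(doc):
            page.append(doc)
            if len(page) == page_size:
                break
    return page, checks

def secure_facets(ranked_docs, field, is_authorized):
    """Facet totals must only count visible documents, so every match is checked."""
    counts, checks = Counter(), 0
    for doc in ranked_docs:
        checks += 1
        if is_authorized(doc):
            counts[doc[field]] += 1
    return counts, checks

# Hypothetical query matching 10,000 documents; the user may see the even-numbered ones.
docs = [{"id": i, "source": "sharepoint" if i % 4 == 0 else "filesystem"}
        for i in range(10_000)]
can_see = lambda doc: doc["id"] % 2 == 0

page, page_checks = first_page(docs, can_see)
facets, facet_checks = secure_facets(docs, "source", can_see)
```

Under these assumptions, filling a ten-result page takes 19 authorization checks, while correct facet totals take 10,000, one per matching document. The gap grows linearly with the size of the result set.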
