The Noisy Channel

 

Search Innovation: Why Can’t We All Just Get Along?

June 26th, 2009 · 5 Comments · General

It’s unusual for HCIR to make it into the mainstream business press, so I was delighted when Pete Barlas reached out to me in connection with an article he published Wednesday in Investor’s Business Daily, entitled “Bing Feature Has Many Fathers; Rivals Lining Up To Take Credit”.

The genesis for the article was a dispute between Microsoft and Hakia. Hakia’s chief operating officer, Melek Pulatkonak, claims that Bing copied Hakia’s “galleries” features:

“We were approached by Microsoft to show them how the Hakia galleries worked, and we did, and now they have a similar feature — we showed them how to do it,” she said. “We were surprised that it is a featured part of and the most differentiated part of Bing.”

I like the folks at Hakia (I blogged about them a while ago), but I think they’re overreacting here. The idea of using query refinement to help users focus queries certainly predates both companies, and Hakia, by its own admission, is a relative newcomer to the scene, having launched in 2006.

But the story doesn’t end there. Barlas received a statement from Microsoft claiming that Bing implements faceted search. That’s true for some parts of the site, but it feels like a half-truth. Bing’s general web search offers search suggestions, but does not implement faceted search.

The plot thickens. Vivisimo’s chief scientist, Jerome Pesenti, claims that his company was “really the first one to provide a broad categorized search”. But, as Barlas points out, what Vivisimo offers is clustering, which is neither categorization (at least some of us make a sharp distinction between supervised categorization into predetermined categories and unsupervised clustering) nor faceted search. Marti Hearst offers a good analysis (including a critique of Vivisimo’s Clusty.com) in “Clustering versus faceted categories for information exploration”.
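For readers who find the three terms easier to tell apart in code, here is a minimal sketch of my own, not anything taken from Bing, Hakia, Vivisimo, or Endeca. The toy result set, the field names (brand, type, color), the category keyword table, and the similarity threshold are all hypothetical, and the keyword rules merely stand in for a trained classifier.

```python
from collections import Counter

# Hypothetical toy result set; field names are invented for illustration.
RESULTS = [
    {"title": "Nike Air running shoe", "brand": "Nike", "type": "shoe", "color": "red"},
    {"title": "Nike running jacket", "brand": "Nike", "type": "jacket", "color": "black"},
    {"title": "Adidas trail running shoe", "brand": "Adidas", "type": "shoe", "color": "black"},
    {"title": "Adidas soccer ball", "brand": "Adidas", "type": "ball", "color": "white"},
]


def facet_counts(results, facets):
    """Faceted search: summarize results along several independent,
    predefined dimensions at once."""
    return {f: Counter(r[f] for r in results) for f in facets}


def refine(results, **selected):
    """Faceted search: narrow by any combination of facet values."""
    return [r for r in results if all(r[f] == v for f, v in selected.items())]


# Categorization: one label per result, drawn from a fixed, predetermined set.
# These keyword rules are an assumed stand-in for a supervised classifier.
CATEGORY_KEYWORDS = {"Footwear": {"shoe"}, "Apparel": {"jacket"}, "Equipment": {"ball"}}


def categorize(result):
    words = set(result["title"].lower().split())
    for category, keywords in CATEGORY_KEYWORDS.items():
        if words & keywords:
            return category
    return "Other"


def cluster(results, threshold=0.3):
    """Unsupervised clustering: greedy single-pass grouping by word overlap
    (Jaccard similarity with each cluster's first member). The groups are
    discovered from the results themselves and carry no predefined labels."""
    clusters = []
    for r in results:
        words = set(r["title"].lower().split())
        for members in clusters:
            seed = set(members[0]["title"].lower().split())
            if len(words & seed) / len(words | seed) >= threshold:
                members.append(r)
                break
        else:
            clusters.append([r])
    return clusters


if __name__ == "__main__":
    print(facet_counts(RESULTS, ["brand", "type", "color"]))
    print(refine(RESULTS, brand="Adidas", type="shoe"))
    print([categorize(r) for r in RESULTS])
    print([[r["title"] for r in c] for c in cluster(RESULTS)])
```

The point of the sketch is the shape of each output: facets give several orthogonal dimensions that can be combined freely, categorization gives one label from a list someone defined in advance, and clustering gives groups whose boundaries depend entirely on the result set and the similarity measure.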

I take some of the credit for explaining these distinctions to Barlas, and he got it–though I’m sure some of the credit is due to others he talked with, including IDC analyst Sue Feldman and Danny Sullivan, editor-in-chief of Search Engine Land.

Squabbling among vendors makes for good press, and there’s a legitimate business interest when companies start threatening each other with lawsuits, as Hakia has said it’s considering. And there’s certainly room for arguments over who has a better approach or implementation.

But let’s–and here I speak as someone who often represents Endeca in these discussions–at least agree to standardize on basic terms that have now been around for a while, like categorization, clustering, and faceted search. There’s enough of a vocabulary problem for our users; let’s not cultivate one in our press relations and legal posturing.

5 responses so far ↓

  • 1 Jurn // Jun 28, 2009 at 3:15 pm

    The browser ‘interviews’ the user on install (50 questions?), maybe also trawls their bookmarks for keywords, maybe even harvests their Facebook page. Then it embeds a blended personal profile in the browser. When searching the web, the browser then reconfigures and optimises any vague search terms seamlessly, before they even hit the engine. No need for the poor user to get RSI clicking through endless facets, clusters and drop-down suggestion menus for each and every search query they make.

  • 2 Daniel Tunkelang // Jun 28, 2009 at 4:14 pm

    I’m all for smarter browsers–I love the “awesome” bar on Firefox, and I’ve heard that Chrome users enjoy similar features. But I’m skeptical about a browser that “reconfigures and optimises any vague search terms seamlessly”. I like transparency and control, and the last thing I want is my browser second-guessing my intentions.

    But to each his or her own. My point is not to criticize the various ways that people are trying to innovate, but rather to suggest that we’d be well served by using a consistent vocabulary to describe the various technical approaches to search–and by being cognizant of their history.

  • 3 Jerome Pesenti // Jun 29, 2009 at 1:22 pm

    You are right that there is some confusion about terminology but Bing’s approach on their web results is actually an interesting mix of things. It offers query suggestions but also groups the results for some queries (try nike) which is neither faceting nor pure clustering.

    With regard to “taking credit”, my only intent was to say that organizing search results in categories/clusters/facets is far from a new idea, which makes Hakia’s claim somewhat odd. I believe that Vivisimo was the first company to offer search results grouping of any kind commercially (we sold and demonstrated the technology on vivisimo.com as early as June 2000; Endeca was created in 1999 but only launched in 2001). In no way do we claim to have invented the idea: early prototypes had been created by Marti Hearst herself as well as by Oren Etzioni’s team, who had a live site running in 1999.

  • 4 Daniel Tunkelang // Jun 29, 2009 at 1:41 pm

    Jerome, great to see you here, and good point re: Bing’s grouping of results. You’re right that Bing’s inclusion of previews of the query suggestions in the main results is a kind of categorization or clustering (I can’t tell if it’s supervised or unsupervised). But, as we agree, it isn’t faceted search.

    As for credit, I think we’re also agreeing that techniques for organizing search results have been around for a while. For example, Northern Light was around before Endeca, Vivisimo, Hakia, and Bing. And of course the Scatter/Gather work from PARC is even earlier, though I don’t know if any of it made it into commercial products.

    In any case, I think it’s silly for anyone other than the lawyers to fight over credit, and I think that even the lawyers will have a tough time with the nuances.

  • 5 Jerome Pesenti // Jun 29, 2009 at 2:31 pm

    You are right, Northern Light came in earlier, in 1997 (and they still seem to be around!). They were doing search results faceting on top of pre-defined categories. They even claimed that search results clustering was infringing on their patent, which it couldn’t, given Marti’s prior work. Not only do people claim credit all around, but there are lots of crazy patents in that area…
