Faceted Web Search?

Researchers from Microsoft say it’s very challenging. Google is trying, but there’s a long way to go. And Eric Iverson just wrote me to describe his own preliminary efforts to build faceted search on top of Yahoo! BOSS.

I believe there’s a clearly established business case for faceted search inside the enterprise, for site search (especially for retail and media / publishing sites), even for vertical search on the open web. In all of these cases, relevance-ranked results are insufficient to meet a large subset of users’ more exploratory information needs, and HCIR approaches like faceted search are an easy sell.

But it seems much harder to make this case for general web search. The track record of startups in this space isn’t very encouraging. That could be because no one has done it right, but Clayton Christensen’s theory of disruptive innovation would suggest that a successful entrant wouldn’t have to have parity across the board, but would simply need to win on an underserved market segment. Perhaps the increasing use of faceted search for vertical search is how this process is playing out, and faceted search for general web search may end up being a slow agglomeration of verticals.

I’m curious if others have been pursuing efforts like Eric’s. Are the available APIs powerful enough to prototype your own faceted web search engine? If they aren’t, then is this a potential business opportunity for one of the major (or non-major) search engines to promote innovation by offering an open system? Or, if Yahoo! BOSS already offers such an open system, what should we make of the scale of its impact?
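
For anyone tempted to try, here is a minimal sketch of the part that sits on top of the search API: counting facet values over a result set and refining by a selected value. Everything in it is an assumption for illustration. The records are hand-made, and the names `RESULTS`, `facet_counts`, and `refine` are hypothetical rather than part of BOSS or any other API. It also assumes each result already carries facet values, which on the open web is exactly the hard part.

```python
from collections import Counter

# Hypothetical, hand-made result records. In a real prototype these would come
# from a web search API, and the facet values would have to be extracted or
# inferred for each result (the hard, data-quality-bound part).
RESULTS = [
    {"title": "Jaguar XF review", "facets": {"topic": "cars", "type": "review"}},
    {"title": "Jaguar habitat facts", "facets": {"topic": "animals", "type": "reference"}},
    {"title": "Jaguar dealership news", "facets": {"topic": "cars", "type": "news"}},
    {"title": "Untagged page about jaguars", "facets": {}},
]


def facet_counts(results, facet):
    """Count how many results carry each value of the given facet."""
    return Counter(r["facets"][facet] for r in results if facet in r["facets"])


def refine(results, facet, value):
    """Keep only the results whose facet equals the selected value."""
    return [r for r in results if r["facets"].get(facet) == value]


# Show the topic facet for the full result set, then drill into one value
# and recompute the remaining facet on the narrowed set.
cars = refine(RESULTS, "topic", "cars")
print(facet_counts(RESULTS, "topic"))
print(facet_counts(cars, "type"))
```

Note that the untagged result silently drops out of every facet count, a small illustration of how missing metadata undermines the experience.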

By Daniel Tunkelang

High-Class Consultant.

13 replies on “Faceted Web Search?”

Definitely have thought about doing something like faceted search using Yahoo! BOSS after we built

I’m also not sure that it is a winning formula. I believe that for exploratory search you need social signals, as they work better than machine-extracted metadata.


Nice post.

I was surprised you didn’t mention Bing’s efforts with their query categorization system. I think they’ve been leading here.

I’m not sure that I would classify Google Squared as faceted search. I think the “show options” feature, which adds refinements such as time and media type, was an important step.

One of the key reasons for my research interest in vertical search is the ability to apply faceted search. For many verticals, the facets are based on aspects of the content useful to support a specific set of tasks. For web search, a better understanding of common tasks would help define useful facets. The challenge is tackling the diversity of tasks and topics.


Ed, I’m curious what sort of social signals you have in mind. Does anchor text count?

Jeff, thanks. As for Bing, I didn’t think it was pursuing faceted search, beyond the Bing Visual Search that I have blogged about but which doesn’t strike me as more than a demo at this point. Query categorization may well be a key enabler for faceted search on the web, but I don’t feel Bing has connected those dots yet.

I do think that Google’s “tool belt” (aka “show options”) offers faceted search, but I’m looking for more interesting facets than time and media type. And, while you’re right that Squared’s interface doesn’t use the facets it presents as refinement options, it does present results using a faceted classification scheme. It’s a relatively small step to make the presented values act as refinements–the real issue here is improving data quality.

In any case, I think we’re agreeing that faceted search for general web search may really just be a matter of accumulating verticals, perhaps stretching the concept of a vertical a bit to include task focus rather than content focus. Indeed, “general web search” may someday be the last resort when we can’t better constrain the nature of the user’s information need.


Great post, Daniel! I was very encouraged by Google’s “Show options” feature when it came out, though I may be its biggest fan (I use it for about 90% of my Google searches). Sadly, most people I talk to don’t even know it exists. I think what’s happened is that the “OneBox” mantra has been pushed so hard with users that the idea of metadata about the search results as a whole has been made to seem too complex or counter-intuitive.

As for the theory of disruptive innovation, it doesn’t apply to faceted Web search because of the nature of the search market: it can only be broken into if (a) Google buys the technology, or (b) a very large competitor comes along (e.g., Microsoft with Bing). Hence IMHO it is Google’s responsibility to tackle that diversity of tasks and topics as you so eloquently suggest, and then provide this much-needed innovation in ways that are simple and intuitive to users, i.e., by making it part of the core search, not just relegated to a small link on the results page.


If “clustering” search engines count as faceted ones, then there have been a number of attempts, arguably successful but not very deep. Bing is trying because it can, and Google couldn’t risk it because of AdWords, I think.


Eric, I agree that it’s an uphill battle to get users to try new things, and it doesn’t help that the placement of the “Show options” feature is so conservative that most users never even notice it. I’d say the toolbelt feature is primarily aimed at early adopters (like you!).

As for the potential for disruptive innovation, I’d certainly like to see Google at the forefront, but I’m curious as to whether the innovation should occur inside of a major search engine company or outside, i.e., by treating the major search engines as platforms, the approach that BOSS seems to be promoting. And I wouldn’t rule out completely independent innovators: as my colleagues constantly remind me, competition is only a click away.

Lee, I don’t count clustering as faceted search, though it’s certainly a related approach. Regardless, I’m not persuaded that any of the clustering approaches to web search have been successful–I’m curious which attempts you have in mind.

As for Google and Bing, both are trying to promote some amount of interactivity in the interface. Bing’s main distinctions, as I see it, are optimizing more for the “short snout” and making its interaction features more prominent in the interface. I don’t see any of this being a risk to the ad-supported model–obtaining more signal from users would increase the accuracy of ad targeting. I think the real issue is setting up user expectations that can’t be met–the problem is data quality.


Yes, I think anchor text can count as a social signpost, since it’s created by others to point to a link. The text is what they consider to be the ‘signals’ of that destination page. Think of it as ‘information scent’.


Sorry for the late reply. Clusty is doing okay work if you consider their size, and it generally offers a good breakdown into related topics. I agree with you that it is more like browsing than faceted searching.


We’ve been experimenting with broad-brush faceted search over web content on our b2b vertical search engine for a while now. Take a look (facets include topic, content type, source, company, etc.). I’d agree that this is very complex, but the opportunity is huge.


Anthony, thanks for the link. It’s an interesting site. Would be great if you could share some details about the design and implementation — or better yet what you’ve learned from users.

