A Topology of Search Concepts

Vegard Sandvold has an interesting post entitled “Help Me Design a Topology of Search Concepts” in which he visualizes assorted search approaches in a two-dimensional space, the two dimensions being the degree of information accessibility and whether the approach is algorithm-powered or user-powered.

His four quadrants:

  • Low information accessibility + algorithm-powered = simple search (e.g., keyword search)
  • Low information accessibility + user-powered = superficial search (e.g., collaborative filtering)
  • High information accessibility + algorithm-powered = ingenious search (e.g., question answering)
  • High information accessibility + user-powered = diligent search (e.g., faceted search)
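The four quadrants can be written down as a small lookup table. This is only an illustrative sketch in Python: the enum and function names (`Accessibility`, `Power`, `quadrant`) are my own and do not come from Vegard's post, and treating each axis as a binary choice is a simplifying assumption, since the original diagram describes degrees along each axis.

```python
from enum import Enum


class Accessibility(Enum):
    """Degree of information accessibility (hypothetical two-level simplification)."""
    LOW = "low"
    HIGH = "high"


class Power(Enum):
    """Whether the approach is mainly algorithm-powered or user-powered."""
    ALGORITHM = "algorithm"
    USER = "user"


# Quadrant name and example approach, exactly as listed in the post.
QUADRANTS = {
    (Accessibility.LOW, Power.ALGORITHM): ("simple search", "keyword search"),
    (Accessibility.LOW, Power.USER): ("superficial search", "collaborative filtering"),
    (Accessibility.HIGH, Power.ALGORITHM): ("ingenious search", "question answering"),
    (Accessibility.HIGH, Power.USER): ("diligent search", "faceted search"),
}


def quadrant(accessibility: Accessibility, power: Power) -> tuple[str, str]:
    """Return (quadrant name, example approach) for a position on the two axes."""
    return QUADRANTS[(accessibility, power)]


if __name__ == "__main__":
    name, example = quadrant(Accessibility.HIGH, Power.USER)
    print(f"{name} (e.g., {example})")
```

A table like this makes the gaps easy to see: it says nothing about approaches that sit between the extremes of either axis.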

I’m not sure how I feel about the quadrant names (though I like how my employer and I are champions of diligence!), but I do like this attempt to lay out different approaches to supporting information seeking, and I like his choice of axes.

More importantly, I hope this analysis helps advance our ability as technologists to match solutions to information seeking problems. Many of us have an intuitive sense of how to do so, but I rarely see principled arguments–particularly from vendors who may be reluctant to forgo any use case that could translate into revenue.

Of course, it would be nice to quantify these axes, or at least to formalize them a bit more rigorously. For example, how do we measure the amount of user input into the process–particularly for applications that may involve human input at both indexing and query time? Or how do we measure information accessibility in a corpus that might include junk (e.g., spam)?

Still, this is a nice start as a framework, and I’d be delighted to see it evolve into a tool that helps people make technology decisions.

By Daniel Tunkelang

High-Class Consultant.

10 replies on “A Topology of Search Concepts”

What’s the difference, really, between user-powered and algorithm-powered? Ideally, I want information retrieval systems that are strongly algorithm-powered in that they reveal concepts and linkages and information that I would and could not have discovered on my own, but that give me, the user, direct manipulative control over which of those pieces are important to me. There should really be a symbiosis between algorithm and user. Or is that what he already means by user-powered?

Hi Daniel,
Thanks for the comment and valuable feedback. I’m glad you accept the champion title 🙂

I’m a big fan of organizing ideas, and I hope to make this start of a framework more rigid. The axes definitely need more work. Quantifying and formalizing would be good, and any pointers to relevant literature would be very helpful.

Cheers!

I’ve seen a number of these graphs over the years that try to separate vertical vs. web search, deep vs. shallow content, etc. All of them fail in that they tend to show user features that are currently in vogue in the quadrants. Users want to find information: they don’t care about the tools (except that they’re easy) or the source (except that it’s available, authoritative and as close to free as possible). To quote ArnoldIT: Users don’t want to search, they want to find.

Jeremy, I think Vegard should answer that himself, but I see it as a matter of relative contribution. I’d put traditional Boolean search at the extreme where the user does everything and the system does nothing, black-box best-first search at the other extreme (where the system does almost everything), and HCIR approaches in the middle.

Mark, your point is well taken, but I think you’re mixing up audiences. Users may be content to be oblivious of technology, but application developers–and perhaps business more generally–can’t afford to be. Keeping users happy requires understanding what solutions will best meet their needs. And, to Vegard’s credit, the approaches he describes are not faddish features, but established approaches to information seeking.

@jeremy
My initial idea was to have the horizontal axis represent the extent to which human or machine intelligence is primarily responsible for resolving the user’s information need. In my head that would place black-boxed best-first systems to the extreme left (where the user does little beyond supplying the initial query), while HCIR would be closer to the middle/right. Here the algorithm still does something (like retrieving a set of relevant candidates), but it’s largely up to the user to increase the precision.

I think it’s difficult to find examples of purely user-powered systems. Twitter Search is my best guess. Would that be similar to traditional Boolean search, as mentioned by Daniel?

@Mark Johnson
Daniel is right. Users need things that just work, but we who are making these things need to understand why they work (or not). I hope you find it useful. Cheers!

@jeremy
I think systems with a good symbiosis between algorithms and users could be centered on the horizontal axis. But where would that place systems based on collaborative filtering? Lots of user feedback, but little direct control. Hm…

Thanks for the feedback so far, guys! It’s really helpful. I’ll be the first to admit that my visualization is simplified and has a few shortcomings, but leaving things out is part of the art of concept modeling.

To flip it around, let me try to explain the story of each quadrant, and hopefully that will give you some ideas for how the axes could be different.

Diligent Search can make the user responsible both for making latent structures of the data more apparent and for iteratively expressing their information needs. That would include both faceted search and systems like Freebase, where users are the main driving force.

Ingenious Search would try to achieve greater precision with algorithms for data mining, clustering and NLP. That would include systems like Wolfram Alpha, Powerset and Grokker.

Superficial Search does not aim to reach deep into the information space, making direct manipulative control less important. Instead, group behavior is used to surface popular and recent content. Twitter and Amazon are great examples.

Simple Search is all you may need for navigational queries and other examples of directed search. The user may know exactly what he’s looking for, he knows the document exists, and he knows how to formulate his query to find it. Like Google back in 1998.

This is wonderful stuff, I love the diagram too! For a blog post this is great. As far as serious research goes, yes it needs to be more thorough 🙂
