The Noisy Channel

 

Exploring Exploratory Search

November 18th, 2009 · 13 Comments · General

Google’s recently released Image Swirl is slick. But I’ve been struggling to figure out whether it’s useful or simply a showcase for cool technology.

And that’s prompted me to think about the overloaded term “exploratory search”. A while back, I tried to define exploratory search based on what it is not. This time, let me aim to positively characterize what I see as its two primary use cases:

  1. I know what I want, but I don’t know how to describe it.
  2. I don’t know what I want, but I hope to figure it out once I see what’s out there.

The first use case cries out for tools that support query refinement or elaboration. Existing tools span a range from suggesting spelling corrections (aka “did you mean”) to offering semantically or statistically related searches that hopefully provide the user with at least a step in the right direction. One of my favorite approaches, faceted search, is primarily used to support query refinement through progressive narrowing of an initial search query.
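To make the "progressive narrowing" idea concrete, here is a minimal sketch of faceted refinement; the catalog, facet names, and values are hypothetical, not any particular system's data model:

```python
# Faceted search as progressive narrowing: each facet selection
# filters the previous result set. Catalog and facets are made up.

CATALOG = [
    {"brand": "Sony", "color": "red", "type": "camera"},
    {"brand": "Sony", "color": "black", "type": "camera"},
    {"brand": "Canon", "color": "black", "type": "camera"},
    {"brand": "Canon", "color": "red", "type": "printer"},
]

def facet_counts(items, facet):
    """Count how many of the remaining items carry each value of a facet."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts

def refine(items, facet, value):
    """Narrow the result set by one facet selection."""
    return [item for item in items if item.get(facet) == value]

# Two successive selections strictly narrow the results.
results = refine(CATALOG, "brand", "Sony")
results = refine(results, "color", "red")
print(results)  # [{'brand': 'Sony', 'color': 'red', 'type': 'camera'}]
```

The appeal for the first use case is that every narrowing step is predictable: selecting “Sony” can only ever yield Sony products, so the user always knows why the results changed.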

The second “I don’t know what I want” use case is fuzzier. In the language of machine learning, this use case is unsupervised, while the previous one is supervised. In general, it’s a lot harder to define or evaluate outcomes for unsupervised scenarios. Indeed, Hal Daumé has argued that we should only do unsupervised learning if we do not have a trustworthy automatic evaluation metric. That’s a strong position, and you can see some of the counterarguments in his comment thread. But, going back to our scenario, it’s really hard to judge the effectiveness of tools like similarity browsing when they support exploration in the absence of any concrete goal.

With that in mind, I’ll reserve judgment on the utility of tools like Image Swirl. To the extent that it aims at the first use case, clustering images for a particular search, I’m ambivalent: I’d prefer a more transparent interface, one that gives me a greater sense of control over the navigational experience. I suspect it is more aimed at the second use case, offering a compact visualization of what is out there.

Besides, as some folks have brought up at the HCIR workshops, it’s important that we make information seeking fun. And Swirl certainly scores on that front.

13 responses so far ↓

  • 1 Fritz Knabe // Nov 18, 2009 at 10:49 am

    One of the key challenges with mechanisms for exploratory search is for the user to be able to easily construct a model for what the search interface is doing. The user should be able to guess why the tools or choices are presented, be able to distinguish between the choices, and, when a choice is made, see results consistent with the model. This is one of the great strengths of faceted search: if there are a number of choices for “brand,” then you can be quite confident that selecting “Sony” will take you to only Sony products.

    All too often, the realities of data can work against this. On many catalog sites supporting query refinement by color, for example, the stock images available are not necessarily in the chosen color. So, refining to “red” products will often show pictures of black ones, and the fact that they are available in red is not immediately apparent (worse, of course, is when the reference to “red” is only incidental in their description).

    Image Swirl looks like it has a way to go to get over the threshold of user comprehension. My sample queries produced many very similar pictures. What was the difference between one group or another? I couldn’t tell, so I couldn’t understand how the mechanism worked. I couldn’t decide how to move forward, and I wasn’t sure what would be different from clicking one image group versus another.

    To make this work, it really needs to be apparent to the user what the choices are. Is it clear that all these pictures are together because they’re night scenes? Or because it’s the same building? Or because there’s a particular face? Etc.

    That said, what’s being attempted here is an immensely hard problem, and I applaud the effort.

  • 2 dinesh vadhia // Nov 18, 2009 at 11:05 am

    A colleague sent me the link to Image Swirl this morning, and after a quick play I had the same reaction, i.e. what is the purpose of it?

    Maybe someone knows, but isn’t this just image search by (cleaned-up) image labels/tags?

    For example, if you click on ‘cup’ in the far-right list, you get http://image-swirl.googlelabs.com/html?query=cup#.

    Two types of cups are being differentiated – cups for drinking and cups as in sports trophies – which is good, but it doesn’t seem satisfactory. A case of the cup hasn’t runneth over!

  • 3 Greg Linden // Nov 18, 2009 at 9:15 pm

    I have the same reaction as Dinesh and his colleague. This is snazzy, but serves no apparent purpose.

    If the purpose is exploratory, then the criteria for the clusters need to be intuitive and obvious. In my usage, clusters usually had nearly identical images, occasionally visually similar images, and often apparently unrelated images. Links between clusters seemed useful and understandable only a slim majority of the time.

    So, sure, a fun toy, but not more than a toy until the relevance is much higher and the clusters either intuitive or explained.

  • 4 Daniel Tunkelang // Nov 18, 2009 at 9:24 pm

    Indeed, I’d like to see a more transparent interface, like that of Artist Rising. But I concede that the available data for images on the open web makes such a textual interface a formidable challenge. I’ll certainly be the first to cheer any steps toward overcoming it.

  • 5 jeremy // Nov 19, 2009 at 1:05 am

    @Fritz: “One of the key challenges with mechanisms for exploratory search is for the user to be able to easily construct a model for what the search interface is doing.”

    @dinesh: “what is the purpose of it.”

    @Greg: “So, sure, a fun toy, but not more than a toy until the relevance is much higher and the clusters either intuitive or explained.”

    I agree, exploratory search requires explanatory search.

    BTW, Greg, to tie this back to an earlier discussion: that’s one of the ongoing questions I have about using A/B testing in a web environment. Suppose you launch a new feature, algorithm, or interaction design. You split your users into buckets and see whether there is enough uptake on the new bit. However, if the users don’t have a clear model of what it is you’re trying to do for them, how the tool is meant to work and when and why, how do you really know that they aren’t just using the new tool (or interaction mechanism, or whatever) to do things in their old way, the way to which they have become accustomed? Especially since whenever I’ve been bucket tested (and I think I have noticed a couple of times when I’ve been a test subject), the new bit has never come with any sort of explanation of what is going on. It’s just suddenly different.
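    (For concreteness, the bucket split described above is typically something like the following deterministic hash-based assignment; the experiment name and split fraction here are hypothetical, not any particular site's setup.)

```python
# Minimal sketch of deterministic A/B bucket assignment.
# The experiment name and treatment fraction are hypothetical.
import hashlib

def assign_bucket(user_id: str, experiment: str,
                  treatment_fraction: float = 0.5) -> str:
    """Hash (experiment, user) so each user lands in a stable bucket."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Map the first 8 hex digits onto [0, 1) and compare to the split.
    score = int(digest[:8], 16) / 0x100000000
    return "treatment" if score < treatment_fraction else "control"

# The same user always sees the same variant within an experiment;
# a different experiment name reshuffles the population.
print(assign_bucket("user-42", "new-feature-test"))
```

    Note that nothing in this mechanism tells the user anything: the assignment is invisible by design, which is exactly why the new bit arrives with no explanation.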

    I suppose that there is nothing intrinsic to A/B testing that says a new system can’t come with a bit of text, overlay, whatever, that tells the user a little bit about it, how it should be used, what it’s trying to do. Nevertheless, it rarely if ever does include such things.

    Why is that?

  • 6 jeremy // Nov 19, 2009 at 1:52 am

    Daniel, have you seen this article? Would you say your model above encompasses the entire set of activities that Marchionini lists as “exploratory” (see Figure 1)?

    http://www.ischool.utexas.edu/~i385t-sw/readings/Marchionini-2006-Exploratory_Search.pdf

  • 7 Daniel Tunkelang // Nov 19, 2009 at 9:17 am

    I’ve read the article, and it’s certainly close. What I want to emphasize is the difference between the two kinds of uncertainty associated with exploratory search: uncertainty about how to express a known intent vs. uncertain intent. I’m not sure that everyone in the field agrees that this is the critical consideration, and I certainly don’t think interface designers always ask themselves which of the use cases I describe they are aiming to support. Particularly in the first use case, exploration is a means, not an end.

  • 8 dinesh vadhia // Nov 19, 2009 at 10:38 am

    @Fritz, “That said, what’s being attempted here is an immensely hard problem, and I applaud the effort.”

    @Daniel, “But I concede that the available data for images on the open web makes such a textual interface a formidable challenge. I’ll certainly be the first to cheer any steps toward overcoming it.”

    Grrrrr! Hopefully by the tail-end of the year. Just too busy writing other search services to get the new image-search demo online using flickr images.

  • 9 jeremy // Nov 19, 2009 at 1:05 pm

    @Fritz: “That said, what’s being attempted here is an immensely hard problem”

    Is exploratory search like NP?

    http://irgupf.com/2009/11/19/google-is-to-exploratory-search-as-p-is-to-np/

  • 10 Daniel Tunkelang // Nov 19, 2009 at 4:40 pm

    It’s an interesting analogy. But at least for P vs. NP the problems are well defined–what we don’t know is how to characterize the solution space. Here, we are confronted with a subset of the exploratory search space where, at least in my view, we’re not sure how to define the problem–at least in a way that we can evaluate solutions.

  • 11 jeremy // Nov 19, 2009 at 5:57 pm

    Oh, sure. I’m not saying that information seeking can (or even should) be defined with the same mathematical rigor. The objective function in IR, “relevance”, is ultimately messy and inconsistent in a way that formal mathematics isn’t. The analogy does break down at a certain point.

    Another thought: does it matter to your model how much information is required to satisfy the info need? E.g. does it matter whether the user is looking for a single, existing piece of information vs. a set (a synthesis, summary, or contrast) of information?

    Is that issue completely orthogonal to your two dimensions, or is it related?

  • 12 Daniel Tunkelang // Nov 20, 2009 at 9:11 am

    Well, if someone ever proves that P = NP, then I’m willing to reconsider everything else I know!

    As for the amount of information associated with the information need, I’d say the issue is orthogonal in theory, but not in practice. In theory, looking for sets (or set-based analyses) is just like looking for documents, at least with respect to the simple exploratory search dichotomy / spectrum I’m describing. In practice, however, looking for a set tends to be more complicated than looking for a single document or information snippet. In particular, you almost never know if you’ve found everything.

  • 13 Weekly Search & Social Coverage: 11/24/09 | Search Engine Journal // Nov 24, 2009 at 12:13 pm

    [...] Exploring Exploratory Search – Noisy Channel [...]
