Categories: General

Clarification before Refinement on Amazon

I just noticed today that a search on Amazon (e.g., this search for algorithms) does not provide the options to sort the results or to refine by anything other than category. Once you do select a category (e.g., books), you are given additional refinement options, as well as the ability to sort.

While I find this interface less than ideal (e.g., even if all of your search results fall within a single category, it still makes you select that category explicitly), I do commend them for recognizing the need to have users clarify before they refine. The implication, and one we’ve been pursuing at Endeca, is that it is incumbent on the system to detect when its understanding of the user’s intent is ambiguous enough to require a clarification dialogue.
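To make that idea concrete, here is a minimal sketch of one way a system might detect category-level ambiguity. The entropy test, the threshold, and the function name are my own illustrative choices, not Amazon’s or Endeca’s actual logic:

```python
import math
from collections import Counter

def needs_clarification(result_categories, threshold=0.6):
    """Return True if results are spread across categories ambiguously enough
    that the system should ask the user to clarify before refining.

    result_categories: one category label per result in the result set.
    threshold: normalized-entropy cutoff; 0.6 is an arbitrary placeholder.
    """
    counts = Counter(result_categories)
    if len(counts) <= 1:
        return False  # a single category: no clarification needed, refine directly
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return entropy / math.log2(len(counts)) > threshold

# A query like "algorithms" whose results straddle several categories
print(needs_clarification(["Books"] * 40 + ["Software"] * 35 + ["Music"] * 25))  # True
print(needs_clarification(["Books"] * 95 + ["Music"] * 5))                       # False
```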

Categories: General

Back from ISSS Workshop

My apologies for the sparsity of posts lately; it’s been a busy week!

I just came back from the Information Seeking Support Systems Workshop, which was sponsored by the National Science Foundation and hosted at the University of North Carolina – Chapel Hill. An excerpt from the workshop home page nicely summarizes its purpose:

The general goal of the workshop will be to coalesce a research agenda that stimulates progress toward better systems that support information seeking. More specifically, the workshop will aim to identify the most promising research directions for three aspects of information seeking: theory, development, and evaluation.

We are still working on writing up a report that summarizes the workshop’s findings, so I don’t want to steal its thunder. But what I can say is that participants shared a common goal of identifying driving problems and solution frameworks that would rally information seeking researchers much the way that TREC has rallied the information retrieval community.

One of the assignments we received at the workshop was to pick a problem we would “go to the mat” for. I’d like to share mine here to get some early feedback:

We need to raise the status of evaluation procedures where recall trumps precision as a success metric. Specifically, we need to consider scenarios where the information being sought is existential in nature, i.e., the information seeker wants to know if an information object exists. In such cases, the measures should combine correctness of the outcome, user confidence in the outcome, and efficiency.
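As a purely hypothetical illustration of what such a combined measure might look like (the multiplicative form, the half-life decay, and the parameter values below are my own placeholders, not a proposal from the workshop):

```python
def existential_search_score(correct, confidence, seconds_elapsed, half_life=60.0):
    """Toy composite measure for an existential search task.

    correct:         True if the seeker's conclusion (exists / does not exist)
                     matches ground truth.
    confidence:      the seeker's self-reported confidence in [0, 1].
    seconds_elapsed: time taken to reach the conclusion.
    half_life:       seconds after which the efficiency factor halves
                     (an arbitrary placeholder).
    """
    outcome = 1.0 if correct else 0.0
    efficiency = 0.5 ** (seconds_elapsed / half_life)
    return outcome * confidence * efficiency

# A correct, fairly confident answer reached in 30 seconds
print(existential_search_score(True, 0.9, 30.0))  # ~0.64
```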

I’ll let folks know as more information is released from the workshop.

Categories: General

What is (not) Exploratory Search?

One of the recurring topics at The Noisy Channel is exploratory search. Indeed, one of our readers recently took the initiative to upgrade the Wikipedia entry on exploratory search.

In the information retrieval literature, exploratory search comes across as a niche topic consigned to specialty workshops. A cursory reading of papers from the major information retrieval conferences would lead one to believe that most search problems boil down to improving relevance ranking, albeit with different techniques for different problems (e.g., expert search vs. document search) or domains (e.g., blogs vs. news).

But it’s not just the research community that has neglected exploratory search. When most non-academics think of search, they think of Google with its search box and ranked list of results. The interaction design of web search is anything but exploratory. To the extent that people engage in exploratory search on the web, they tend to do so in spite of, rather than because of, the tools at their disposal.

Should we conclude then that exploratory search is, in fact, a fringe use case?

According to Ryen White, Gary Marchionini, and Gheorghe Muresan:

Exploratory search can be used to describe an information-seeking problem context that is open-ended, persistent, and multi-faceted; and to describe information-seeking processes that are opportunistic, iterative, and multi-tactical. In the first sense, exploratory search is commonly used in scientific discovery, learning, and decision making contexts. In the second sense, exploratory tactics are used in all manner of information seeking and reflect seeker preferences and experience as much as the goal (Marchionini, 2006).

If we accept this dichotomy, then the first sense of exploratory search is a niche use case, while the second sense characterizes almost everything we call search. Perhaps it is more useful to ask what is not exploratory search.

Let me offer the following characterization of non-exploratory search:

  • You know exactly what you want.
  • You know exactly how to ask for it.
  • You expect a search query to yield one of two responses:
    – Success: you are presented with the object of your search.
    – Failure: you learn that the object of your search is unavailable.

If any of these assumptions fails to hold, then the search problem is, to some extent, exploratory.
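Stated as code, the test is trivial; the predicate names below are mine, purely for illustration:

```python
def is_exploratory(knows_exactly_what, knows_how_to_ask, expects_binary_outcome):
    """A search need is non-exploratory only when all three assumptions hold;
    if any one fails, the need is exploratory to some degree."""
    return not (knows_exactly_what and knows_how_to_ask and expects_binary_outcome)

# A known-item title search in a digital library: non-exploratory
print(is_exploratory(True, True, True))    # False
# "Learn about exploratory search evaluation": exploratory
print(is_exploratory(False, False, False)) # True
```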

There are real non-exploratory search needs, such as navigational queries on the web and title searches in digital libraries. But these are, for most purposes, solved problems. Most of the open problems in information retrieval, at least in my view, apply to exploratory search scenarios. It would be nice to see more solutions that explicitly support the process of exploration.

Categories: General

Enterprise Search Done Right

A recent study from AIIM (the Association for Information and Image Management, also known as the Enterprise Content Management Association) reports that enterprise search frustrates and disappoints users. Specifically, 49% of survey respondents “agreed” or “strongly agreed” that it is a difficult and time-consuming process to find the information they need to do their job.

Given that I work for a leading enterprise search provider, you might think I’d find these results disconcerting, even if the report lays the blame on clients rather than vendors:

But fault does not lie with technology solution providers. Most organizations have failed to take a strategic approach to enterprise search. 49% of respondents have “No Formal Goal” for enterprise Findability within their organizations, and a large subset of the overall research population state that when it comes to the “Criticality of Findability to their Organization’s Business Goals and Success”, 38% have no idea (“Don’t Know”) what the importance of Findability is in comparison to a mere 10% who claim Findability is “Imperative” to their organization.

As I’ve blogged here before, there is no free lunch, and organizations can’t expect to simply plug a search engine into their architectures as if it were an air freshener. But that doesn’t let Endeca or anyone else off the hook. It is incumbent on enterprise search providers, including Endeca, both to set the expectation that enterprise workers will need to help shape the solution by supplying their proprietary knowledge and information needs, and to make that process as painless as possible.

Enterprise search, done right, is a serious investment. But it is also an investment that can offer extraordinary returns in productivity and general happiness. Enterprises need to better appreciate the value, but enterprise search providers need to better communicate the process of creating it.

Categories: Uncategorized

Information Retrieval Systems, 1896 – 1966

My colleague and Endeca co-founder Pete Bell just pointed me to a great post by Kevin Kelly about what may be the earliest implementation of a faceted navigation system. Like every good Endecan, I’m familiar with Ranganathan‘s struggle to sell the library world on colon classification. But it is still striking to see this struggle played out through technology artifacts from a pre-Internet world.

Categories: General

A Game to Evaluate Browsing Interfaces?

I’ve mused a fair amount about how to apply the concept of the Phetch human computation game to evaluating browsing-based information retrieval interfaces. I’d love to be able to better evaluate faceted navigation and clustering approaches, relative to conventional search as well as relative to one another.

Here is the sort of co-operative game I have in mind. It uses shopping as a scenario, and has two roles: the Shopper and the Shopping Assistant.

As a Shopper, you are presented with a shopping list and a browsing interface (i.e., you can click on links but you cannot type free text into a search box). Your goal is to find as many of the items on your shopping list as possible within a fixed time limit. In a variation of this game, not all of the items on the list are findable.

As a Shopping Assistant, you know the complete inventory, but not what the Shopper is looking for. Your goal is to help the Shopper find as many of the items on his or her shopping list as possible within a fixed time limit. On each round of interaction, you present the Shopper with information and links within the constraints of a fixed-size page. The links may include options to select items (the Shopper’s ultimate goal), as well as options that show more items or modify the query.

Either role could be played by a human or a machine, and, like Phetch, the game could be made competitive by having multiple players in the same role. I think the most interesting way to implement such a game would be with human Shoppers and algorithmic Shopping Assistants.
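Here is a minimal sketch of the game loop I have in mind; the page size, time limit, and player interfaces are assumptions of mine, not a worked-out protocol:

```python
import time

PAGE_SIZE = 10       # links the Assistant may present per round (assumption)
TIME_LIMIT = 180.0   # seconds per game (assumption)

def play_game(assistant, shopper, inventory, shopping_list):
    """One cooperative game. The Assistant sees the inventory but not the
    shopping list; the Shopper sees the list but can only click links the
    Assistant presents. The score is the number of list items found in time."""
    found = set()
    clicks = []                           # interaction history visible to the Assistant
    deadline = time.time() + TIME_LIMIT
    while time.time() < deadline and len(found) < len(shopping_list):
        page = assistant(inventory, clicks)[:PAGE_SIZE]   # links for this round
        choice = shopper(page, shopping_list, found)      # the Shopper clicks one (or gives up)
        if choice is None:
            break
        clicks.append(choice)
        if choice in shopping_list:
            found.add(choice)
    return len(found)

# Toy players: an Assistant that pages through unclicked inventory, and a
# Shopper that clicks a needed item when shown one, otherwise the first link.
naive_assistant = lambda inventory, clicks: [x for x in inventory if x not in clicks]
greedy_shopper = lambda page, wanted, found: next(
    (x for x in page if x in wanted and x not in found), page[0] if page else None)

print(play_game(naive_assistant, greedy_shopper,
                [f"item{i}" for i in range(50)], {"item3", "item17", "item42"}))  # 3
```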

Is anyone aware of research along these lines? I’m hardly wedded to the shopping list metaphor; it could be some other task that seems suitable for browsing-oriented interfaces.

Categories: Uncategorized

Max Wilson’s Blog

Max Wilson, a colleague of mine at the University of Southampton who has contributed frequently to the conversation here at the Noisy Channel, just started a blog of his own. Check out Max’s blog here.

His post on exhibiting exploratory behaviour (that’s the Queen’s English to you!) raises an issue at the heart of many of our discussions on this blog: what is exploratory behavior? Is it clarification or refinement? Are users exploring in order to resolve imperfect communication with the information retrieval system, or are they exploring in order to learn?

These are burning questions, and I look forward to learning more about how Max, m.c. schraefel, and others are addressing them.

Categories: General

How Google Measures Search Quality

Thanks to Jon Elsas for calling my attention to a great post at Datawocky today on how Google measures search quality, written by Anand Rajaraman based on his conversation with Google Director of Research Peter Norvig.

The executive summary: rather than relying on click-through data to judge quality, Google employs armies of raters who manually rate search results for randomly selected queries using different ranking algorithms. These manual ratings drive the evaluation and evolution of Google’s ranking algorithms.

I’m intrigued that Google seems to wholeheartedly embrace the Cranfield paradigm. Of course, they don’t publicize their evaluation measures, so perhaps they’re optimizing something more interesting than mean average precision.
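For readers who haven’t run into it, mean average precision is simple to compute from manual relevance judgments like the ones those raters produce; here is a quick sketch with toy data (not Google’s actual measure or data):

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: mean of precision@k over the ranks k
    at which relevant documents appear, divided by the number of relevant docs."""
    hits, precisions = 0, []
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Two toy queries judged by (hypothetical) human raters
print(mean_average_precision([
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2 ≈ 0.833
    (["d5", "d6", "d7"],       {"d7"}),        # AP = (1/3) / 1     ≈ 0.333
]))  # MAP ≈ 0.583
```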

More questions for Amit. 🙂

Categories: Uncategorized

Seeking Opinions about Information Seeking

In a couple of weeks, I’ll be participating in an invitational workshop sponsored by the National Science Foundation on Information Seeking Support Systems at the University of North Carolina – Chapel Hill. The participants are an impressive bunch–I feel like I’m the only person attending whom I’ve never heard of!

So, what I’d love to know is what concerns readers here would like me to raise. If you’ve been reading this blog at all, then you know I have no lack of opinions on research directions for information seeking support systems. But I’d appreciate the chance to aggregate ideas from the readership here, and I’ll try my best to make sure they surface at the workshop.

I encourage you to use the comment section to foster discussion, but of course feel free to email me privately (dt at endeca dot com) if you prefer.

Categories: Uncategorized

Exploratory search is relevant too!

After seeing what the Noisy Channel readership has done to improve the HCIR and Relevance Wikipedia entries, I was thinking we might take on one or two more. Specifically, the Exploratory Search and Exploratory Search Systems entries are, quite frankly, in sad shape.

Between the readership here, the folks involved in HCIR ’08, and the participants in the IS3 workshop, I would think we have more than enough expertise in exploratory search to fix these up.

Any volunteers? For those of you who are doing research in exploratory search, consider that those two Wikipedia pages are the top hits returned when people search for exploratory search on Google.