Month: September 2008

Migrating Soon

Just another reminder that I expect to migrate this blog to a hosted WordPress platform in the next few days. If you have opinions about hosting platforms, please let me know by commenting here. Right now, I’m debating between DreamHost and GoDaddy, but I’m very open to suggestions.

I will do everything in my power to minimize disruption–not sure how easy Blogger will make it to redirect users to the new site. I’ll probably post here for a while after the move to try to direct traffic.

I do expect the new site to be under a domain name I’ve already reserved: http://thenoisychannel.com. It currently forwards to Blogger.

Uncategorized

Back from the Endeca Government Summit

Post author By Daniel Tunkelang
Post date September 6, 2008
2 Comments on Back from the Endeca Government Summit

I spent Thursday at the Endeca Government Summit, where I had the privilege to chat face-to-face with some Noisy Channel readers. Mostly, I was there to learn more about the sorts of information seeking problems people are facing in the public sector in general, and in the intelligence agencies in particular.

While I can’t go into much detail, the key concern was exploration of information availability. This problem is the antithesis of known-item search: rather than trying to retrieve information you know exists (and which you know how to specify), you are trying to determine if there is information available that would help you with a particular task.

Despite being lost in a sea of TLAs, I came away with a deepened appreciation of both the problems the intelligence agencies are trying to address and the relevance of exploratory search approaches to those problems.

General

Query Elaboration as a Dialogue

Post author By Daniel Tunkelang
Post date September 3, 2008
7 Comments on Query Elaboration as a Dialogue

I ended my post on transparency in information retrieval with a teaser: if users aren’t great at composing queries for set retrieval, which I argue is more transparent than ranked retrieval, then how will we ever deliver an information retrieval system that offers both usefulness and transparency?

The answer is that the system needs to help the user elaborate the query. Specifically, the process of composing a query should be a dialogue between the user and the system that allows the user to progressively articulate and explore an information need.

Those of you who have been reading this blog for a while or who are familiar with what I do at Endeca shouldn’t be surprised to see dialogue as the punch line. But I want to emphasize that the dialogue I’m describing isn’t just a back-and-forth between the user and the system. After all, there are query suggestion mechanisms that operate in the context of ranked retrieval algorithms–algorithms which do not offer the user transparency. While such mechanisms sometimes work, they risk doing more harm than good. Any interactive approach requires the user to do more work; if this added work does not result in added effectiveness, users will be frustrated.

That is why the dialogue has to be based on a transparent retrieval model–one where the system responds to queries in a way that is intuitive to users. Then, as users navigate in query space, transparency ensures that they can make informed choices about query refinement and thus make progress. I’m partial to set retrieval models, though I’m open to probabilistic ones.
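To make the idea concrete, here is a minimal sketch of one turn of such a dialogue over a set retrieval model. Everything in it is an assumption for illustration: the toy corpus `DOCS` and the helpers `retrieve` and `refinements` are hypothetical names, and offering every co-occurring term as a refinement is only a placeholder, not a claim about how a real system should choose them.

```python
# Toy corpus (hypothetical): document id -> set of terms describing it.
DOCS = {
    1: {"search", "transparency", "boolean"},
    2: {"search", "ranking"},
    3: {"search", "transparency", "facets"},
    4: {"blogging", "wordpress"},
}


def retrieve(query_terms):
    """Set retrieval: ids of documents that contain every query term."""
    return {doc_id for doc_id, terms in DOCS.items() if query_terms <= terms}


def refinements(query_terms):
    """Map each term not yet in the query to the number of current
    results that would remain if the user added that term."""
    counts = {}
    for doc_id in retrieve(query_terms):
        for term in DOCS[doc_id] - query_terms:
            counts[term] = counts.get(term, 0) + 1
    return counts


# One turn of the dialogue: the user asks, the system answers with a
# result set plus refinements whose effect is known in advance.
query = {"search"}
print(sorted(retrieve(query)))   # [1, 2, 3]
print(refinements(query))

# The user picks a refinement; the new result set is exactly as large
# as the count the system showed for that refinement.
query = query | {"transparency"}
print(sorted(retrieve(query)))   # [1, 3]
```

The point of the sketch is the transparency: because the results are a set defined by the query, the count shown next to each refinement tells the user exactly what they will get by choosing it.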

But of course we’ve just shifted the problem. How do we decide what query refinements to offer to a user in order to support this progressive refinement process? Stay tuned…

Uncategorized

Migrating to WordPress

Just a quick note to let folks know that I’ll be migrating to WordPress in the next few days. I’ll make every effort to have the move be seamless. I have secured the domain name http://thenoisychannel.com, which currently forwards to Blogger, but will shift to wherever the blog is hosted. I apologize in advance for any disruption.

Uncategorized

E-Discovery and Transparency

Post author By Daniel Tunkelang
Post date September 1, 2008
2 Comments on E-Discovery and Transparency

One change I’m thinking of making to this blog is to introduce “quick bites” as a way of mentioning interesting sites or articles I’ve come across without going into deep analysis. Here’s a first one to give you a flavor of the concept. Let me know what you think.

I just read an article on how courts will tolerate search inaccuracies in e-Discovery by way of Curt Monash. It reminded me of our recent discussion of transparency in information retrieval. I agree that “explanations of [search] algorithms are of questionable value” for convincing a court of the relevance and accuracy of the results. But that’s because those algorithms aren’t sufficiently intuitive for those explanations to be meaningful except in a theoretical sense to an information retrieval researcher.

I realize that user-entered Boolean queries (the traditional approach to e-Discovery) aren’t effective because users aren’t great at composing queries for set retrieval. But that’s why machines need to help users with query elaboration–a topic for an upcoming post.

Uncategorized

POLL: Blogging Platform

I’ve gotten a fair amount of feedback suggesting that I switch blogging platforms. Since I plan to make such changes infrequently, I’d like to get input from readers before doing so, especially since migration may have hiccups.

I’ve just posted a poll on the home page to ask if folks here have a preference as to which blogging platform I use. Please vote this week, and feel free to post comments here.