
Exploring Nuggetize

I’ve been exchanging emails with Dhiti co-founder Bharath Mohan about Nuggetize, an intriguing interface that surfaces “nuggets” from a site to reduce the user’s cost of exploring a document collection. Specifically, Nuggetize targets research scenarios where users are likely to assemble a substantial reading list before diving into it. You can try Nuggetize on the general web or on a particular site that has been “nuggetized”, e.g., a blog like this one or Chris Dixon’s.

I’m always happy to see people building systems that explicitly support exploratory search (and am looking forward to seeing the HCIR Challenge entries in a week!). Regular readers may recall my coverage of Cuil, Kosmix, and Duck Duck Go. And of course I helped build a few of my own at Endeca. So what’s special about Nuggetize?

Mohan describes it as a faceted search interface for the web. I’ll quibble here–the interface offers grouped refinement options, but the groups don’t really strike me as facets. Moreover, the interface isn’t really designed to explore intersections of the refinement options–rather, at any given time, you see the intersection of the initial search and a currently selected refinement. But it is certainly an interface that supports query refinement and exploration.
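To make the quibble concrete, here is a minimal sketch of the two behaviors. It is only an illustration, not Nuggetize's implementation: the function names, the `postings` map from refinement option to matching document IDs, and the sample data are all hypothetical.

```python
def refine_single(base_results, postings, selected):
    """One refinement at a time: the initial results AND one selected option."""
    return base_results & postings[selected]


def refine_faceted(base_results, postings, selections):
    """Faceted navigation: the initial results AND every selected option."""
    matched = set(base_results)
    for option in selections:
        matched &= postings[option]
    return matched


# Hypothetical data: document IDs returned by the initial search,
# and the documents matching each refinement option.
base = {1, 2, 3, 4}
postings = {
    "topic:search": {1, 2, 5},
    "author:mohan": {2, 3},
}

one_at_a_time = refine_single(base, postings, "topic:search")
combined = refine_faceted(base, postings, ["topic:search", "author:mohan"])
print(one_at_a_time)  # {1, 2}
print(combined)       # {2}
```

In the first mode, choosing a new option replaces the previous one; in the second, selections accumulate and narrow the result set together.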

The more interesting features are the nuggets and the support for relevance feedback.

The nuggets are full sentences, and thus feel quite different from conventional search-engine snippets. Conventional snippets serve primarily to provide information scent, helping users quickly determine the utility of a search result without the cost of clicking through to it and reading it. In contrast, the nuggets are document fragments that are sufficiently self-contained to communicate a coherent thought. The experience suggests passage retrieval rather than document retrieval.

The relevance feedback is explicit: users can thumbs-up or thumbs-down results. After supplying feedback, users can refresh their results (which re-ranks them) and are also presented with suggested categories to use for feedback (both positive and negative). Unfortunately, the research on relevance feedback tells us that, helpful as it could be for improving the user experience, users don’t bite. But perhaps users in research scenarios will give it a chance–especially with the added expressiveness and transparency of combining document and category feedback.
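For readers unfamiliar with how explicit feedback can drive re-ranking, here is a minimal sketch using the classic Rocchio update over bag-of-words vectors. I don't know what Nuggetize actually does under the hood, so treat this as a textbook stand-in: the function names, the weights (alpha, beta, gamma), and the tiny [Jaguar] collection are all assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average term weights across a non-empty list of vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds counts
    return {t: w / len(vectors) for t, w in total.items()}


def rocchio(query, liked, disliked, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward liked documents and away from disliked ones."""
    updated = {t: alpha * w for t, w in query.items()}
    if liked:
        for t, w in centroid(liked).items():
            updated[t] = updated.get(t, 0.0) + beta * w
    if disliked:
        for t, w in centroid(disliked).items():
            updated[t] = updated.get(t, 0.0) - gamma * w
    # Common simplification: drop terms whose weight is not positive.
    return {t: w for t, w in updated.items() if w > 0}


def rerank(query_text, docs, liked_ids=(), disliked_ids=()):
    """Return document IDs ordered by similarity to the updated query."""
    vecs = {doc_id: vectorize(text) for doc_id, text in docs.items()}
    q = rocchio(
        vectorize(query_text),
        [vecs[i] for i in liked_ids],
        [vecs[i] for i in disliked_ids],
    )
    return sorted(docs, key=lambda d: cosine(q, vecs[d]), reverse=True)


# Hypothetical collection for an ambiguous query.
docs = {
    "car": "jaguar car dealer luxury sedan",
    "cat": "jaguar big cat rainforest predator",
    "os": "jaguar mac os release",
    "zoo": "jaguar habitat rainforest zoo",
}
ranked = rerank("jaguar", docs, liked_ids=["cat"], disliked_ids=["car"])
print(ranked)
```

With a thumbs-up on the animal document and a thumbs-down on the car document, the animal-related results rise and the car result sinks, which is the kind of disambiguation an ambiguous query like [Jaguar] needs.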

Overall, it is a slick interface, and it’s nice seeing the various ideas Mohan and his colleagues put together. There’s certainly room for improvement–particularly in the quality of the categories, which sometimes feel like victims of polysemy. Open-domain information extraction is hard! Some would even call it a grand challenge.

Mohan reads this blog (he reached out to me a few months ago via a comment), and I’m sure he’d be happy to answer questions here.

By Daniel Tunkelang

High-Class Consultant.

23 replies on “Exploring Nuggetize”

Thanks Daniel, for a very fair review. We are very eager to read feedback on our system.

A few suggested uses:

1) Try “Get nuggets for a feed”, and drop a URL, like Paul Graham’s latest on Yahoo! [http://www.paulgraham.com/yahoo.html]

2) Try “Get nuggets for a topic”, and drop a topic, like [Faceted Navigation]. A query like [HCIR] or [Jaguar] is a good candidate to test out our relevance feedback.

3) Try “Get nuggets for a feed”, and drop an RSS feed you follow often, [http://feeds.feedburner.com/venturebeat]

The categories get better with more documents. The nuggets are decent starting from 1 document in your collection.

Just in case you plan to “Publish” the nuggets you create, you can use a token “betaone” when prompted.

Bharath
