The Noisy Channel

Catching Up With Hunch

May 11th, 2009 · 5 Comments · General

Last week, I stopped by the Hunch office to learn more about what they’re doing, as well as to contribute my own thoughts about socially enhanced decision making. I consider Hunch, like Aardvark, to be an example of social search, though I recognize that I use the term in a broad sense. Perhaps, as Jeremy suggests, it’s better to think of social search and collaborative search as different aspects of multi-person search.

In any case, Hunch is doing some interesting things. Their mission, roughly speaking, is to become a Wikipedia for decision making. They are inspired by human computation success stories like 20Q.net and presumably the ESP Game. Their general approach is to learn about people by asking them multiple-choice questions that help cluster them demographically (“Teach Hunch About You”), and then to create customized decision trees to help people find their own answers to questions. The questions themselves are crowd-sourced from users (though now they are vetted first in a “workshop”).
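
To make the mechanics concrete, here’s a toy sketch of how such a question-driven decision tree might work. To be clear, this is my own illustration, not Hunch’s actual data model or algorithm:

    class Node:
        def __init__(self, question=None, answers=None, recommendation=None):
            self.question = question              # multiple-choice question text
            self.answers = answers or {}          # answer text -> child Node
            self.recommendation = recommendation  # set only on leaf nodes

    def ask(node):
        """Walk the tree one question at a time until we reach a recommendation."""
        while node.recommendation is None:
            print(node.question)
            choices = list(node.answers)
            for i, choice in enumerate(choices, 1):
                print(f"  {i}. {choice}")
            node = node.answers[choices[int(input("> ")) - 1]]
        return node.recommendation

    # A two-question tree for "Which laptop should I buy?"
    tree = Node("What matters most to you?", {
        "Portability": Node("What's your budget?", {
            "Under $1000": Node(recommendation="a lightweight netbook"),
            "Over $1000": Node(recommendation="a premium ultraportable"),
        }),
        "Raw power": Node(recommendation="a desktop-replacement laptop"),
    })

The hard part, of course, isn’t walking the tree; it’s learning from user data which question to ask next, which is presumably where the “Teach Hunch About You” answers come in.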

They’re learning as they go along. For example, they’ve recognized that it’s important to distinguish between objective questions (e.g., concerning the price of a product) and questions of taste (e.g., what is art?). They’re also experimenting with interface tweaks, including giving users more control over what information their algorithms use to rank potential answers, and allowing users to short-circuit the decision tree at any time by skipping to the end.

Perhaps of particular interest to readers here, they’ve made an API available, which you can also play with in a widget on their blog.

As I told my friend at Hunch, I’m still skeptical about decision trees. Maybe I’m a bit too biased toward faceted search, but I don’t like having such a rigid decision-making process. Apparently they’re not wedded to decision trees, but they are understandably concerned that a richer interface might turn off or intimidate ordinary users. I can’t deny that decision trees are simple to use, and I can’t argue with their 77% success rate.

Still, the rigidity of a decision tree leaves me a bit cold. Even if it leads me to the right choice, it doesn’t give me the necessary faith in that choice. Transparency helps, and I like that you can click on “Why did Hunch pick this?” to see what in your question-specific or personal profile led Hunch to recommend that answer. But I’d like more freedom and less hand-holding.

I still have a handful of invites; let me know if you’re interested. As usual, first come, first served.

5 responses so far ↓

  • 1 Christopher // May 11, 2009 at 10:36 pm

    I was very skeptical (still am), as I don’t think decision trees are the best mechanism for this. BUT I’m thinking like a researcher/scientist, not the average consumer who is (I assume) their target, at least for this incarnation of Hunch.

    Now when I found out who their Chief Scientist is, I changed my tune a bit; he has done some great stuff, like using common sense to improve aspects of NL parsing while at MIT (I am highly impressed with the concept).

    The other reason I am now quite interested is the thing you nailed: the availability of an API (hello Wolfram!). While Hunch as a web site is not overly exciting to me, taking their data (whether by them or a 3rd party) into the deeper NLP realm could allow some very interesting things to be built, I think.

  • 2 Bob Carpenter // May 12, 2009 at 1:39 pm

    The main problem I have as a consumer with both of these technologies is recall. Even highly faceted sites like newegg.com make decisions about which facets to present to the user in what order, just like a decision tree.

    If I go to newegg.com, for instance, and type “2gb memory”, by default it pops me into “Guided Search”. But even if I go to “Advanced Search”, I don’t get any of the subcategories I’d expect, like the type of memory (pin config, DDR2/DDR3), the speed of the memory, etc. I can expand subcategories, useful links (?), price, and manufacturer. I have to make several choices, each limiting my choice of products, until I see the “type”, “speed”, and “cas latency” facets. Someone (or some algorithm) decided that manufacturer was more important than the type of memory! And I can’t click on the breadcrumb path at the top of my search to just limit myself to DDR2 and forget the other facets I’ve chosen.

    The second issue is plain old database errors. With faceting at sites like NewEgg and Amazon, the DB annotation isn’t perfect, so you get false positives and false negatives on faceted search that you tend not to get with plain old text search. If I select DDR2, I have to trust that they’ve entered “DDR2” for every product that really is DDR2. Amazon also has serious issues faceting by recording artist (granted, it’s a hard problem), but it means that I can’t click on an artist’s name and expect to see all their other albums. The faceting’s actually hurting recall.
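
    To spell out the recall problem, here’s a toy illustration (mine, obviously, not NewEgg’s code) of how a single missing annotation silently drops a matching product from a faceted filter:

        products = [
            {"name": "Acme 2GB DDR2", "facets": {"type": "DDR2", "brand": "Acme"}},
            {"name": "Bolt 2GB DDR2", "facets": {"brand": "Bolt"}},  # "type" never entered
        ]

        def facet_filter(items, facet, value):
            """Keep only items whose facet annotation matches the chosen value."""
            return [p["name"] for p in items if p["facets"].get(facet) == value]

        print(facet_filter(products, "type", "DDR2"))
        # ['Acme 2GB DDR2'] -- the Bolt stick is a false negative, even though
        # a plain old text search for "DDR2" would have matched its name.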

    Of course, we could hope that they do a better job on the database, selecting facets, etc., but it’s a really difficult problem when there are literally hundreds or thousands of facets in the system and humans doing data entry.

  • 3 Daniel Tunkelang // May 12, 2009 at 9:25 pm

    I think most of these sites put more emphasis on precision than recall; it’s very hard to make the case for the latter, not that I haven’t tried. My colleagues and I are working on some approaches that focus on recall in the face of noisy annotation, exposing a number-of-records vs. topic-drift trade-off (an unsupervised analog of the precision vs. recall trade-off). Demo in progress!

  • 4 jeremy // May 13, 2009 at 1:19 am

    How are you measuring topic drift? Through non-relevance alone, or by some other metric?

  • 5 Daniel Tunkelang // May 13, 2009 at 7:15 pm

    Looking at a few measures, but the one that I’m most interested in is based on relative entropy between the starting set and the expanded set, using distributions over explicit or extracted document annotations. We can also use lexical databases like WordNet, but I (and my fellow Endecans generally) much prefer to work with statistical techniques rather than linguistic ones. And we don’t expect the measure to be right all the time, so we’re also looking at ways to give users not only fine-grained control of the expansion, but also previews of how expanding the query changes the results. After all, I am an HCIR zealot!

    This is work in progress; I’ll say and show more when I can!
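
    In the meantime, here’s a toy sketch of the core relative-entropy computation, over documents represented simply as sets of annotation tags. Illustrative only, and certainly not our actual implementation:

        import math
        from collections import Counter

        def annotation_distribution(docs):
            """Turn annotation tags into a probability distribution."""
            counts = Counter(tag for doc in docs for tag in doc)
            total = sum(counts.values())
            return {tag: c / total for tag, c in counts.items()}

        def topic_drift(starting_docs, expanded_docs, eps=1e-9):
            """Relative entropy D(P_start || P_expanded) over annotation tags.
            Higher values mean the expansion has drifted further off-topic;
            eps keeps the log defined if a tag is somehow absent from the
            expanded set."""
            p = annotation_distribution(starting_docs)
            q = annotation_distribution(expanded_docs)
            return sum(p_t * math.log(p_t / q.get(tag, eps))
                       for tag, p_t in p.items())

        start = [{"memory", "ddr2"}, {"memory"}]
        expanded = start + [{"laptops"}, {"memory", "ddr3"}]
        print(topic_drift(start, expanded))  # grows as the expansion drifts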
