The Noisy Channel

 

Go Shopping, Be Social

October 14th, 2009 · 10 Comments · Uncategorized

If you’re into search startups, then today’s a great day to check out what a couple of them are up to.

TheFind just launched (or relaunched?) a “buying engine” that aspires “to help every shopper find exactly what they want to buy, and to help every merchant, large and small, to reach those shoppers.” It has some nice interface elements, but I can’t say I’m sold on the overall user experience.

Meanwhile, Aardvark just launched a web-based version of its social search application. The site urges users to “ask any question in plain English, and Aardvark will discover the perfect person in your network to answer…in under 5 minutes!” As I’ve commented before, I think they need to embrace the philosophy of “when in doubt, make it public”. But hey, they made Time’s top 50 websites for 2009, so perhaps they are right to ignore my advice.

10 responses so far ↓

  • 1 Ethan Bauley // Oct 14, 2009 at 4:58 pm

    Daniel, viz the name of this blog (and your point about public-ness), one of Aardvark’s stated goals is to focus attention/reduce noise…check out this post from their CTO

    http://blog.vark.com/?p=201

  • 2 Daniel Tunkelang // Oct 14, 2009 at 5:03 pm

    I don’t have a problem with routing questions to get them answered efficiently. I do think, however, that the answers themselves should be public. It’s not like keeping the process inside your network gives you any useful expectation of privacy–and the lack of a public–and searchable–answers database destroys much of the value of the service, at least in my opinion.

  • 3 Ethan Bauley // Oct 14, 2009 at 5:14 pm

    They also talked about this at length in the video interview Mike Arrington did a few weeks ago. Here’s the germane passage:

    VOICE 1: And you have to be careful about becoming radically different things because lots of people are asking you to do that. So, you know, the two big things for us have always been point systems and sharing content like making all other content explicitly public and…

    INTERVIEWER: You haven’t done that yet?

    VOICE 1: We haven’t done that.

    INTERVIEWER: That’s something (unintelligible).

    VOICE 1: Well, it’s something that people are constantly coming to us saying like hey, do this and then it would really be great. And it’s – for us, you know, fairly different from the brand. We’ve always positioned Aardvark to be more like a communication channel where, you know, I’ll ask you something over e-mail. I’m asking you over e-mail, you know, it’ll be a little bit weird if A, my question and your answer showed up publicly on the web.

    INTERVIEWER: Yeah.

    VOICE 1: And B, if Gmail sent me back, you know, similar conversations that other people have had in Gmail.

    INTERVIEWER: Yeah.

    VOICE 1: That said, there’s certainly a place for it, there’s a way that you can selectively share the content, there’s a way that you can give people sort of personalized…

    INTERVIEWER: Maybe de-personalize it a little bit, too.

    VOICE 1: Potentially.

    INTERVIEWER: Yeah.

    VOICE 1: I mean, there is certainly a subset of the content, which is less personal or there certainly might be a scale. It’s just, you know, we haven’t found the really clear way to keep doing what we want to do well and yet, address sort of that comment that the users have come to us. What we keep in the back of our mind…

    INTERVIEWER: Make it public, searchable, indexed.

    VOICE 1: That’s right.

    INTERVIEWER: Yes. So, when does that happen?

    VOICE 1: We just don’t know yet.

    INTERVIEWER: Because you haven’t quite figured how to do it?

    VOICE 2: Well actually, we want to quite end up that way. I mean, we try to figure out what’s – if our users suggest a feature to us, say what’s the urge behind that? Sometimes the urge is – well, other sites do it this way. And that’s not sufficient for us to do it. Sometimes there’s an urge like, well I want to learn the terminology(ph) of the communities.

    INTERVIEWER: Sort of, you want Aardvark to start answering questions directly based on the database. It’s all this artificial intelligence stuff that you’re so good at.

    VOICE 1: Now, it’s exact. Now, we want humans to answer the questions.

    VOICE 2: We want Aardvark to be better at routing them. So, their database…

    INTERVIEWER: Just better at routing them.

    VOICE 2: In the database we used to figure out who’s best in answering and seeing over time what happens.

    INTERVIEWER: But let’s say that I asked for a really good restaurant recommendation in San Francisco. And it turns out that a month before, one of my best friends asked the same thing on Aardvark and got stellar answers that he marked just like they’re perfect, perfect, perfect. Well, you turn back and say, hey, this is Aardvark, we think that actually you might like these restaurants. I mean, you must be thinking about that as an AI guy, right?

    VOICE 2: Well, there are two ways to look at it. I’m a recovering AI guy. What that means is, I no longer think that my machine is as intelligent as any human in terms of understanding what the other human really wants. When I ask a question, I want you to understand me. It’s the same reason when you go to a store and there’s someone who really helps you and listens to what you need. So, in this situation that you suggest, Aardvark will do two things. One, it will go to your friend and say, hey, you know, we think you can answer Michael’s question about restaurants. You see, just received these answers, and we’ll go to these answers. There is a different channel where on the (unintelligible) angle we’re saying, hey, here’s an opportunity to…

    INTERVIEWER: Bypass actual results…

    VOICE 2: …to bypass (unintelligible) answers we have clearly labeled, when we think that’s useful for our users. But, right now, our users are getting a really good experience without any recycled comment.

    INTERVIEWER: But this is not – I’m sorry, I just finished what you – all four seasons of Battlestar Galactica, the sort of stuff someone likes, have you guys seen that, the new Battlestar, yeah? But, I mean if you think about it, you could just be having all these users sort of training the big Aardvark mind right now towards eventually being able to answer things intelligently.

    VOICE 2: It’s one way of looking at it.

    INTERVIEWER: You just refused. You don’t think that’s the future.

    VOICE 2: I don’t think that’s the future. In some ways, they’re trying to figure out our mind to learn how to tap them better. The same way that the web isn’t, you know, the same as human intelligence, it has a different sort of knowledge

    http://www.techcrunch.com/2009/07/07/you-put-your-aardvark-in-my-twitter-bonus-interview-with-founders/

  • 4 Daniel Tunkelang // Oct 14, 2009 at 5:29 pm

    Well, they have a viewpoint. I’m clearly more aligned with Arrington on this. And, to be clear, this doesn’t have to be about AI. Yahoo Answers isn’t AI–it’s human-answered questions, stored in public so they’ll be searchable. You’d think that in this eco-friendly, cost-conscious age, they’d embrace recycling!

    Look at this blog as an example. Our exchange is personal, but benefits from being publicly accessible. I wouldn’t put this much effort into an email exchange with a casual acquaintance. Would you? Look at the failure of personal–or even corporate–knowledge management systems as compared to the uber-public Wikipedia.

    To be clear, there is a role for private communication, where the benefit of privacy justifies the cost of inefficient communication. Indeed, experts charge for this privilege. But Aardvark offers neither privacy nor publication. I think it’s the worst of both worlds–and I push the issue because it seems easy to fix.

  • 5 jeremy // Oct 14, 2009 at 5:47 pm

    I think the issue is one of whether the recycled answer comes from the same underlying information need as the current user’s question.

    Topics drift, needs drift, time and places change.

  • 6 Ethan Bauley // Oct 14, 2009 at 5:52 pm

    It’s interesting to me because I agree with you on an intuitive basis. However, there are of course:

    - practical considerations (e.g. product roadmap)
    - strategic questions (marginal value to Vark of public display when they have a product unique enough to create a bottleneck they can leverage later…viz Marshall Kirkpatrick’s post on LinkedIn today: even though I agree openness/public is strategically inevitable, it’s useful to consider when that inevitability occurs. It’s certainly not today, tomorrow, or yesterday, or last year for LinkedIn).
    - X factor of the “unknown unknowable” vision/innovative potential of the team (I briefly consulted with them early last year and my instinct is to give them the benefit of the doubt by a wiiiiiiide margin ;-)

    At any rate, my main point is that I think it’s useful to question the “o yeah it always has to be public all the time from the get go”, not least because that is my initial reaction, too.

    Related, HP is a client of mine and Labs has a super interesting project called Taxonom…you can likely deduce what it addresses ;-)

    http://www.docstoc.com/docs/6298443/In-depth-description-of-the-HP-Labs-Taxonomcom-cloud-service

    Worth a peek

  • 7 Daniel Tunkelang // Oct 14, 2009 at 8:22 pm

    Well, the team is smart, so I hope they know what they’re doing. I’m not saying anything here that I haven’t told Max already. But he and his colleagues are the ones with skin in the game–I’m just a kibitzer dispensing free advice.

    I’m familiar with Taxonom–I met Pankaj at a conference earlier this year. Interesting presentation, but there’s a bit more detail in this paper:

    http://knoesis.wright.edu/students/topher/publications/Growing_fields_of_interest.pdf

  • 8 Lee // Oct 15, 2009 at 3:23 pm

    The thefind.com people say they have crawled 500,000 web sites. They extracted product name, description, price, etc. from 500,000 differently structured HTML sites. But how do they separate the data (product name, price, etc.) of the main product from the data of related/upsell products or other text on the page, and how do they do this type of (automated) extraction? I can’t think of any way other than describing/formalizing each of the 500,000 sites’ HTML structure and then crawling the site.

    Can someone please help me learn more about this type of (automated) extraction? Any paper, guide or site is appreciated.
    Thanks

  • 9 Daniel Tunkelang // Oct 15, 2009 at 3:34 pm

    Lee, I’m not sure who’s published in this area, but a lot of folks have worked on it. You might start by looking at David Huynh’s Sifter project:

    http://simile.mit.edu/wiki/Sifter

  • 10 Powering Social Search through Personal APIs | Taylor Davidson // Nov 16, 2009 at 12:54 pm

    [...] open to better ideas. ** Related discussion between Ethan Bauley and Daniel Tunkelang about social shopping and Aardvark. *** Also related, Personal APIs: Better Living Through [...]