The Noisy Channel

 

Questions. But Why?

August 1st, 2010 · 21 Comments · General

Yahoo! Answers and Answers.com have been around since 2005. But community question answering (as distinct from question answering using natural language processing) has witnessed a resurgence of popularity–at least in the blogosphere and among investors. Quora and Hunch are two of the hottest startups on the web, and Aardvark was acquired by Google earlier this year. Most recently, Ask.com relaunched with a return to its question-answering roots and Facebook began rolling out Facebook Questions.

So there’s no question that community question answering is hot. The question is why? In particular, is community question answering a step forward or backward relative to today’s search engines, or is it something different?

Regarding Facebook Questions, Jason Kincaid writes in TechCrunch:

Given its size, it won’t take long for Facebook to build up a massive amount of data — if that data is consistently reliable, Questions could turn into a viable alternative to Google for many queries.

That’s a big if. But I think the bigger caveat is the vague quantifier “many”. The success of community question answering services will depend on how these services position themselves relative to users’ information needs. Anyone arguing that these services can or should replace today’s web search engines might want to consider the following examples of information needs that are typical of current search engine use:

I hope I don’t have to keep going to convince you that web search engines have earned their popularity by serving a broad class of information needs (i.e., answering lots of questions)–and that’s without even using the wide variety of personalized and social features that web search engines are rapidly developing.

The common thread in the above questions is that they focus on objective information. In general, such questions are effectively and efficiently answered by search engines based on indexed, published content (including “deep web” content made available to search engines via APIs). There’s a lot of work we can do to improve search engines, particularly in the area of supporting query formulation. But it seems silly and wasteful to route such questions to other people–human beings should not be reduced to performing tasks at which machines excel.

That said, I agree with Kincaid that there are many information needs that are well addressed by community question answering. In particular:

  • Questions for which point of view is a feature, not a bug. Review sites succeed when they provide sincere, informed personal reactions to products and services. Similarly, routing questions to people makes sense when we care about the answerer’s point of view. For some questions, I want the opinion of someone who shares my taste (which is what Hunch is pursuing with its “taste graph”). For others, I want a diversity of expert opinions–for which I might turn to Aardvark (which tries to route questions to topic experts), Quora (where people follow particular topics), or LinkedIn Answers. Over time, the answers to many such questions can be published and indexed–and indeed some answers sites receive a large share of their traffic from search engines.
  • Niche topics. As much as web search has improved information accessibility for the “long tail” of published information, the effectiveness of web search can be highly variable for the most obscure information needs. Moreover, this effectiveness depends significantly on the user: some people are better at searching than others, especially in their areas of domain expertise. Social search can help level the playing field. Much as Wikipedia has surfaced much of the expertise at the head of the information distribution, community question answering can help out in the tail.
  • Community for its own sake. Even in cases where search engines are more effective and efficient than community question answering services, some people prefer to participate in a social exchange rather than to conduct a transaction with an impersonal algorithm. Indeed, researchers at Aardvark found that many of the questions posed through their service (pre-acquisition) could be answered successfully using Google. I’ll go out on a limb and assume that Aardvark’s users were early technology adopters who are quite conversant with search engines–but in some cases chose to use a social alternative simply because they wanted to be social.

Conclusions? Community question answering may be overhyped right now, but it isn’t a fad. There are broad classes of subjective information needs that require a point of view, if not a diversity of views. And even if much of the use of community question answering sites is mediated by search engines indexing their archives, there will always be a need for fresh content. I also believe that social search will continue to be valuable for niche topics, since neither search engines nor searchers will ever be perfect.

But I think the biggest open question is whether people will favor community question answering simply to be social. I conjecture that, by very publicly integrating community question answering into its social networking platform, Facebook is testing the hypothesis that it can turn information seeking from a utilitarian individual task into an entertaining social destination. Given Facebook’s highly engaged user population, we won’t have to wait long to find out.

21 responses so far ↓

  • 1 Jon // Aug 1, 2010 at 4:46 pm

    Another interesting site revolving around questions is OpenStudy.com. They fill the niche of academic questions and collaboration.

  • 2 Daniel Tunkelang // Aug 1, 2010 at 4:53 pm

    Jon, thanks for swinging by. But you might have better luck marketing your site here if you opened it up to readers. :-)

  • 3 Suzan Verberne // Aug 2, 2010 at 6:08 am

    Thank you for this interesting post!

    After having written my PhD thesis on why-QA from an NLP (with IR) point of view, I recently decided to write a paper on how to bring why-QA, which is considered a complex NLP problem by the NLP community, to web search.

    The importance of social web QA is large in addition to more traditional sources such as Wikipedia. In my thesis I found that, of 700 why-questions from the Webclopedia set, only 186 had an answer in the English Wikipedia in 2006. In 2009, we were able to find another quarter of the remaining 514 questions in Wikipedia.

    Many of the questions that cannot be answered using Wikipedia alone fall into the categories that you mention. For why-questions, both subjectivity and the long tail of niche topics are very important!

    In the paper that I am preparing, I hope to find out what can be achieved for why-QA when *light-weight* NLP is added to the output of a web search engine.

  • 4 Daniel Tunkelang // Aug 2, 2010 at 9:18 am

    Suzan, thanks for sharing your research. I love Wikipedia, but indeed by its own charter it only covers “notable” information. Web search aims for somewhat higher recall, but why-questions in the tail are certainly a mixed bag, even when they are objective.

    I should clarify that web search can answer subjective questions–to the extent that the subjective answers have been published and indexed. But I think that community question answering systems have an edge here, especially if they are integrated into users’ social networks and foster conversation.

  • 5 jeremy // Aug 2, 2010 at 3:07 pm

    I guess the bigger question is: Which kind of searches are more monetizable?

    If people start using social QA systems for questions where they would never have clicked an ad anyway, then that’s one thing. However, if ad clicks suffer as social QA rises in popularity, that’s something else.

  • 6 Daniel Tunkelang // Aug 2, 2010 at 4:34 pm

    Oh, I think that social / community QA systems may offer lots of opportunity for monetization. Many searches for subjective information are precursors to spending money (e.g., what is the best *) and if users are opening themselves up to persuasion from friends and strangers, they might even be receptive to advertisers. It remains to be seen how advertisers will fit into user experience, but I’m sure that the market will work it out.

  • 7 Maria Droujkova // Aug 4, 2010 at 10:54 am

    Social Q&A systems are “the next generation of email group.” They can provide means for a small niche community to aggregate knowledge in ways much lighter than a wiki, while tracking and making visible individual contributions.

    We just installed OSQA for Natural Math and gave it a trial with a few people. I am loving it. We will be using it much more this Fall for community math projects – Math Clubs, baby math, early algebra and so on.

    http://ask.naturalmath.com/

  • 8 Lecia Kaslofsky // Aug 4, 2010 at 6:47 pm

    Hi Daniel-

    Hope you’ve been well. For “information needs that are well addressed by community question answering,” I’d add questions with complicated concepts but simple words. Questions like: What banks in New York will exchange African currencies? Or What insect bites in San Francisco can cause red bumps on the skin?

    These are examples of objective, non-niche searches that broad search engines sometimes can’t handle and for which the searcher can’t make the keywords much better.

  • 9 Daniel Tunkelang // Aug 4, 2010 at 8:58 pm

    Maria, great to see you guys using OSQA that way. I just learned about MetaOptimize, which is using it for a machine learning community. But I suspect that these sites will mostly be destinations for small, tight communities. The only exception I’ve seen is Stack Overflow, which is a top-1000 site according to Alexa. And of course we’ll see where Quora goes (I’ll have to save Quora, Quo Vadis as a title for a future post!).

    Lecia, great to hear from you! You raise an interesting point: those questions certainly challenge search engines today. But I’m curious — do you think the answer is to route them to other people who are better (and perhaps more domain-informed) searchers? Or for search engines to improve the information-seeking experience for users that come in with those needs?

  • 10 Lecia Kaslofsky // Aug 5, 2010 at 3:19 pm

    I think it could go either way. Personally I think this problem should be solved algorithmically / creatively but hasn’t been fully addressed yet, hence my suggestion to include it among the needs addressed via community question answering.

    As for search engines: I’m interested in what spezify.com is trying to do — giving you a visual map of your search results — although it doesn’t quite work as well as it could (the human brain can analyze dozens of pictures of “african + currencies” related topics and quickly pick which is the most applicable).

    My main concern with people/expert-driven search results is 1) the wait and 2) the strong possibility of error. I used Aardvark in a bar to ask how many ounces an old-fashioned martini glass held (surprisingly a question a Google search couldn’t answer) and I got back three very different responses hours later. Not helpful!

    Best, Lecia

  • 11 Daniel Tunkelang // Aug 6, 2010 at 9:51 am

    Yeah, Spezify is a bit too visual for my eyes, but I suppose it works for some.

    And I agree that people-driven search is slow and variable. That’s why I shudder when I hear people suggesting it could replace search engines–feels like a big step backward.

  • 12 Lecia Kaslofsky // Aug 6, 2010 at 1:50 pm

    Couldn’t agree with you more!

    The one exception is Quora, which has such a high-quality community for start-ups/inet questions. I understand that dynamic (high-quality, topic-specific community) is what makes Stack Overflow continue to work. But that’s not what Facebook Questions and the other “answer” sites seem to be aiming for. Hm, we shall see…

  • 13 Daniel Tunkelang // Aug 6, 2010 at 2:07 pm

    Indeed, Quora and Stack Overflow have fantastic user communities. But those really are communities. I’m sure they would barf on noob users who came in and started asking questions that were easily answered using Google. Indeed, many of the interesting questions on Quora involve unpublished information (e.g., rumors, undisclosed acquisition terms) and the answers in some cases have propagated from Quora to the web at large.

  • 14 Biweekly Links – 08-09-2010 « God, Your Book Is Great !! // Aug 9, 2010 at 7:41 pm

    [...] Questions. But Why? An interesting post about the resurgence of websites with community question [...]

  • 15 Lecia Kaslofsky // Aug 19, 2010 at 4:38 pm

    Have you heard of / tried Swingly.com?

    Yet another Q/A search engine, but this one is NLP-based. Quite the trend!

  • 16 Daniel Tunkelang // Aug 19, 2010 at 4:47 pm

    I requested an invite code for the beta after reading about it on Search Engine Land but never heard back from them. Am certainly curious to see how it works.

  • 17 Lecia Kaslofsky // Aug 19, 2010 at 6:41 pm

    Me too! We’ll see…

  • 18 Nic Fulton // Aug 24, 2010 at 11:13 pm

    I’d imagine time degrades these databases quite quickly. What % of questions/answers will likely be wrong in a year? Sports results, Oscar winners, celebrities (who’s married to whom) etc. – from the surface this feels like the majority of the web ;-) When doing Google queries for hacks, fixes etc. it’s quite common to end up on a forum or FAQ. But it’s even more common that the higher ranked results are older (more inbound links?). So I’ve taken to the habit of adding “2010” to queries in the hope that the forum or FAQ or whatever has timestamps… and I get recent postings.

  • 19 Daniel Tunkelang // Aug 25, 2010 at 2:11 am

    That’s an interesting point. I wonder what fraction of the information in CQA systems is time-sensitive. I suspect the more common case is that time degrades the utility, rather than the accuracy, of the answers–at least for questions relating to current topics and events. Yesterday’s news wraps today’s fish.

  • 20 Rob Gonzalez // Aug 27, 2010 at 8:00 am

    I’ve been using ask.metafilter.com for years now. The community is really thoughtful and provides good answers to everything you can imagine. Need a Dr. recommendation in the Houston area? Trying to figure out a great way to surprise your girlfriend at the airport (one of mine)? Can’t find a reasonably priced moose head to hang on your wall (another one of mine)?

    I check it a few times a week, respond to a question at least once a week, and post my own about once every month or two, and have found it really valuable. I really think it’s the community, though, that makes it special. Much in the same way that Reddit.com seems to have a tight-knit group of folks (especially on the sub-reddits) as compared to Digg, which is more mass media, I think ask.metafilter is just better than Yahoo! Answers for more obscure questions.

  • 21 Daniel Tunkelang // Aug 27, 2010 at 9:46 am

    Sounds like the same reason Quora works. Two questions:

    1) How much does the $5 registration fee help / hurt the site’s utility? I imagine that it keeps out spammers / marketers, but that it also discourages casual passers-by who might altruistically offer help. Hard to imagine Wikipedia charging $5 to would-be editors.

    2) Do people use the archive? That is, do people actually search prior questions before asking new ones? One of my biggest concerns with CQA sites is that they can be extremely inefficient if users are unable or unwilling to do so.
