A Question of User Expectations

Ideally, a search engine would read the user’s mind. Shy of that, a search engine should provide the user with an efficient process for expressing an information need and then provide the user with results relevant to that need.

From an information scientist’s perspective, these are two distinct problems to solve in the information seeking process: establishing the user’s information need (query elaboration) and retrieving relevant information (information retrieval).

When open-domain search engines (i.e., web search engines) went mainstream in the late 1990s, they did so by glossing over the problem of query elaboration and focusing almost entirely on information retrieval. More precisely, they pushed query elaboration onto users, requiring them to provide reasonable queries and leaving the search engine to infer information needs from those queries. In recent years, there has been more explicit support for query elaboration, most notably in the form of type-ahead query suggestions (e.g., Google Instant). There have also been a variety of efforts to offer related queries as refinements.

But even with such support, query elaboration typically yields an informal, free-text string. All vocabularies have their flaws, but search engines compound the inherent imprecision of language by not even trying to guide users to a common standard. At best, query suggestion nudges users towards more popular–and hopefully more effective–queries.

In contrast, consider closed-domain search engines that operate on curated collections, e.g., the catalog search for an ecommerce site. These search engines often provide users with the opportunity to express precise queries, e.g., black digital cameras for under $250. Moreover, well-designed sites offer users faceted search interfaces that support progressive query elaboration through guided refinements.
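To make this concrete, here is a minimal sketch of how such a structured query might look against a curated catalog. The catalog records and field names are invented for illustration; a real ecommerce engine would of course have its own schema and a proper index rather than a linear scan.

    # Toy catalog with curated metadata (records invented for illustration)
    catalog = [
        {"title": "Acme X100", "type": "digital camera", "color": "black",  "price": 199.99},
        {"title": "Acme X200", "type": "digital camera", "color": "silver", "price": 249.99},
        {"title": "Acme Z5",   "type": "camcorder",      "color": "black",  "price": 229.99},
    ]

    # "black digital cameras for under $250" as an explicit, unambiguous structured query
    structured_query = {"type": "digital camera", "color": "black", "max_price": 250.00}

    def matches(item, query):
        return (item["type"] == query["type"]
                and item["color"] == query["color"]
                and item["price"] < query["max_price"])

    results = [item for item in catalog if matches(item, structured_query)]
    print(results)  # only the black Acme X100 qualifies

The point is not the code but the contract: every constraint in the query maps to curated metadata, so there is no ambiguity about what the user asked for.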

Many (though not all) closed-domain search engines have an advantage over their open-domain counterparts: they can rely on manually curated metadata. The scale and heterogeneity of the open web defies human curation. Perhaps we’ll reach a point when automatic information extraction offers quality competitive with curation, but we’re not there yet. Indeed, the lack of good, automatically generated metadata has been cited as the top challenge facing those who would implement faceted search for the open web.

What can we do in the meantime? Here is a simple idea: use a closed-domain search engine to guide users to precise queries, and then apply the resulting queries to the open web. In other words, mash up the closed and open collections.

Of course, this is easier said than done. It is not at all clear if or how we can apply a query like “black digital cameras for under $250” to a collection that is not annotated with the necessary metadata. But we can certainly try. And our ability to perform information retrieval from structured queries will improve over time–in fact, it may even improve more quickly if we can start to assume that users are being guided to precise, unambiguous queries.
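As a crude illustration of what “trying” might look like, here is a sketch (again with invented field names) that simply flattens the structured query into a free-text web query, discarding the constraints an unannotated collection cannot enforce:

    structured_query = {"type": "digital camera", "color": "black", "max_price": 250.00}

    def to_web_query(query):
        # Flatten facet values into keywords and render the price constraint as text,
        # hoping the open-web engine can infer the rest from the phrasing.
        return f'{query["color"]} {query["type"]} under ${query["max_price"]:.0f}'

    print(to_web_query(structured_query))  # black digital camera under $250

Better translations are certainly possible, but even this naive one preserves the key property: the query the user committed to is precise, even if the retrieval against it is not.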

Even though result quality would be variable, such an approach would at least eliminate a source of uncertainty in the information seeking process: the user would be certain of having a query that accurately represented his or her information need. That is no small victory!

I fear, however, that users might not respond positively to such an interface. Given the certainty that a query accurately represents his or her information need, a user is likely to have higher expectations of result quality than without that certainty. Retrieval errors are harder to forgive when the query elaboration process eliminates almost any chance of misunderstanding. Even if the results were more accurate, they might not be accurate enough to satisfy user expectations.

As an HCIR evangelist, I am saddened by this prospect. Reducing uncertainty in any part of the information seeking process seems like it should always be a good thing for the user. I’m curious to hear what folks here think of this idea.

Pluralistic Ignorance and Bayesian Truth Serum

Last week, I had the pleasure of talking with CMU professor George Loewenstein, one of the top researchers in the area of behavioral economics. I mentioned my idea of using prediction markets to address the weaknesses of online review systems and reputation systems, and he offered two insightful pointers.

The first pointer was to the notion of pluralistic ignorance. As summarized on Wikipedia:

In social psychology, pluralistic ignorance, a term coined by Daniel Katz and Floyd H. Allport in 1931, describes “a situation where a majority of group members privately reject a norm, but assume (incorrectly) that most others accept it…It is, in Krech and Crutchfield’s (1948, pp. 388–89) words, the situation where ‘no one believes, but everyone thinks that everyone believes'”. This, in turn, provides support for a norm that may be, in fact, disliked by most people.

It had not occurred to me that pluralistic ignorance could wreak havoc on the prediction market approach I proposed. Specifically, there is a risk that, even though the majority of participants in the market hold a particular opinion, they suppress their individual opinions and instead vote based on mistaken assumptions about the collective opinion of others. Ironically, these participants are pursuing an optimal strategy, given their pluralistic ignorance. Yet the results of such a market would not necessarily reflect the true collective opinion of participants. Clearly there is a need to incorporate people’s true opinions into the equation, and not just their beliefs about others’ opinions.

Which leads me to the second resource to which Loewenstein pointed me: a paper by fellow behavioral economist and MIT professor Drazen Prelec entitled “A Bayesian Truth Serum for Subjective Data”. As per the abstract:

Subjective judgments, an essential information source for science and policy, are problematic because there are no public criteria for assessing judgmental truthfulness. I present a scoring method for eliciting truthful subjective data in situations where objective truth is unknowable. The method assigns high scores not to the most common answers but to the answers that are more common than collectively predicted, with predictions drawn from the same population. This simple adjustment in the scoring criterion removes all bias in favor of consensus: Truthful answers maximize expected score even for respondents who believe that their answer represents a minority view.
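To make the scoring idea concrete, here is a minimal sketch of the scoring rule as I read it from the paper (the notation and variable names are mine): each respondent supplies an answer and a prediction of how the population will answer; the information score rewards answers that turn out to be more common than the (geometric-mean) collective prediction, and the prediction score rewards accurate predictions of the empirical distribution.

    from math import log

    def bts_scores(answers, predictions, alpha=1.0):
        """Sketch of Bayesian Truth Serum scoring (my reading of Prelec's rule).

        answers:     answers[r] is the index of respondent r's chosen answer
        predictions: predictions[r][k] is respondent r's predicted fraction of
                     the population choosing answer k (all values must be > 0)
        """
        n = len(answers)
        m = len(predictions[0])

        # Empirical answer frequencies, floored to avoid log(0)
        x_bar = [max(sum(1 for a in answers if a == k) / n, 1e-9) for k in range(m)]

        # Log of the geometric mean of the predicted frequencies
        log_y_bar = [sum(log(predictions[r][k]) for r in range(n)) / n for k in range(m)]

        scores = []
        for r in range(n):
            # Information score: is my answer more common than collectively predicted?
            info = log(x_bar[answers[r]]) - log_y_bar[answers[r]]
            # Prediction score: how well did I predict the empirical distribution?
            pred = sum(x_bar[k] * (log(predictions[r][k]) - log(x_bar[k])) for k in range(m))
            scores.append(info + alpha * pred)
        return scores

    # Toy example: three respondents answering a yes/no question (0 = no, 1 = yes)
    print(bts_scores(answers=[1, 1, 0],
                     predictions=[[0.6, 0.4], [0.5, 0.5], [0.8, 0.2]]))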

Most of the paper is devoted to proving, subject to a few assumptions, that the optimal strategy for players in this game is to tell what they believe to be the truth–that is, the truth-telling strategy is the optimal Bayesian Nash equilibrium for all players.

The assumptions are as follows:

  1. The sample of respondents is sufficiently large that a single answer cannot appreciably affect the overall results.
  2. Respondents believe that others sharing their opinion will draw the same inferences about population frequencies.
  3. All players assume that other players are responding truthfully–which follows if they are rational players.

Prelec sums up his results as follows:

In the absence of reality checks, it is tempting to grant special status to the prevailing consensus. The benefit of explicit scoring is precisely to counteract informal pressures to agree (or perhaps to “stand out” and disagree). Indeed, the mere existence of a truth-inducing scoring system provides methodological reassurance for social science, showing that subjective data can, if needed, be elicited by means of a process that is neither faith-based (“all answers are equally good”) nor biased against the exceptional view.

Unfortunately, I don’t think that Prelec’s assumptions hold for most online review systems and reputation systems. In typical applications (e.g., product and service reviews on sites like Amazon and Yelp), the input is too sparse to even approximate the first assumption, and the other two assumptions probably ascribe too much rationality to the participants.

Still, Bayesian truth serum is a step in the right direction, and perhaps the approach (or some simple variant of it) applies to a useful subset of real-world prediction scenarios. Certainly it gives me hope that we’ll succeed in the quest to mine “subjective truth” from crowds.

LinkedIn Signal = Exploratory Search for Twitter

I like Twitter. Yes, I know that a lot of its content is noise. But I’ve found Twitter to be a useful professional tool for both publishing and consuming information. Publishing to Twitter is the easy part: I publish links to my blog posts and occasionally engage in public conversations.

Consuming information from Twitter is more of a challenge. I follow 100 people, which is about the limit of my attention budget. I use saved searches to track long-term interests (much as I use web and news alerts), and I perform ad hoc searches when I am interested in finding out what people are saying about a particular topic.

But Twitter search is not a great fit for analysis or exploration–unless you count trending topics as analysis. Originally, the search results were simply the tweets that matched the query, ordered by recency. The current system sometimes promotes a few “top tweets” to the top of the results. Still, if you’d like to get a summary view, slice and dice the results, or perform any other sort of HCIR task, you’re out of luck.

Until now.

The LinkedIn Search, Network, and Analytics team–the same folks that built LinkedIn’s faceted search system and developed open-source search tools Zoie and Bobo–just introduced a service called Signal that is squarely aimed at folks like me who use Twitter as a professional tool. It is still in its infancy (in private beta, in fact), but I think it has the potential to dramatically change how people like me use Twitter. You can learn more about its architecture and implementation details here.

Signal joins the often cacophonous Twitter stream to the high-quality structured data that LinkedIn knows about its own users. For example, when I post a tweet, LinkedIn knows that I am in the software industry, work at Google, and live in New York. LinkedIn can only make this connection for people who include Twitter ids in their LinkedIn profiles, but that’s a substantial and growing population.

Signal then lets you use this structured information to satisfy analytic and exploratory information needs. For example, I can see which companies’ employees are tweeting about software patents (top two are Google and Red Hat).

Or compare what Microsoft employees are saying about Android…

…to what Google employees are saying about Android.
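Under the hood, this kind of analysis amounts to joining tweets with profile attributes and counting by a facet. Here is a rough sketch with invented records and field names; the real system obviously operates at a very different scale, on top of a proper faceted index rather than a Python loop.

    from collections import Counter

    # Hypothetical LinkedIn profile attributes keyed by Twitter handle
    profiles = {
        "alice": {"company": "Google",  "industry": "Software"},
        "bob":   {"company": "Red Hat", "industry": "Software"},
    }

    # Hypothetical tweets
    tweets = [
        {"author": "alice", "text": "Thoughts on software patents..."},
        {"author": "bob",   "text": "More on software patents today"},
        {"author": "alice", "text": "Lunch!"},
    ]

    def facet_counts(tweets, profiles, query, facet):
        """Count tweets matching a query, grouped by a profile facet (e.g., company)."""
        counts = Counter()
        for tweet in tweets:
            profile = profiles.get(tweet["author"])
            if profile and query.lower() in tweet["text"].lower():
                counts[profile[facet]] += 1
        return counts

    print(facet_counts(tweets, profiles, "software patents", "company"))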

As you can see on the right-hand side, Signal also mines shared links to identify popular ones relative to a given search–and allows you to see who has shared a particular link. This functionality is similar to Topsy, but with the advantage of allowing structured searches. Like Topsy, it wrangles the mass of retweeted links into a useful and user-friendly summary.

Signal is still very much in beta. An amusing bug that I encountered earlier today was that, due to some legacy issues in how LinkedIn standardized institution names, the system decided that I was an alumnus of the Longy School of Music rather than of MIT. Fortunately, that’s fixed now (thanks, John!)–I love karaoke, but I’m not ready to quit my day job!

Also, Signal only exposes a handful of LinkedIn’s facets, which limits the breadth of analysis and exploration. I’d love to see it add a past company facet, making it possible to drill down into what a company’s ex-employees are saying about a particular topic (e.g., their ex-employer).

Finally, while Signal offers Twitter hashtags as a facet, these are hardly a substitute for a topic facet. To provide a useful topic facet, LinkedIn needs to implement some kind of concept extraction (something I’d also love to see for their regular people search). This is a challenging information extraction problem, especially for the open web, but I also know from experience that it is tractable within a domain. Given LinkedIn’s professional focus, I believe this is a problem they can and should tackle.

Of course, LinkedIn also needs to convince more of its users to join their LinkedIn accounts to their Twitter accounts, since that connection is Signal’s input source. But I suspect it’s mostly a matter of time and education–and hopefully the buzz around Signal will help raise awareness.

All in all, I see LinkedIn Signal as a great innovation and a big step forward for exploratory search and for Twitter. Congratulations to John Wang, Igor Perisic, and the rest of the LinkedIn search team on the launch!