Last week, I had the pleasure of talking with CMU professor George Loewenstein, one of the top researchers in the area of behavioral economics. I mentioned my idea of using prediction markets to address the weaknesses of online review systems and reputation systems, and he offered two insightful pointers.
The first pointer was to the notion of pluralistic ignorance. As summarized on Wikipedia:
In social psychology, pluralistic ignorance, a term coined by Daniel Katz and Floyd H. Allport in 1931, describes “a situation where a majority of group members privately reject a norm, but assume (incorrectly) that most others accept it…It is, in Krech and Crutchfield’s (1948, pp. 388–89) words, the situation where ‘no one believes, but everyone thinks that everyone believes'”. This, in turn, provides support for a norm that may be, in fact, disliked by most people.
It had not occurred to me that pluralistic ignorance could wreak havoc on the prediction market approach I proposed. Specifically, there is a risk that, even though the majority of participants in the market hold a particular opinion, they suppress their individual opinions and instead vote based on mistaken assumptions about the collective opinion of others. Ironically, these participants are pursuing an optimal strategy, given their pluralistic ignorance. Yet the results of such a market would not necessarily reflect the true collective opinion of participants. Clearly there is a need to incorporate people’s true opinions into the equation, and not just their beliefs about others’ opinions.
That leads me to the second resource Loewenstein pointed me to: a paper by fellow behavioral economist and MIT professor Drazen Prelec entitled “A Bayesian Truth Serum for Subjective Data”. As per the abstract:
Subjective judgments, an essential information source for science and policy, are problematic because there are no public criteria for assessing judgmental truthfulness. I present a scoring method for eliciting truthful subjective data in situations where objective truth is unknowable. The method assigns high scores not to the most common answers but to the answers that are more common than collectively predicted, with predictions drawn from the same population. This simple adjustment in the scoring criterion removes all bias in favor of consensus: Truthful answers maximize expected score even for respondents who believe that their answer represents a minority view.
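To make the scoring idea concrete, here is a minimal Python sketch of a score in the spirit of the abstract. Each respondent gives an answer plus a prediction of how the population will answer. The score is an information term, the log of an answer's actual frequency over the geometric mean of its predicted frequencies, plus a weighted term that rewards accurate predictions. The function name `bts_scores`, the weight `alpha`, the `eps` guard, and the sample data are all assumptions made for illustration. The sketch also includes each respondent's own answer in the averages, which is a simplification that only makes sense for large samples.

```python
import math
from collections import Counter

def bts_scores(answers, predictions, alpha=1.0, eps=1e-9):
    """Illustrative truth-serum-style scores (a sketch, not a reference implementation).

    answers:     chosen option index for each respondent
    predictions: for each respondent, predicted population share of each option
    alpha:       weight on the prediction-accuracy term (assumed parameter)
    """
    n = len(answers)
    k = len(predictions[0])
    counts = Counter(answers)
    # Actual share of respondents endorsing each option.
    x_bar = [counts.get(j, 0) / n for j in range(k)]
    # Log of the geometric mean of predicted shares (mean of the logs).
    log_y_bar = [
        sum(math.log(max(p[j], eps)) for p in predictions) / n
        for j in range(k)
    ]
    scores = []
    for a, p in zip(answers, predictions):
        # Information term: is my answer more common than collectively predicted?
        info = math.log(x_bar[a]) - log_y_bar[a]
        # Prediction term: penalty for misjudging the actual shares.
        pred = sum(
            x_bar[j] * math.log(max(p[j], eps) / x_bar[j])
            for j in range(k)
            if x_bar[j] > 0
        )
        scores.append(info + alpha * pred)
    return scores

# Hypothetical pluralistic-ignorance case: 6 of 10 privately reject a norm
# (option 0), yet everyone predicts that only 30% reject it.
answers = [0] * 6 + [1] * 4
predictions = [[0.3, 0.7]] * 10
scores = bts_scores(answers, predictions)
print(scores[0] > scores[9])  # rejecters outscore accepters
```

In this made-up case the honest "reject" answer earns the higher score precisely because it turns out to be more common than the group predicted, which is the adjustment the abstract describes.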
Most of the paper is devoted to proving, subject to a few assumptions, that the optimal strategy for players in this game is to tell what they believe to be the truth; that is, truth-telling is the optimal Bayesian Nash equilibrium for all players.
The assumptions are as follows:
- The sample of respondents is sufficiently large that a single answer cannot appreciably affect the overall results.
- Respondents believe that others sharing their opinion will draw the same inferences about population frequencies.
- All players assume that other players are responding truthfully, which follows if they are rational players.
Prelec sums up his results as follows:
In the absence of reality checks, it is tempting to grant special status to the prevailing consensus. The benefit of explicit scoring is precisely to counteract informal pressures to agree (or perhaps to “stand out” and disagree). Indeed, the mere existence of a truth-inducing scoring system provides methodological reassurance for social science, showing that subjective data can, if needed, be elicited by means of a process that is neither faith-based (“all answers are equally good”) nor biased against the exceptional view.
Unfortunately, I don’t think that Prelec’s assumptions hold for most online review systems and reputation systems. In typical applications (e.g., product and service reviews on sites like Amazon and Yelp), the input is too sparse to even approximate the first assumption, and the other two assumptions probably ascribe too much rationality to the participants.
Still, Bayesian truth serum is a step in the right direction, and perhaps the approach (or some simple variant of it) applies to a useful subset of real-world prediction scenarios. Certainly it gives me hope that we’ll succeed in the quest to mine “subjective truth” from crowds.