The Noisy Channel

 

An Information Cascade

November 17th, 2010 · 15 Comments · General

I’ve been reading Networks, Crowds, and Markets, a great textbook by David Easley and Jon Kleinberg. I’m very grateful to Cambridge University Press for surprising me with an unsolicited review copy. I’m more than halfway through its 700+ pages. Much of the material is familiar in this “interdisciplinary look at economics, sociology, computing and information science, and applied mathematics to understand networks and behavior”. But I’m delighted by much that is new to me, including a particularly elegant description of an information cascade.

I excerpt the following example from section 16.2, which the authors in turn borrow from Lisa Anderson and Charles Holt:

The experimenter puts an urn at the front of the room with three marbles in it; she announces that there is a 50% chance that the urn contains two red marbles and one blue marble, and a 50% chance that the urn contains two blue marbles and one red marble…one by one, each student comes to the front of the room and draws a marble from the urn; he looks at the color and then places it back in the urn without showing it to the rest of the class. The student then guesses whether the urn is majority-red or majority-blue and publicly announces this guess to the class.

Let’s simulate how a set of rational students would perform in this experiment.

The first student has it easy: if he selects a blue marble, he guesses blue; if he selects a red marble, he guesses red. Either way, his guess publicly discloses the first marble’s color.

Thus the second student knows exactly the colors of the first two selected marbles. If he selects the same color as the first student, he will make the same guess. If, however, the second student selects the other color, he has no reason to prefer one color over the other. Let’s assume that, when the odds are 50/50, an indifferent student breaks symmetry by guessing the color in his hand. That way, we guarantee that the second student discloses the color of the marble he selects.

Things get interesting with the third student’s selection. What happens if the first two students have both guessed red, but the third student selects a blue marble? Rationally, the third student will guess red, since he knows that two of the first three selected marbles were red. In fact, if the first two students select red marbles, *every* subsequent student will ignore his own selection and guess red. Of course, analogous reasoning applies if we reverse the colors.
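The third student's reasoning can be checked with a few lines of Bayes' rule. What follows is a minimal Python sketch, not something from the book: it assumes the 50/50 prior and draws with replacement described above, and the function name `posterior_majority_red` is purely illustrative.

```python
from fractions import Fraction


def posterior_majority_red(reds: int, blues: int) -> Fraction:
    """P(urn is majority-red | observed draws), assuming a 50/50 prior
    and draws with replacement from a three-marble urn."""
    p = Fraction(2, 3)  # chance of drawing the majority color
    like_red_urn = p ** reds * (1 - p) ** blues    # likelihood if majority-red
    like_blue_urn = (1 - p) ** reds * p ** blues   # likelihood if majority-blue
    # The equal priors cancel, leaving a ratio of likelihoods.
    return like_red_urn / (like_red_urn + like_blue_urn)


# Third student: two disclosed reds plus his own blue marble.
print(posterior_majority_red(2, 1))
# Second student after one red and one blue: no reason to prefer either color.
print(posterior_majority_red(1, 1))
```

Any posterior above 1/2 means that guessing red is the rational choice, even for a student holding a blue marble.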

Generalizing from this case, we can see that the sequence of guesses locks in on a single color as soon as that color leads the other by two guesses, as it does when the first two students agree. Until then, every guess discloses the marble drawn. I leave it as an exercise to the reader to determine that, if the urn is majority-red, there is a 4/5 probability that the sequence will converge to red and a 1/5 probability that it will converge to blue.

A 1/5 probability of arriving at the wrong answer may not seem so bad. But imagine if you could see the actual marbles sampled and not just the guesses (i.e., each student provides an independent signal). The law of large numbers kicks in quickly, and the probability of the sample majority color being different from the true majority converges to 0.
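For readers who would rather simulate than solve the exercise, here is a rough Python sketch under the assumptions stated above: the urn is majority-red, an indifferent student guesses the color in his hand, and guesses stop carrying information once one color leads by two disclosed marbles. The function names and trial counts are my own illustrative choices.

```python
import random
from math import comb

P_RED = 2 / 3  # chance of drawing red from a majority-red urn


def cascade_outcome(rng: random.Random) -> str:
    """Run students until the guesses lock in; return the locked-in color."""
    lead = 0  # disclosed red marbles minus disclosed blue marbles
    while abs(lead) < 2:
        # While the lead is below two, each guess reveals the marble drawn.
        lead += 1 if rng.random() < P_RED else -1
    return "red" if lead > 0 else "blue"


def cascade_error_rate(trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often the cascade converges to the wrong color (blue)."""
    rng = random.Random(seed)
    wrong = sum(cascade_outcome(rng) == "blue" for _ in range(trials))
    return wrong / trials


def independent_error_rate(n: int) -> float:
    """Exact P(blue is the sample majority) among n visible draws, n odd."""
    # k counts red draws; blue wins the sample when reds are at most n // 2.
    return sum(
        comb(n, k) * P_RED ** k * (1 - P_RED) ** (n - k)
        for k in range(n // 2 + 1)
    )


print("cascade error rate:", cascade_error_rate())
for n in (1, 11, 51, 101):
    print(f"independent majority of {n} draws:", independent_error_rate(n))
```

The cascade's error rate stays put no matter how many students follow, while the error of the visible-marble majority keeps shrinking as the class grows.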

This example of an information cascade is unrealistically simple, but is eerily suggestive of the way many sequential decision processes work. I hope we all see it as a cautionary tale. The wisdom of the crowd breaks down when we throw away the independent signals of its participants.

15 responses so far ↓

  • 1 Neal // Nov 17, 2010 at 6:25 pm

    This post about info cascades also reminded me of the Monty Hall Problem – and the way people find it difficult to handle probability even when others aren’t involved!

    http://en.wikipedia.org/wiki/Monty_Hall_problem

  • 2 Daniel Tunkelang // Nov 17, 2010 at 6:56 pm

    I love the Monty Hall problem.

    I’ve also come up with what I call the Vanna White variant (the Wikipedia entry calls it “Monty Fall” or “Ignorant Monty” and offers citations). In my rendition, Monty Hall calls in sick, so Vanna takes over. Unfortunately, she doesn’t actually know which door has the car, so she may accidentally reveal it when she opens a door. In that case, they cancel the game and show a re-run instead.

    In this variant, there is no advantage to switching.

    But yes, it’s amazing how many people cannot accept the 2/3 answer for the original problem.

  • 3 Panos Ipeirotis // Nov 22, 2010 at 12:07 am

    Lovely example.

    Btw, I have been wondering often how much we need independence for the wisdom of the crowds.

    The example that you provided illustrates clearly that lack of independence can be bad.

    However, there are cases where lack of independence is fine: Consider prediction markets. There is a very clear lack of independence (you can see the aggregate opinion of everyone else) but the markets end up performing great.

    There are even examples where you actually *need* people to reveal personal information (again, lack of independence) in order for the crowd to arrive at the correct outcome. With complete independence nobody can arrive at the correct outcome.

    There seems to be a need for a balance between spreading information and influencing others, which I still cannot grasp. Maybe it is in the book?

  • 4 Daniel Tunkelang // Nov 23, 2010 at 1:43 am

    Panos, thank you — and thanks for raising a really interesting question. I don’t know that the book answers it directly (wait until I finish reading it!), but I certainly see trade-offs in real life.

    For example, I’ve seen interviews take place with complete independence among interviewers vs. with communication between successive interviewers. The former avoid order bias, but the latter may be more efficient at extracting signal.

  • 5 jeremy // Nov 29, 2010 at 12:02 pm

    To beat one (of my many :-) usual drums.. one of the main reasons we started developing explicitly collaborative exploratory search was in reaction to the hype over the independent collectivism of “wisdom of crowds”. See:

    http://irgupf.com/2009/04/21/dagstuhl-seminar-on-content-based-retrieval/

    Wisdom of crowds is essentially defined as independent opinions of participants. That’s one of Surowiecki’s 4 core characteristics:

    http://en.wikipedia.org/wiki/Wisdom_of_Crowds#Four_elements_required_to_form_a_wise_crowd

    Back in 2006, I believed, as I still do now, that if you instead explicitly create dependence between the actors, you’ll get a better (or at least qualitatively different) result. That’s collaborative search. And, just like you mention in comment #4, with the communication between successive interviewers, one of the advantages of explicit collaboration is that it is much more efficient (and effective) at extracting search signal.

  • 6 jeremy // Nov 29, 2010 at 12:03 pm

    ..and what one calls “bias”, another might call “personalization”.

  • 7 Daniel Tunkelang // Nov 30, 2010 at 11:13 pm

    Fair points. Panos also wrote a nice post riffing on this one:

    http://behind-the-enemy-lines.blogspot.com/2010/11/wisdom-of-crowds-when-do-we-need.html

  • 8 Panos Ipeirotis // Nov 30, 2010 at 11:46 pm

    In the HCOMP2010 paper “Exploring Iterative and Parallel Human Computation Processes” http://glittle.org/Papers/HCOMP2010.pdf, Greg Little et al. have studied the idea of parallelism vs iteration (essentially to “build on each other, or to generate independently”).

    From their (empirical) results, independence has higher variance. This means higher error rate, but also higher chances of generating something better. Groups tend to generate more consistent outcomes, but at the expense of decreasing the extremes (both on top and on bottom).

  • 9 jeremy // Dec 1, 2010 at 10:25 am

    @Panos: I suppose it would depend on the task, too. The HCOMP paper you mention explores writing, brainstorming, and transcription. I’m an IR guy, and am mostly interested in recall-oriented tasks. Some of my experiments have found that people find more (and more unique) relevant documents when building on each other, as opposed to working alone in parallel. Even after you take into account duplication of effort.

    What I like about the iteration/building on each other approaches is that there is a wider variety of available ways in which that iteration can be structured, i.e. what my coauthor Gene Golovchinsky would call “roles” in the collaboratively iterative process, and in the underlying algorithms to support those roles.

    Take a look at Colum Foley’s dissertation:

    http://www.computing.dcu.ie/~cfoley/cfoley-PhD_thesis.pdf

    In his work, he distinguishes two different types of relevance feedback that are possible under an iterative team regime: “collaborative” relevance feedback, in which the algorithm tries to mutually amplify and fuse together the signal coming from each participant, and “complementary” relevance feedback, in which the algorithm tries to use the relevance signal coming from each member of the group to push those members further apart from each other, but in new, still potentially relevant directions.

    What I am trying to say is that I don’t necessarily think that iteration/non-independence has to have lower variation. I think it depends on the underlying algorithm that you use to mediate that non-independent collaboration.

    And what I like about collab search is that you have that freedom to explore various approaches. With people working independently, each basically has to follow the same path, each has to be optimized in the same way. But with iteration/collaboration, you can give each member a different way of optimizing, and thus lead to solutions that simply are not possible with independent work.

  • 10 Panos Ipeirotis // Dec 1, 2010 at 10:56 am

    For me, collaborative search *is* wisdom of the crowds. But less so compared to PageRank-style and clickthrough-based approaches. And more so compared to everyone interacting independently with an IR system without influencing each other at all.

    For IR, I suspect that the collaborative approach will result in people finding more documents more easily, but in the end most documents will be about a specific aspect of the query, which is reinforced by the early results. However, you will lack the diversity that you would have if you had let people search alone and independently. The fact that you need an algorithm to “push apart” the different groups illustrates that.

    We have been running such recall-oriented IR tasks for a year now, in order to build collections of web pages about “offensive” niche topics (e.g., hate speech, gambling, violence, etc.). Showing the current results to others tends to decrease the diversity (e.g., people see racist jokes in the results and search for more pages like that). You need to have controls like “we have enough of these pages, fetch me something else”.

    “With people working independently, each basically has to follow the same path, each has to be optimized in the same way.”: I would rephrase that. Yes, people will spend more time optimizing in their own way. But they will also follow their own paths.

  • 11 jeremy // Dec 1, 2010 at 12:06 pm

    “For IR, I suspect that the collaborative approach will result in people finding more documents more easily, but in the end most documents will be about a specific aspect of the query, which is reinforced by the early results.”

    It doesn’t have to be like that. That’s a function not of collaboration, but of the algorithm used to mediate collaboration.

    Let me try and give a quick example:

    Suppose a team has an information need, and suppose there are 3 aspects to that need. The distribution over those aspects (% of relevant documents that belong to each aspect) is (let’s say) 70%, 25%, and 5%.

    If you’re doing collaborative amplification of relevance feedback, then I agree.. that 70% aspect will receive all the attention (because the early found documents will likely be from that 70% set) and diversity will be lower.

    Similarly, if you let each person follow their own path, independently, then documents from all three aspects will be found. But they will be found, let’s hand-wavingly assume, with a probability of 70%, 25%, and 5%, respectively. I.e., the more common aspect will likely still get the most attention.

    Now, however, with collab search, you can explicitly design a “Foley”-ian complementary relevance feedback algorithm that detects possible topical aspects, sees that one or two of the team members are already working on the 70% aspect, and explicitly drives the other team members toward the 25% and 5% aspects.

    That’s not wisdom of the crowds. Wisdom of the crowds is about each team member repeating the efforts of those who went before, i.e. looking at the same jar of jelly beans and making the same guess about how many there are. “Complementary” collab search is about pushing the team members away from each other, to *ensure* that they follow all possible relevant aspects. Wisdom of crowds is about aggregation of repeated judgments about the same items, over and over again. Complementary collab search is about de-aggregation, to make sure that there aren’t repeated judgments, and every aspect is explored.

    Working alone, people might find all the aspects, but there isn’t the same guarantee. Working traditionally wisdom-of-crowdsy together-but-independent, you also do not have that guarantee.

  • 12 Panos Ipeirotis // Dec 1, 2010 at 10:14 pm

    I see. So, in a sense, you are trying to push people to not be correlated. But you are not satisfied with simple independence. Instead, you strive for “anti-correlation”. Indeed, better.

  • 13 jeremy // Dec 2, 2010 at 10:47 am

    The way I would say it is that you still want people’s activities to be “related” to each other. What I do influences you, and vice versa, in real time. But the influence it has is not one of showing me what you already found or vice versa. The influence is one of pushing one or the other of us to continue exploring. But explore in new places, rather than in places your team member has already looked. (“collaborative exploratory search”)

    The challenge is that correlation/signal amplification/traditional feedback is a fairly well known, straightforward process. But there are hundreds of ways of doing anti-correlation. Finding ones that work, and that generalize well, is an interesting research area.

    See also: http://palblog.fxpal.com/?p=1494

    P.s.: Daniel, sorry for taking over your comment thread. I just wanted to point out that there are many different types of non-independence.

  • 14 Daniel Tunkelang // Dec 2, 2010 at 10:56 am

    No need to apologize! I’m always happy when I can out-source content creation to readers! :-)

    And I think the anti-correlation angle is interesting. Will have to follow up on the research. But indeed your argument reinforces my example of sequential interviewers: independence does not optimize for the total amount of information extracted, especially in cases where the set of interviewers is homogeneous.

  • 15 Information Cascades, Revisited // Jun 12, 2012 at 6:54 am

    [...] couple of years ago, I blogged about an information cascade problem I’d read about in David Easley and Jon Kleinberg‘s textbook on Networks, [...]
