The Noisy Channel

 

CIKM 2011 Industry Event: Ilya Segalovich on Improving Search Quality at Yandex

November 27th, 2011 by Daniel Tunkelang

This post is the last in a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

The final talk of the CIKM 2011 Industry Event came from Yandex co-founder and CTO Ilya Segalovich, on “Improving Search Quality at Yandex: Current Challenges and Solutions”.

Yandex is the world’s #5 search engine. It dominates the Russian search market, where it has over 64% market share. Ilya focused on three challenges facing Yandex: result diversification, recency-specific ranking, and cross-lingual search.

For result diversification, Ilya focused on queries containing entities without any additional indicators of intent. He asserted that entities offer a strong but incomplete signal of query intent, and in particular that entities often call for suggested query reformulations. The first step in processing such a query is entity categorization. Ilya said that Yandex achieved almost 90% precision using machine learning, and over 95% precision by incorporating manually tuned heuristics. The second step is enumerating possible search intents for the identified category in order to optimize for intent-aware expected reciprocal rank. By diversifying entity queries, Yandex reduced abandonment on popular queries, increased click-through rates, and was able to highlight possible intents in result snippets.
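To make the optimization target concrete, here is a minimal sketch of intent-aware expected reciprocal rank (ERR-IA), assuming graded per-intent relevance judgments and intent probabilities are available; the data layout and function names are illustrative, not Yandex's implementation.

```python
def err(grades, max_grade=4):
    """Expected reciprocal rank for a single intent (Chapelle et al., 2009)."""
    score, p_user_still_looking = 0.0, 1.0
    for rank, grade in enumerate(grades, start=1):
        r = (2 ** grade - 1) / 2 ** max_grade   # probability this result satisfies the user
        score += p_user_still_looking * r / rank
        p_user_still_looking *= 1.0 - r
    return score

def err_ia(intent_probs, grades_by_intent):
    """Intent-aware ERR: per-intent ERR weighted by P(intent | query)."""
    return sum(p * err(grades_by_intent[intent]) for intent, p in intent_probs.items())

# Hypothetical two-intent entity query, graded 0-4 per intent for the top three results.
intent_probs = {"official_site": 0.7, "news": 0.3}
grades_by_intent = {"official_site": [4, 0, 1], "news": [0, 3, 2]}
print(err_ia(intent_probs, grades_by_intent))
```

Diversifying the ranking then means choosing the result order that maximizes this intent-weighted score rather than the score for the single most likely intent.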

Ilya then talked about the problem of balancing recency and relevance in handling queries about current events. He sees recency ranking as a diversification problem, since a desire for recent content is a kind of query intent. A challenge in managing recency-specific ranking is to predict the recency sensitivity of the user for a given query. Yandex considers factors such as the fraction of results found that are at most 3 days old, the number of news results, spikes in the query stream, lexical cues (e.g., searches for “explosion” or “fire”), and Twitter trending topics. He also referred to a WWW 2006 paper he co-authored on extracting news-related queries from web query logs. These efforts led to measurable improvements in click-based metrics of user happiness.
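As an illustration only (the signals are the ones mentioned in the talk, but the logistic form and the weights are my own assumptions), a recency-sensitivity predictor might combine those factors roughly as follows.

```python
import math

def recency_sensitivity(features, weights, bias=-2.0):
    """Toy logistic model scoring how recency-sensitive a query is.

    The feature names come from the talk; the weights, bias, and linear form
    are illustrative assumptions, not Yandex's model.
    """
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

features = {
    "frac_results_under_3_days": 0.45,   # share of matching documents at most 3 days old
    "num_news_results": 12,              # hits in the news vertical
    "query_stream_spike": 1.0,           # 1 if query volume is spiking
    "has_breaking_lexical_cue": 1.0,     # e.g., "explosion", "fire"
    "twitter_trending": 0.0,             # 1 if the topic is trending on Twitter
}
weights = {
    "frac_results_under_3_days": 2.0,
    "num_news_results": 0.1,
    "query_stream_spike": 1.5,
    "has_breaking_lexical_cue": 1.0,
    "twitter_trending": 0.5,
}
print(recency_sensitivity(features, weights))
```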

Ilya talked about a variety of efforts to support cross-lingual search. A significant fraction (about 15%) of the queries that Russian users enter are non-Russian, but many of those users still prefer Russian-language results. For example, a search for a company name should return that company’s Russian-language home page if one is available. Yandex implements language personalization by learning a user’s language knowledge and using it as a factor in relevance computation. Yandex also uses machine translation to serve results for Russian-language queries when there are no relevant Russian-language results.

Ilya concluded by pitching the efforts that Yandex is making to participate in and support the broader information retrieval community, including running (and releasing data for) a relevance prediction challenge. It’s great to see a reminder that there is more to web search than Google vs. Bing, and refreshing to see how much Yandex shares its methodology and results with the IR community.


CIKM 2011 Industry Event: Vanja Josifovski on Toward Deep Understanding of User Behavior on the Web

November 27th, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

Those of you who attended the SIGIR 2009 Industry Track had the opportunity to hear Yahoo researcher Vanja Josifovski make an eloquent case for ad retrieval as a new frontier of information retrieval. At the CIKM 2011 Industry Event, Vanja delivered an equally compelling presentation entitled “Toward Deep Understanding of User Behavior: A Biased View of a Practitioner”.

Vanja first offered a vision in which the web of the future will be your life partner, delivering a life-long, pervasive, personalized experience. Everything will be personalized, and that personalization will pervade your entire online experience — from your laptop to your web-enabled toaster.

He then brought us back to the state of personalization today. For search personalization, the low entropy of query intent makes it difficult — or too risky — to significantly outperform the baseline of non-personalized search. In his view, the action today is in content recommendation and ad targeting, where there is high entropy of intent and lots of room for improvement over today’s crude techniques.

How do we achieve these improvements? We need more data, larger scale, and better methods for reasoning about data. In particular, Vanja noted that the data we have today — searches, page views, connections, messages, purchases — represents only the user’s thin observable state. In contrast, we lack data about the user’s internal state, e.g., whether the user is jet-lagged or worried about government debt. Vanja said that the only way to get more data is to motivate users by creating value for them with it — i.e., social is give to get.

Of course, we can’t talk about users’ hidden data without thinking about privacy. Vanja asserted that privacy is not dead, but that it’s in hibernation. So far, he argued, we’ve managed with a model of industry self-governance with relatively minor impact from data leaks — specifically as compared to the offline world. But he is apprehensive about the prospect of a major privacy breach inducing legislation that sets back personalization efforts for decades.

Vanja then talked about current personalization methods, including learning relationships among features, dimensionality reduction, and smoothing using external data. He argued that many of the models are mathematically very similar to one another, and that it is difficult to isolate the relative merits of the models from the other implementation details of the systems that use them.

Finally, Vanja touched on scale issues. He noted that the MapReduce framework imposes significant restrictions on algorithms used for personalization, and that we need the right abstractions for modeling in parallel environments.

Vanja concluded his talk by citing the role of CIKM as a conference in bringing together the communities that research deep user understanding, information retrieval, and databases. Given the exciting venue for next year’s conference, I’m sure we’ll continue to see CIKM play this role!

ps. My thanks to Jeff Dalton for live-blogging his notes.

 


CIKM 2011 Industry Event: Ed Chi on Model-Driven Research in Social Computing

November 25th, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

Given the extraordinary ascent of all things social in today’s online world, we could hardly neglect this theme at the CIKM 2011 Industry Event. We were lucky to have Ed Chi, who recently left the PARC Augmented Social Cognition Group to work on Google+, presenting “Model-Driven Research in Social Computing”.

Ed warned us at the beginning of the talk that his focus would be on work he’d done prior to joining Google. Nonetheless, he offered an interesting collection of public statistics about social activity associated with Google properties: 360M words per day being published on Blogger, 150 years of YouTube video being watched every day on Facebook, and 40M+ people using Google+. Regardless of how Google has fared in the competition for social networking mindshare, Google is clearly no stranger to online social behavior.

Ed then dove into recent research that he and colleagues have done on Twitter activity. Since all of the papers he discussed are available online, I will only touch on highlights here and encourage you to read the full papers.

Ed talked at some length about language-dependent behavior on Twitter. For example, tweets in French are more likely to contain URLs than those in English, while tweets in Japanese are less likely (perhaps because the language is more compact relative to Twitter’s 140-character limit?). Tweets in Korean are far more likely to be conversational (i.e., explicitly mentioning or replying to other users) than those in English. These differences remind us to be cautious in generalizing our understanding of online social behavior from the behavior of English-speaking users. Ed also talked about cross-language “brokers” who tweet in multiple languages: he sees these as indicating connection strength between languages, as well as giving us insight into how to improve cross-language communication.

Ed then talked about ways to reduce information overload in social streams. These included Eddi, a tool for summarizing social streams, and zerozero88, a closed experiment to produce a personal newspaper from a tweet stream. In analyzing the results of the zerozero88 experiment, Ed and his colleagues found that the most successful recommendation strategy combined users’ self-voting with social voting by their friends of friends. They also found that users wanted both relevance and serendipity — a challenge since the two criteria often compete with one another.

Ed concluded by offering the following design rule: since interaction costs determine the number of people who participate in social activity, get more people into the system by reducing interaction costs. He asserted that this is a key design principle for Google+.

My skepticism about Google’s social efforts is a matter of public record (cf. Social Utility, +/- 25% and Google±?). But hiring Ed Chi was a real coup for Google, and I’m optimistic about what he’ll bring to the Google+ effort.

ps. My thanks to Jeff Dalton for live-blogging his notes.


CIKM 2011 Industry Event: David Hawking on Search Problems and Solutions in Higher Education

November 22nd, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

One of the recurring themes at the CIKM 2011 Industry Event was that not all search is web search. Stephen Robertson, in advocating why recall matters, noted that web search is exceptional rather than typical as an information retrieval domain. Khalid Al-Kofahi spoke about the challenges of legal search. Focusing on a different vertical, Funnelback Chief Scientist David Hawking spoke about “Search Problems and Solutions in Higher Education”.

David spent most of the presentation focusing on work that Funnelback did for the Australian National University. Funnelback was originally developed by CSIRO and the ANU under the name Panoptic.

The ANU has a substantial web presence, comprising hundreds of sites and over a million pages. Like many large sites, it suffers from propagation delay: the most important pages are fresh, but material on the outposts can be stale. Moreover, there is broad diversity of authorship.

The university also has a strong editorial stance for ranking search results: the search engine needs to identify and favor official content. Given the proliferation of unofficial content, it can be a challenge to identify official sites based on signals like incoming link count, click counts, and the use of official style templates.

David described a particular application that Funnelback developed for ANU: a university course finder. The problem is similar to that of ecommerce search and calls for similar solutions, e.g., faceted search, auto-complete, and suggestions of related queries. And, just as in ecommerce, we can evaluate performance in terms of conversion rate.

David ended his talk by touching on expertise finding (a problem I think about a lot as a LinkedIn data scientist!) and showing demos. And, while I no longer work in enterprise search myself, I still appreciate its unique challenges. I’m glad that David and his colleagues are working to overcome those challenges, especially in a domain as important as education.


CIKM 2011 Industry Event: Ben Greene on Large Memory Computers for In-Memory Enterprise Applications

November 22nd, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

Large-scale computation was, not surprisingly, a major theme at the CIKM 2011 Industry Event. Ben Greene, Director of SAP Research Belfast, delivered a presentation on “Large Memory Computers for In-Memory Enterprise Applications”.

Ben started by defining in-memory computing as “technology that allows the processing of massive quantities of real time data in the main memory of the server to provide immediate results from analyses and transactions”. He then asked whether the cloud enables real-time computing, since there is a clear market hunger for cloud computing to solve the problems of our current enterprise systems.

Not surprisingly, he advocated in-memory computing as the solution for those problems. Like John Ousterhout and the RAMCloud team, he sees the need to scale DRAM independently from physical boxes. He proposed a model of coherent shared memory, using high-speed, low-latency networks and moving the data transport and cache layers into a separate tier below the operating system. The goal: no server-side application caches, DRAM-like latency for physically distributed databases, and in fact no separation between the application server and the database server.

Ben argued that coherent shared memory can dramatically lower the cost of in-memory computing while minimizing the pain for application developers. He also offered some benchmarks for SAP’s BigIron system to demonstrate the performance improvements.

In short, Ben offered a vision of in-memory computing as a reincarnation of the mainframe. It was an interesting and provocative presentation, and my only regret is that we couldn’t stage a debate between him and Jeff Hammerbacher over the future of large-scale enterprise computing.


CIKM 2011 Industry Event: Chavdar Botev on Databus: A System for Timeline-Consistent Low-Latency Change Capture

November 20th, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

I’m of course delighted that one of my colleagues at LinkedIn was able to participate in the CIKM 2011 Industry Event. Principal software engineer Chavdar Botev delivered a presentation on “Databus: A System for Timeline-Consistent Low-Latency Change Capture”.

LinkedIn processes a massive amount of member data and activity. It has over 135M members and is adding more than two new members per second. Based on recent measurements, those members are on track to perform more than four billion searches on the LinkedIn platform in 2011. All of this activity requires a data change capture mechanism that allows external systems, such as LinkedIn’s graph index and its real-time full-text search index Zoie, to act as subscribers in user space and stay up to date with constantly changing data in the primary stores.

LinkedIn has built the Databus system to meet these needs. Databus meets four key requirements: timeline consistency, guaranteed delivery, low latency, and user-space visibility. For example, edits to member profile fields, such as companies and job titles, need to be standardized. Also, in order to let recruiters act quickly on feedback about their job postings, we need to be able to propagate changes to job descriptions in near-real-time.

Databus propagates data changes throughout LinkedIn’s architecture. When there is a change in a primary store (e.g., member profiles or connections), the changes are buffered in the Databus Relay through a push or pull interface. The relay can also capture the transactional semantics of updates. Clients poll for changes in the relay. If a client falls behind the stream of change events in the relay, it is redirected to a Bootstrap database that delivers a compressed delta of the changes since the last event seen by the client.
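To make the consumption pattern concrete, here is a rough sketch of the poll-then-bootstrap loop described above; the class, method, and exception names are hypothetical and do not correspond to the actual Databus API.

```python
import time

class EventsNoLongerAvailable(Exception):
    """Hypothetical error: the relay has already discarded the requested events."""

def consume_changes(relay, bootstrap, apply_change, last_seen_scn):
    """Sketch of a change-capture consumer: poll the relay, fall back to bootstrap."""
    while True:
        try:
            events = relay.poll(since=last_seen_scn)
        except EventsNoLongerAvailable:
            # The client has fallen behind the relay's buffer, so it is redirected
            # to the bootstrap database for a compressed delta of missed changes.
            events = bootstrap.fetch_delta(since=last_seen_scn)
        for event in events:
            apply_change(event)          # e.g., update the graph index or search index
            last_seen_scn = event.scn    # remember progress for the next poll
        if not events:
            time.sleep(0.1)              # back off briefly when fully caught up
```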

In contrast to generic messaging systems (including the Kafka system that LinkedIn has open-sourced through Apache), Databus has more insight into the structure of the messages and can thus do better than just guaranteeing message-level integrity and transactional semantics for communication sessions.

I tend to live a few levels above core infrastructure, but I’m grateful that Chavdar and his colleagues build the core platform that makes all of our large-scale data collection possible. After all, without data we have no data science.

 


CIKM 2011 Industry Event: Khalid Al-Kofahi on Combining Advanced Search Technology and Human Expertise in Legal Research

November 19th, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

The original program for the CIKM 2011 Industry Event featured Peter Jackson, who was chief scientist at Thomson Reuters and author of numerous books and papers on natural language processing. Sadly, Peter died on August 3, 2011. Khalid Al-Kofahi, VP of Research at Thomson Reuters R&D, graciously agreed to speak in his place, delivering a presentation on “Combining Advanced Search Technology and Human Expertise in Legal Research”.

Khalid began by giving an “83-second” overview of the US legal system, laying out the roles of the law, the courts, and the legislature. He did so to provide the context for the domain that Thomson Reuters serves — namely, legal information. Legal information providers curate legal information, enhance it editorially and algorithmically, and work to make legal information findable and explainable in particular task contexts. He then worked through an example of how a case law document (specifically, Burger King v. Rudzewicz) appears in WestlawNext, with annotations that include headnotes, topic codes, citation data, and historical context.

Channelling William Goffman, Khalid asserted that a document’s content (words, phrases, metadata) is not sufficient to determine its aboutness and importance. Rather, we also have to consider what other people say about the document and how they interact with it. This is especially true in the legal domain because of the precedential nature of law. He then framed legal search in terms of information retrieval metrics, stating the requirements as completeness (recall), accuracy (precision), and authority. Not surprisingly, Khalid agreed with Stephen Robertson’s emphasis on the importance of recall.

Speaking more generally, Khalid noted that vertical search is not just about search. Rather, it’s about findability, which includes navigation, recommendations, clustering, faceted classification, collaboration, etc. Most importantly, it’s about satisfying a set of well-understood tasks. And, particularly in the legal domain, customers demand explainable models. Beyond this demand, explainability serves an additional purpose: it enables the human searcher to add value to the process (cf. human-computer information retrieval).

It is sad to lose a great researcher like Peter Jackson from our ranks, but I am grateful that Khalid was able to honor his memory by presenting their joint work at CIKM. If you’d like to learn more, I encourage you to read the publications on the Thomson Reuters Labs page.


CIKM 2011 Industry Event: Jeff Hammerbacher on Experiences Evolving a New Analytical Platform

November 16th, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

The third speaker in the program was Cloudera co-founder and Chief Scientist Jeff Hammerbacher. Jeff, recently hailed by Tim O’Reilly as one of the world’s most powerful data scientists, built the Facebook Data Team, which is best known for open-source contributions that include Hive and Cassandra. Jeff’s talk was entitled “Experiences Evolving a New Analytical Platform: What Works and What’s Missing”. I am thankful to Jeff Dalton for live-blogging a summary.

Jeff’s talk was a whirlwind tour through the philosophy and technology for delivering large-scale analytics (aka “big data”) to the world:

1) Philosophy

The true challenges in the task of data mining are creating a data set with the relevant and accurate information and determining the appropriate analysis techniques. While in the past it made sense to plan data storage and structure around the intended use of the data, the economics of storage and the availability of open-source analytics platforms argue for the reverse: data first, ask questions later; store first, establish structure later. The goal is to enable everyone — developers, analysts, business users — to “party on the data”, providing infrastructure that keeps them from clobbering one another or starving each other of resources.

2) Defining the Platform

No one just uses a relational database anymore. For example, consider Microsoft SQL Server. It is actually part of a unified suite that includes SharePoint for collaboration, PowerPivot for OLAP, StreamInsight for complex event processing (CEP), etc. As with the LAMP stack, there is a coherent framework for analytical data management, which we can call an analytical data platform.

3) Cloudera’s Platform

Cloudera starts with a substrate architecture of Open Compute commodity Linux servers configured using Puppet and Chef and coordinated using ZooKeeper. Naturally, this entire stack is open-source. They use HDFS and Ceph to provide distributed, schema-less storage. They offer append-only table storage and metadata using Avro, RCFile, and HCatalog; and mutable table storage and metadata using HBase. For computation, they offer YARN (inter-job scheduling, like Grid Engine, for data-intensive computing) and Mesos for cluster resource management; MapReduce, Hamster (MPI), Spark, Dryad / DryadLINQ, Pregel (Giraph), and Dremel as processing frameworks; and Crunch (like Google’s FlumeJava), Pig Latin, HiveQL, and Oozie as high-level interfaces. Finally, Cloudera offers tool access through FUSE, JDBC, and ODBC; and data ingest through Sqoop and Flume.

4) What’s Next?

For the substrate, we can expect support for fat servers with fat pipes, operating system support for isolation, and improved local filesystems (e.g., btrfs). Storage improvements will give us a unified file format, compression, better performance and availability, richer metadata, distributed snapshots, replication across data centers, native client access, and separation of namespace and block management. We will see stabilization of our existing compute tools and better variety, as well as improved fault tolerance, isolation and workload management, low-latency job scheduling, and a unified execution backend for workflow. And we will see better integration through REST API access to all platform components, better document ingest, maintenance of source catalog and provenance information, and integration with analytics tools beyond ODBC. We will also see tools that facilitate the transition from unstructured to structured data (e.g., RecordBreaker).

Jeff’s talk was as information-dense as this post suggests, and I hope the mostly-academic CIKM audience was not too shell-shocked. It’s fantastic to see practitioners not only building essential tools for research in information and knowledge management, but reaching out to the research community to build bridges. I saw lots of intense conversation after his talk, and I hope the results realize the two-fold mission of the Industry Event, which is to give researchers an opportunity to learn about the problems most relevant to industry practitioners, and to offer practitioners an opportunity to deepen their understanding of the field in which they are working.


CIKM 2011 Industry Event: John Giannandrea on Freebase – A Rosetta Stone for Entities

November 15th, 2011 by Daniel Tunkelang

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

The second speaker in the program was Metaweb co-founder John Giannandrea. Google acquired Metaweb last year and has kept its promise to maintain Freebase as a free and open database for the world (including for rival search engine Bing — though I’m not sure if Bing is still using Freebase). John’s talk was entitled “Freebase – A Rosetta Stone for Entities”. I am thankful to Jeff Dalton for live-blogging a summary.

John started by introducing Freebase as a representation of structured objects corresponding to real-world entities and connected by a directed graph of relationships. In other words, a semantic web. While it isn’t quite web-scale, Freebase is a large and growing knowledge base consisting of 25 million entities and 500 million connections — and doubling annually. The core concept in Freebase is a type, and an entity can have many types. For example, Arnold Schwarzenegger is a politician and an actor. John emphasized the messiness of the real world. For example, most actors are people, but what about the dog who played Lassie? It’s important to support exceptions.

The main technical challenge for Freebase is reconciliation — that is, determining how similar a set of data is to existing Freebase topics. John pointed out how critical it is for Freebase to avoid duplication of content, since the utility of Freebase depends on unique nodes in its graph corresponding to unique objects in the world. Freebase obtains many of its entities by reconciling large, open knowledge bases — including Wikipedia, WordNet, the Library of Congress Authorities, and metadata from the Stanford Library. Freebase uses a variety of tools to implement reconciliation, including Google Refine (formerly known as Freebase Gridworks) and Matchmaker, a tool for gathering human judgments. While reconciliation is a hard technical problem, it is made tractable by making inferences across the web of relationships that link entities to one another.
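As a toy illustration of the reconciliation decision (my own sketch; Freebase's actual matching draws on much richer graph signals and the tools named above), consider scoring an incoming record against candidate topics:

```python
def reconcile(record, candidates, threshold=0.8):
    """Toy reconciliation: decide whether an incoming record matches an existing topic."""
    def score(candidate):
        # Combine exact name agreement with overlap on any shared property values.
        name_match = 1.0 if record["name"].lower() == candidate["name"].lower() else 0.0
        shared = (set(record) & set(candidate)) - {"name"}
        if not shared:
            return 0.5 * name_match
        prop_match = sum(record[k] == candidate[k] for k in shared) / len(shared)
        return 0.5 * name_match + 0.5 * prop_match

    best = max(candidates, key=score, default=None)
    if best is not None and score(best) >= threshold:
        return best   # confident match: merge into the existing topic
    return None       # otherwise create a new topic or route to human judgment

# Hypothetical usage:
record = {"name": "Arnold Schwarzenegger", "profession": "Actor", "birth_year": 1947}
candidates = [{"name": "Arnold Schwarzenegger", "profession": "Actor", "birth_year": 1947},
              {"name": "Arnold Palmer", "profession": "Golfer"}]
print(reconcile(record, candidates))
```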

John then presented Freebase as a Rosetta Stone for entities on the web. Since an entity is simply a collection of keys (one of which is its name), Freebase’s job is to reverse engineer the key-value store that is distributed among the entity’s web references, e.g., the structured databases backing web sites and encoding keys in URL parameters. He noted that Freebase itself is schema-less (it is a graph database), and that even the concept of a type is itself an entity (“Type type is the only type that is an instance of itself”). Google makes Freebase available through an API and the Metaweb Query Language (MQL).
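To give a flavor of MQL's query-by-example style, here is roughly what a query looks like, written here as a Python structure; the null and empty-list placeholders mark the values to be filled in, and I have omitted the call to the MQL read service.

```python
import json

# A hypothetical MQL read query: find the topic named "Arnold Schwarzenegger"
# and return its Freebase id and all of its types. In MQL's query-by-example
# style, None (JSON null) and [] mark the slots the service should fill in.
query = [{
    "name": "Arnold Schwarzenegger",
    "id": None,
    "type": [],
}]
print(json.dumps(query, indent=2))
```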

Freebase does have its challenges. The requirement to keep out duplicates is an onerous one, as they discovered when importing a portion of the Open Library catalog. Maintaining quality calls for significant manual curation, and quality varies across the knowledge base. John asserted that Freebase provides 99% accuracy at the 95th percentile, though it’s not clear to me what that means (update: see Bill’s comment below).

While I still have concerns about Freebase’s robustness as a structured knowledge base (see my post on “In Search Of Structure”), I’m excited to see Google investing in structured representations of knowledge. To hear more about Google’s efforts in this space, check out the Strata New York panel I moderated on Entities, Relationships, and Semantics — the panelists included Andrew Hogue, who leads Google’s structured data and information extraction group and managed me during my year at Google New York.


CIKM 2011 Industry Event: Stephen Robertson on Why Recall Matters

November 14th, 2011 by Daniel Tunkelang

On October 27th, I had the pleasure to chair the CIKM 2011 Industry Event with former Endeca colleague Tony Russell-Rose. It is my pleasure to report that the program, held in parallel with the main conference sessions, was a resounding success. Since not everyone was able to make it to Glasgow for this event, I’ll use this and subsequent posts to summarize the presentations and offer commentary. I’ll also share any slides that presenters made available to me.

Microsoft researcher Stephen Robertson, who may well be the world’s preeminent living researcher in the area of information retrieval, opened the program with a talk on “Why Recall Matters”. For the record, I didn’t put him up to this, despite my strong opinions on the subject.

Stephen started by reminding us of ancient times (i.e., before the web), when at least some IR researchers thought in terms of set retrieval rather than ranked retrieval. He reminded us of the precision and recall “devices” that he’d described in his Salton Award Lecture — an idea he attributed to the late Cranfield pioneer Cyril Cleverdon. He noted that, while set retrieval uses distinct precision and recall devices, ranking conflates both into the decision of where to truncate a ranked result list. He also pointed out an interesting asymmetry in the conventional notion of the precision-recall tradeoff: while returning more results can only increase recall, there is no certainty that the additional results will decrease precision. Rather, this decrease is a hypothesis that we associate with systems designed to implement the probability ranking principle, returning results in decreasing order of probability of relevance.

He went on to remind us that there is information retrieval beyond web search. He hauled out the usual examples of recall-oriented tasks: e-discovery, prior art search, and evidence-based medicine. But he then made the case that not only is the web not the only problem in information retrieval, but that “it’s the web that’s strange” relative to the rest of the information retrieval landscape in so strongly favoring precision over recall. He enumerated some of the peculiarities of the web, including its size (there’s only one web!), the extreme variation in authorship and quality, the lack of any content standardization (efforts like schema.org notwithstanding), and the advertising-based monetization model that creates an unusual and sometimes adversarial relationship between content owners and search engines. In particular, he cited enterprise search as an information retrieval domain that violates the assumptions of web search and calls for more emphasis on recall.

Stephen suggested that, rather than thinking in terms of the precision-recall curve, we consider the recall-fallout curve. Fallout is a relatively unknown measure that represents the probability that a non-relevant document is retrieved by the query. He noted that fallout has seen little practical use in IR, given that the corpus is populated almost entirely by non-relevant documents. Still, he made the case that the recall-fallout trade-off might be more conceptually appropriate than the precision-recall curve for understanding the value of recall.

In particular, we can generalize the traditional inverse precision-recall relationship to the hypothesis that the recall-fallout curve is convex (details in “On score distributions and relevance”). We can then calculate instantaneous precision at any point in the result list as the gradient of the recall-fallout curve. Going back to the notion of devices, we can now replace precision devices with fallout devices.
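In symbols (my notation, not Stephen's): suppose a ranked list is truncated after retrieving r of the R relevant documents and n of the N non-relevant documents in the collection. Then

```latex
\mathrm{recall} = \frac{r}{R}, \qquad
\mathrm{fallout} = \frac{n}{N}, \qquad
\mathrm{precision} = \frac{r}{r+n}, \qquad
\frac{d\,\mathrm{recall}}{d\,\mathrm{fallout}} = \frac{N}{R}\cdot\frac{dr}{dn}.
```

Since the local precision of the documents being added at the cutoff is a monotone function of dr/dn, the gradient of the recall-fallout curve carries the same information as the instantaneous precision described above.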

Stephen wrapped up his talk by emphasizing the user of information retrieval systems — an aspect of IR that is too often neglected outside HCIR circles. He advocated that systems provide users with evidence of recall, guidance on how far to go down the ranked results, and a prediction of recall at any given stopping point.

It was an extraordinary privilege to have Stephen Robertson present at the CIKM Industry Event, and even better to have him make a full-throated argument in favor of recall. I can only hope that researchers and practitioners take him up on it.

