CIKM 2011 Industry Event: Khalid Al-Kofahi on Combining Advanced Search Technology and Human Expertise in Legal Research

This post is part of a series summarizing the presentations at the CIKM 2011 Industry Event, which I chaired with former Endeca colleague Tony Russell-Rose.

The original program for the CIKM 2011 Industry Event featured Peter Jackson, who was chief scientist at Thomson Reuters and author of numerous books and papers on natural language processing. Sadly, Peter died on August 3, 2011. Thomson Reuters R&D VP of Research Khalid Al-Kofahi graciously agreed to speak in his place, delivering a presentation on “Combining Advanced Search Technology and Human Expertise in Legal Research”.

Khalid began by giving an “83-second” overview of the US legal system, laying out the roles of the law, the courts, and the legislature. He did so to provide the context for the domain that Thomson Reuters serves, namely legal information. Legal information providers curate legal information, enhance it editorially and algorithmically, and work to make legal information findable and explainable in particular task contexts. He then worked through an example of how a case law document (specifically, Burger King v. Rudzewicz) appears in WestlawNext, with annotations that include headnotes, topic codes, citation data, and historical context.

Channelling William Goffman, Khalid asserted that a document’s content (words, phrases, metadata) is not sufficient to determine its aboutness and importance. Rather, we also have to consider what other people say about the document and how they interact with it. This is especially true in the legal domain because of the precedential nature of law. He then framed legal search in terms of information retrieval metrics, stating the requirements as completeness (recall), accuracy (precision), and authority. Not surprisingly, Khalid agreed with Stephen Robertson’s emphasis on the importance of recall.
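
For readers less familiar with the two metrics, here is a minimal Python sketch of how precision and recall are computed for a single query. The function name and the document IDs are hypothetical and chosen only for illustration; nothing here comes from Khalid's talk or from any Thomson Reuters system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs the search system returned.
    relevant:  IDs a human judge marked as relevant.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
retrieved = ["case_a", "case_b", "case_c", "case_d"]
relevant = ["case_a", "case_c", "case_e"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example half of the returned cases are relevant, but one relevant case (case_e) is never returned. In a precedent-driven domain, a missed case like that is exactly the kind of gap that makes recall matter so much.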

Speaking more generally, Khalid noted that vertical search is not just about search. Rather, it’s about findability, which includes navigation, recommendations, clustering, faceted classification, collaboration, etc. Most importantly, it’s about satisfying a set of well-understood tasks. And, particularly in the legal domain, customers demand explainable models. Beyond this demand, explainability serves an additional purpose: it enables the human searcher to add value to the process (cf. human-computer information retrieval).

It is sad to lose a great researcher like Peter Jackson from our ranks, but I am grateful that Khalid was able to honor his memory by presenting their joint work at CIKM. If you’d like to learn more, I encourage you to read the publications on the Thomson Reuters Labs page.

By Daniel Tunkelang

High-Class Consultant.