That would be IBM Research, for millions of dollars (I suspect). I’ve known about the Jeopardy project for a while from colleagues at IBM, and I’m glad I can finally talk about it publicly, now that it’s been reported in the New York Times.
It’s a great challenge, and I hope IBM can rally around it the way it did for chess. But I’d love to see information retrieval researchers consider a related problem: looking at the results for a query and trying to reverse engineer the query from that set (i.e., without cheating and looking at the query). In other words, I want search engines to do what we as humans do naturally. When I’m not sure I understand you, I repeat back what I think you said, in words I’m sure I understand and that I believe you’ll understand too. It’s a great way to clarify misunderstandings and to make sure we end up on the same page.
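As a toy illustration of what such back-fitting might look like, here is a minimal sketch that guesses a query from a set of result documents by favoring terms that occur in many of the results. The function name, the stopword list, and the scoring heuristic are all my own assumptions, not anyone’s actual system; a real approach would use a background corpus for proper IDF weighting.

```python
import re
from collections import Counter

# A tiny hand-picked stopword list (an assumption; real systems use larger ones).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is",
             "for", "on", "that", "it", "at", "its", "with", "against"}

def backfit_query(result_docs, num_terms=3):
    """Guess a query that could have retrieved these documents:
    prefer terms that occur in many of the results, breaking ties
    by total frequency across the set."""
    doc_freq = Counter()   # in how many documents each term appears
    term_freq = Counter()  # total occurrences of each term
    for doc in result_docs:
        tokens = [t for t in re.findall(r"[a-z]+", doc.lower())
                  if t not in STOPWORDS]
        term_freq.update(tokens)
        doc_freq.update(set(tokens))
    ranked = sorted(doc_freq,
                    key=lambda t: (doc_freq[t], term_freq[t]),
                    reverse=True)
    return " ".join(ranked[:num_terms])

docs = [
    "IBM built a system to play Jeopardy against human champions.",
    "The Jeopardy project at IBM follows its success with chess.",
    "Question answering research drives the IBM Jeopardy system.",
]
print(backfit_query(docs, num_terms=2))
```

On this made-up example, the two terms shared by all three documents ("ibm" and "jeopardy") come out on top, which is roughly the behavior one would want: the back-fitted query should capture what the results have in common.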
This clarification dialogue is a key part of the HCIR vision: establishing shared understanding between the user and the system. And it bears a striking resemblance to the game of Jeopardy. When a user receives results in response to a query, those results should feel like an easy Jeopardy “answer”, for which the “question” jumps out as being compatible with the user’s information need. If that is not the case, then something has broken down in the communication, and the system should work with the user to resolve the breakdown.
I realize that HCIR isn’t quite as sexy as question answering (or is this answer questioning?) and certainly doesn’t have its own household-name game show. Then again, I never imagined that prospect theory and the prisoner’s dilemma would get their own game shows. A researcher can hope!
4 replies on “Who Wants To Play “Jeopardy”?”
Thanks for alerting us to this. Very interesting! Ambiguity, double entendre, puns, and analogies are fascinating, but regular question answering systems still suck pretty hardcore. I wonder if research on the former will help the latter?
BTW, one of the things I find most interesting about these systems (unlike chess-playing programs) is that they make mistakes that humans would think are completely ridiculous. I think success in this space shouldn’t just be defined by beating humans, but by not making inane mistakes (e.g., when a question clearly calls for a thing of class X, the system gives an answer in class Y).
People aren’t necessarily very good at guessing the original query from a sampling of the relevant documents.
In the TREC query track, the goal was to get very many different expressions (“queries”) of the same information need (“topic”) for topics that had already been used in an ad hoc task. One of the ways of getting new queries was to show someone unfamiliar with the original topic statement a sampling of five or so relevant documents and have them write a query for which those documents would be relevant. Some of those back-fitted queries are quite dissimilar from the original topic statement.
Mark, that’s a good point. Though I suspect the early chess-playing programs made stupid mistakes too, at least from what I read of early evaluation heuristics.
Ellen, you’re right that I may be raising the bar too high. But I’d like, at a minimum, for the system to be able to express the query it believes the offered results answer, so that a user can compare that back-fitted query to his or her information need. The back-fitted query may be different from the original one, but I’d hope the user would be able to make the connection. Or perhaps the serendipity of the connection would itself be useful. In any case, I’d start by setting what I believe is an achievable goal: requiring that search results offer at least one user-understandable query that can be reverse engineered from them. That would be a big step toward transparency.
[…] some of its own researchers to the program, including David Ferrucci, who has been leading the Jeopardy project. There’s even an “Ignite-style” session where all attendees will have the […]