Exploring Semantic Means

I gave a talk last week at the New York Semantic Web Meetup entitled “Exploring Semantic Means”, and I thought readers here might want to peruse the slides. You can see more pictures of the event here, as well as the slides Ken Ellis presented about the work he’s doing at Daylife. I was also interviewed for a few minutes after the talk; I’ll post a link to the podcast when it’s available.

By Daniel Tunkelang

High-Class Consultant.

8 replies on “Exploring Semantic Means”

Hi Daniel,
So is slide 25 the “ask me in a month…” answer to my original question?

It’s certainly nice, but my original question was less about the specifics of bank bonuses and more about how fuzzier styles of query (using example documents as queries) can sit with a faceted search interface when all results are far from being equally relevant.
I suppose your start query could have been a document and not a keyword, but the key here (unlike in previous faceted interfaces of yours I’ve seen) is NOT to show any totals for the number of matches in suggested groups. The numbers would be meaningless, because they imply that you have either considered all results (including the very low-similarity matches) or introduced an arbitrary relevance cut-off that makes a nonsense of any precision in group totals.
Is the technology you show here exhaustive or sample-based in its grouping of results? The former would preclude a fuzzy “like this doc/para” style of query.


Indeed, and you didn’t even have to wait a month! I assumed that you weren’t concerned about the specifics of bank bonuses, but rather about the ability to satisfy information needs like the one you described. And I hope the interface there does take a step in the right direction.

You’re right that they aren’t showing numbers–that’s a design choice, but the technology is based on actual results, not something fuzzy. I’m not entirely sure how they are ranking them–but there’s no rule that you have to rank facet values based on their frequency in the result set, particularly if the results involve a search that favors recall over precision. That’s a topic for broader discussion, and I promise to post about it.
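The point that facet values need not be ranked by their frequency in the result set can be made concrete. Here is a minimal, hypothetical sketch (the field names and relevance scores are illustrative, not from any real system): instead of counting matching documents, each facet value is scored by the summed relevance of the documents that carry it, so high-relevance matches dominate even when a rival value matches more documents.

```python
# Hypothetical sketch: rank facet values by relevance-weighted mass
# rather than raw document counts. Data and scores are illustrative.
from collections import defaultdict

def rank_facet_values(results, facet):
    """Score each facet value by the summed relevance of the
    documents that carry it, instead of just counting matches."""
    scores = defaultdict(float)
    for doc in results:
        for value in doc.get(facet, []):
            scores[value] += doc["relevance"]
    # Highest relevance-weighted value first.
    return sorted(scores, key=scores.get, reverse=True)

results = [
    {"relevance": 0.9, "topic": ["bank bonuses"]},
    {"relevance": 0.8, "topic": ["bank bonuses", "regulation"]},
    {"relevance": 0.1, "topic": ["regulation"]},
    {"relevance": 0.1, "topic": ["regulation"]},
    {"relevance": 0.1, "topic": ["regulation"]},
]

# "regulation" appears in more documents (4 vs. 2), but "bank bonuses"
# carries more relevance mass (1.7 vs. 1.1), so it ranks first.
print(rank_facet_values(results, "topic"))  # → ['bank bonuses', 'regulation']
```

A ranking like this sidesteps the objection about misleading totals: no counts are shown, so there is no implied claim that every low-similarity match was tallied.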


>>I assumed you[ were].. concerned about the ability to satisfy information needs like the one you described

Yes, but specifically using the retrieval technique I suggested: “given an example document”. Do I detect that you do not subscribe to that approach to specifying queries?


I’m not morally opposed to it, but it’s not my preferred entry point into the information-seeking process–at least not with a browser-based search engine. In general, it strikes me as making the problem harder than it has to be–turning the query into a similarity query on a document. I see that approach being more useful later in the process, e.g., “I like this document; show me more like it.” Better still when the basis for similarity is transparent.
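That later-in-the-process “more like this” step can be sketched in a few lines. This is a toy illustration under stated assumptions–whitespace tokenization and plain term-frequency cosine similarity over a tiny in-memory corpus–not how any production engine actually works:

```python
# Hypothetical sketch of "more like this": given a document the user
# liked, rank the rest of a small corpus by cosine similarity of
# term-frequency vectors. Toy tokenization; real engines do far more.
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector (naive whitespace tokens)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def more_like_this(liked, corpus):
    liked_vec = tf_vector(liked)
    return sorted(corpus,
                  key=lambda d: cosine(liked_vec, tf_vector(d)),
                  reverse=True)

corpus = [
    "bank bonuses draw scrutiny from regulators",
    "new recipe for banana bread",
    "regulators weigh limits on bank bonuses",
]
liked = "bank bonuses under fire"
print(more_like_this(liked, corpus)[0])
```

The transparency point maps directly onto this sketch: the shared terms in the dot product are exactly the evidence one could surface to explain *why* a result was deemed similar.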


>>it strikes me as making the problem harder than it has to be

Harder for whom? The user trying to specify the information need, the system that processes it, or the user interpreting results?


I went on an Autonomy course many years ago. The mantra was that keywords are “legacy” and that we should use example documents/passages to express intent.
As you suggest, I felt this was like trying to have a conversation with someone where you couldn’t speak for yourself but had to select a book from a shelf that best represented your intent.
I think the technique can have its place but I see issues trying to combine that with faceted result UIs.


[…] Now that the cat is out of the bag, I’m proud to tell readers here about an effort I’ve been involved with over the past few months. As reported in TechCrunch and Search Engine Land, the Financial Times just launched Newssift, a semantic search engine, powered by Endeca, that sifts through business news. Regular readers may recognize the application from an example I used in my presentation on exploring semantic means. […]


Comments are closed.