Sales Pitch for the Semantic Web

Thanks to Marco Neumann, who runs the New York Semantic Web Meetup, for alerting me to this presentation by Nova Spivack, whom Marco aptly describes as Chief Director of Sales of the Semantic Web. Enjoy!

http://vimeo.com/moogaloop.swf?clip_id=1062481&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1
Nova Spivack at The Next Web Conference 2008

By Daniel Tunkelang

High-Class Consultant.

4 replies on “Sales Pitch for the Semantic Web”

Hey Daniel-

What I don’t like about Nova’s point of view on the Semantic Web is that it focuses on the WWW. I actually believe that the basic semantic web technologies (OWL, RDF, SPARQL) have a great deal to offer behind the firewall, but that it’s really misguided to try to apply them to the WWW at large. Even his argument here, “WWW1 was infrastructure, WWW2 was UI, WWW3 will be infrastructure again!” is somewhat idealistic; WWW1 was about the ease of publishing and sharing information, even if everything was left-justified and you had to put a spinny GIF on your site to be kewl. WWW3 infrastructure, on the other hand, is far too complicated for the average developer, which is what will kill (has killed?) its adoption. Honestly, it’s hard enough to get users to just tag full articles, let alone having them mark up XHTML via RDFa! Even from a user’s point of view, the barrier to entry for Twine is significantly higher than that of, say, Wikipedia, or Facebook, or Orkut, or Blogger, etc. etc. You really have to put some time into it before it becomes meaningfully useful.

So, thinking about the enterprise for a minute, there are already centralized schema management tools and MDM, and they focus too much on centralized control, which doesn’t scale to large enterprises (huge projects of these types tend to run over schedule and do not have a measurable ROI). There’s an opportunity there.

The fundamental problem with the semantic web zealots who focus on the WWW is that no one has an incentive to turn their information into RDF. Web sites tend to make money from advertising or from selling something. There is an actual cost to providing information on the web. If I provide a SPARQL endpoint and surface my data as RDF so that others, in a Utopian sharing sort of way, can access it and produce the data mashups that so many in that community have been talking about, what’s in it for me? Honestly? Some data providers, like MLS, can get away with it since they have a monopoly on their specific domain of information and can charge a premium, but most do not have that luxury.

By contrast, in the enterprise there is a strong incentive to make all information available to everyone around, which is why Endeca continues to do so well in that space. There are a couple of semantic web startups playing around there as well, such as those guys at Cambridge Semantics with their Anzo Excel plugin (clever stuff, that), or Thetus, or SchemaLogic.

So my thought is that if the semantic web will happen at all, it will happen in the enterprise first.

I’ll get off my soapbox now 😉

-Rob

Rob, you are adorable when you’re ranting. Seriously, I find much to agree with there. As does Bernard Lunn over at ReadWriteWeb: http://www.readwriteweb.com/archives/semantic_web_11_things_to_know.php

A choice quote: “Don’t look for a killer app. That implies a client/consumer win. This is much more likely to be a server/platform/enterprise win. Even if the initial experimentation is done in the consumer domain; Freebase for example looks like a mass Beta test for some enterprise technology that Metaweb wants to release later.”

Daniel,

Let’s suppose you have an RDF statement that says something like this:

Daniel works-for Endeca

When all is said and done, all you have are three strings in a sequence. You still have the grounding problem.
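
To make that concrete, here is a minimal Python sketch. It assumes nothing beyond plain tuples of strings; it is not a real RDF library, and the `example.org` URIs and the `matches` helper are hypothetical names invented for illustration. It only shows that a program handling the triple compares symbols by equality and has no access to what they refer to.

```python
# A triple modeled as a plain tuple of strings (illustrative only, not an RDF library).
triple = ("Daniel", "works-for", "Endeca")

# Hypothetical URIs: the example.org namespace is an assumption for illustration.
EX = "http://example.org/"
uri_triple = (EX + "Daniel", EX + "works-for", EX + "Endeca")


def matches(pattern, t):
    """Match a triple against a pattern in which None acts as a wildcard."""
    return all(p is None or p == v for p, v in zip(pattern, t))


# The program can find the statement by comparing strings...
found = matches((None, "works-for", "Endeca"), triple)

# ...but a differently spelled predicate is simply a different string to it,
# even if a human would read it as the same relationship.
missed = matches((None, "employed-by", "Endeca"), triple)

print(found, missed)
```

Swapping the bare words for URIs, as in `uri_triple`, makes the symbols globally unique, but they are still strings; someone has to agree on what each one denotes.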

In fact, the one place where you could possibly ground the symbols is within an enterprise.

A digression: Neal Stephenson, in his latest masterpiece, Anathem, calls computers “syntactic devices,” a term that I agree with wholeheartedly 🙂

I agree that ungrounded triples are a mess of data crying out for normalization and authority. In fact, one of the successes of Wikipedia is that it enforces, with at least a moderate degree of success, a vocabulary normalization. It’s MDM lite to be sure, but it’s indicative of the work needed to make this semantic stuff work. And then there is the small matter of market incentives to do that work.

Also, by synchronous coincidence, I noticed that someone left a copy of Anathem in the office cafeteria this morning. I’ll have to check it out.
