Categories
General

Challenge: Blog + Twitter vs. Aardvark

I asked Aardvark the following question this afternoon:

Trying to track down an animated short where a bunch of critters invent a machine to discover where they are, only to learn that they are a dream inside someone’s head. They ultimately turn into pink flamingos as the dream evolves. I remember them all chanting “Flick the switch!” when the invention is unveiled. No luck tracking it down with my web searching skills. 😦
The correct answer came within 6 hours. I’m curious if anyone who reads this message will find it independently–without using Aardvark themselves. If not, I’ll be forced to give Aardvark a very glowing review for answering a question that has been plaguing me for years. One way or the other, I’ll post the answer tomorrow night.
Added: Check out the rematch!
Categories
General

Clay Shirky: Save Society, Not Newspapers

There is so much writing about the impending demise of the newspaper industry that it’s becoming easy to tune it out. But it’s refreshing to see a cutting analysis like the one Clay Shirky makes in “Newspapers and Thinking the Unthinkable”.

He starts with an anecdote about how, in the early 90s, the Knight-Ridder newspaper chain was fighting the unauthorized online distribution of a Dave Barry column. He quotes Gordy Thompson, who then managed internet services at the New York Times: “When a 14 year old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you got a problem.” Shirky’s own most cutting remark:

The newspaper people often note that newspapers benefit society as a whole. This is true, but irrelevant to the problem at hand; “You’re gonna miss us when we’re gone!” has never been much of a business model.

His main point is this: “Society doesn’t need newspapers. What we need is journalism.” And he simply doesn’t see newspapers surviving as the way to deliver journalism.

I’m not so sure, but I think Shirky has his priorities right. We should be worrying about the end, not the means. I don’t subscribe to his and Jeff Jarvis’s faith-based optimism about journalism’s future; I’m not convinced that the market demands journalism enough to sustain it. Witness the challenge of sustaining other public goods, such as education and energy conservation. My own view is that journalism’s best hope is monetizing participation. I’m actually in the middle of writing an article about it; I’ll let you know if and when it gets published.

Categories
General

Vivisimo, Please Keep It Real

Let me preface this post with a clear disclaimer: I am the Chief Scientist of Endeca, a leading enterprise search vendor, but the views I express on this blog, including those about Endeca’s customers, partners, and competitors, are my own.

And one of my strong personal opinions is that marketing campaigns should be honest. One of the blogs I read is Search Done Right, a corporate blog maintained by Vivisimo. Regardless of my opinions about Vivisimo’s technology, I am a fan of their marketing department. They’ve achieved visibility disproportionate to their market share–in no small part by promoting interface innovation. I have also met their CEO and CTO, and I think they’re both great guys.

But today, I saw a post entitled “New Enterprise Search Pundit on the Scene?”, which I copy in full below:

When I logged into my Vivisimo email this morning, I had a message from a guy named Stan. He must have gotten my contact info from this blog or the LinkedIn Enterprise Search Professionals group as someone interested in search and sent me the manifesto below since I don’t recall meeting him at a recent Gartner PCC show or an Enterprise Search Strategy (ESS) event. At any rate, I bookmarked his site and will let you know when it goes live – should be interesting. Maybe he has the makings of a search industry pundit in him…

I don’t know about you guys, but I’m getting really sick of not being able to find things at my company. I mean, I have a hard enough time finding my coffee cup each morning because someone in Marketing (I know it’s you, Jan) keeps moving it. But when it comes to data, it’s just impossible to find anything. Think about it for a second. On when I’m on the Internet, I do a Google search for “product requirements document” and get more than 22 million results in less than half a second. Try doing that on your intranet. I mean seriously. Can you even find a search box on your intranet? My last company didn’t even have one.

And when there is a box, what does it really search? A couple of intranet pages? What happens if you have multiple intranets? I bet your search only looks at one. Then you have to use a separate desktop search. And another search for email. And another for your external website. All of this searching is stopping us from getting any work done.

Then I heard about this enterprise search thing. Frankly, it sounds too good to be true. Searching across multiple repositories from a single search box. Presenting results into topical clusters. Tagging and rating documents to impact future search relevancy. Sharing results with other users. All of this while respecting my individual security rights. Seems like pie in the sky to me. Is anybody actually doing this, or is it just some marketing hype?

So I’m starting my own website, www.meetstan.com, to figure things out for myself. The site will be up March 16. Come by and tell me what you think.

Curious, I went to Whois.com and looked up meetstan.com. As you can verify for yourself, the registrant is none other than Vivisimo.

Am I naive to be shocked at this sort of marketing gimmick? Perhaps. But I’m sensitive because there aren’t that many people who understand enterprise search, and there’s a lot of concern about analysts offering less than independent opinions. I assume that Vivisimo isn’t planning to use the site to promote a shill analyst, but rather is using this blog post to create pre-launch buzz around a marketing portal.

Please, Vivisimo, don’t play these sorts of games. Given how uninformed so many people are about enterprise search, many people are likely to take a hoax like this seriously. That’s bad for Vivisimo’s reputation, and for the field as a whole. I’m sure this was an innocent mistake. I hope it will be quickly resolved, and that you will go back to promoting your technology and vision without resorting to such gimmicks.

I also encourage anyone from Vivisimo to comment here and offer clarification.

Categories
General

I Have 10 Aardvark Invites

The kind folks at Aardvark appreciated my write-up and sent me an invite, and, as per the usual viral rules, that means I can invite 10 more people. Let me know if you’re interested via the comments. I might be offline much of the day, but I’ll process requests in the order received.

Categories
General

Is the Aardvark a Social Animal?

A colleague alerted me to Aardvark, a social search service scheduled to launch during SXSW that lets users ask questions via instant messenger or email and receive live answers from their social network. Check out recent coverage by John Battelle and ReadWriteWeb.

The initial press is quite positive. In particular, ReadWriteWeb compares it favorably to asking questions on Twitter:

In our internal tests, we realized that a lot of the answers often rivaled those we received when asking our Twitter network. Thanks to the fact that Aardvark automatically routed our questions to people with the right expertise, all the answers we received so far were top-notch. In case you didn’t like the answer (or if it was obscene), you can flag it and rate it on the service’s website.

I haven’t experienced the service, so I’m in no position to evaluate it. I can’t say I’ve been overwhelmed with social question answering on Google (R.I.P.), Yahoo, or LinkedIn. Asking questions on Twitter works well for me, but that’s probably because I have a substantial number of real, knowledgeable followers (the TunkRank is strong with this one!).

But what I’m not understanding is Aardvark’s incentive system. I’ve looked at their blog and white paper, but I don’t see any mention of tangible or intangible incentives. Perhaps the incentives are reputation and the interaction itself.

In any case, I’m cautiously optimistic. If anyone has managed to get an invite and can share, I’d greatly appreciate a chance to try it out.

Categories
General

Media Cloud: Watch, Analyze, Learn

A couple of months ago, Tom Tague, who leads the Calais initiative at Thomson Reuters, presented at the New York Semantic Web Meetup. One of the projects he alluded to was announced today and reported in ReadWriteWeb: “Media Cloud Leverages Calais to Track News Trends”:

Media Cloud, a new project from the Berkman Center at Harvard University, has an ambitious goal: It will do the heavy lifting of analyzing stories from thousands of traditional news sources, analyzing the semantics of the content through Calais (covered here and here), and then providing tools to quickly get trending results.

The article also points to an interview of project developer Ethan Zuckerman by the Nieman Journalism Lab.

What particularly excites me about this project is the possibility of comparing how different news organizations–or, better yet, different clusters of similarly biased news organizations–select and cover news. Ever since hearing Miles Efron present “The Liberal Media and Right-Wing Conspiracies: Using Cocitation Information to Estimate Political Orientation in Web Documents” at CIKM 2004, I’ve been waiting for someone to take the next step and build analysis tools to compare the media “conspiracies”. For example, what stories are covered in the New York Times, but not in the National Review–and vice versa? Which details appear only in papers associated with one end of the political spectrum?
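
To make the "covered here but not there" question concrete, here is a minimal sketch of the kind of comparison I have in mind. Everything in it is an assumption for illustration: the outlet names, the topic labels, and the `coverage_gaps` helper are made up, and none of it reflects Media Cloud's actual data or tools. It simply treats each outlet's coverage as a set of topics and takes set differences.

```python
# Hypothetical data: outlet name -> set of story topics it covered.
# The outlets and topics are invented for illustration only.
coverage = {
    "outlet_a": {"budget vote", "flood relief", "trade talks"},
    "outlet_b": {"budget vote", "border policy", "trade talks"},
}


def coverage_gaps(coverage, left, right):
    """Return (topics only in left, topics only in right, topics in both)."""
    a, b = coverage[left], coverage[right]
    return a - b, b - a, a & b


only_a, only_b, shared = coverage_gaps(coverage, "outlet_a", "outlet_b")
print("Only outlet_a:", sorted(only_a))
print("Only outlet_b:", sorted(only_b))
print("Both:", sorted(shared))
```

The hard part, of course, is not the set arithmetic but deciding when two articles are about the same story, which is where semantic tagging would have to do the heavy lifting.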

I don’t know that most people care about these questions. In fact, I suspect they don’t; my experience is that few people are interested in hearing viewpoints that challenge their own. But I fear that we are being personalized to death–that our control over what we read leads to the unfortunate behavior that we only let content through the filter if it reinforces our prejudices.

I know that Media Cloud won’t solve this problem on its own. But at least it’s a great tool for those who do want to broaden their perspectives, and I have hope that intellectually honest people will have the courage to learn from it.

Categories
General

Making Ads More Interesting…for Users or for Google?

Google announced today that:

We think we can make online advertising even more relevant and useful by using additional information about the websites people visit. Today we are launching “interest-based” advertising as a beta test on our partner sites and on YouTube. These ads will associate categories of interest — say sports, gardening, cars, pets — with your browser, based on the types of sites you visit and the pages you view. We may then use those interest categories to show you more relevant text and display ads.

They do realize that this announcement raises lots of hackles in a world that is increasingly distrustful of Google’s accumulation of data and its control over so much of our online experience. They offer the following as grounds for trusting them:

  • Transparency – We already clearly label most of the ads provided by Google on the AdSense partner network and on YouTube. You can click on the labels to get more information about how we serve ads, and the information we use to show you ads. This year we will expand the range of ad formats and publishers that display labels that provide a way to learn more and make choices about Google’s ad serving.
  • Choice – We have built a tool called Ads Preferences Manager, which lets you view, delete, or add interest categories associated with your browser so that you can receive ads that are more interesting to you.
  • Control – You can always opt out of the advertising cookie for the AdSense partner network here. To make sure that your opt-out decision is respected (and isn’t deleted if you clear the cookies from your browser), we have designed a plug-in for your browser that maintains your opt-out choice.

Despite the predictable reactions from privacy groups, I don’t know that I find behaviorally targeted ads any worse than ads in general. Indeed, Google is probably right that users will find the ads more relevant; after all, it has every incentive to increase click-through rates. Privacy groups are right to call out Google’s hypocrisy in changing its tune on behavioral advertising, but so what? If Google’s going to live and die by the ad-supported model and if the overwhelming majority of the online population is on board with it, then it’s to be expected that Google will optimize for ad revenue.

Of course, my idea of choice and control is to use an ad blocker (specifically, the CustomizeGoogle Firefox extension), and I think Google takes a very narrow view of transparency. Still, I’m amused that Google is drawing so much heat for what seems to me a minor, incremental change.

Well, a minor change for users. Perhaps it’s not a coincidence that Google’s stock is up 3% today. $3B in market cap is a significant increment, even for Google.

Categories
General

Exploring Semantic Means

I gave a talk last week at the New York Semantic Web Meetup entitled “exploring semantic means”, and I thought readers here might want to peruse the slides. You can see more pictures of the event here, as well as the slides Ken Ellis presented about the work he’s doing at Daylife. I was also interviewed for a few minutes after the talk; I’ll post a link to the podcast when it’s available.

Categories
General

The Guardian Gets Openness

Now that the Guardian Open Platform is live, I wanted to share some first impressions. Full disclosure: the Guardian is an Endeca customer. Still, my impressions are my own.

What the Guardian has released is a Content API and a Data Store: sets of publicly available data offered for free. Here is the gem:

The APIs will feature ‘full fat’ feeds with full articles and other content including video, audio and photo galleries, some one million pieces of content published on guardian.co.uk from 1999-2008.

Of course, the Guardian’s decision to open up its APIs invites inevitable comparisons to the New York Times and its own recent opening up. But I think the Guardian is taking its effort a significant step further. The New York Times has only released its full archival content under non-commercial terms. Its article search and newswire APIs are nice, but they aren’t full fat feeds. Perhaps the closest comparison would be to Reuters Spotlight–but that is a non-commercial effort.

What the Guardian has done right is to offer openness in the context of commercial use. Here is the relevant section of their terms and conditions:

8. Advertising and Commercial Use

(a) If requested, you will as a condition of your licence to publish OPG Content, display on Your Website any advertisement that we supply to you together with the relevant OPG Content. You shall comply with our instructions regarding the position, form and size of such advertisements on Your Website. Such instructions may be notified to you directly or posted on the OPG Site.

(b) You may attach third party advertising to Your Website, which includes OPG Content, without accounting to us for any share in the revenue generated by such advertising, provided that:
• You do not associate OPG Content, directly or indirectly, with advertisements or advertisers that could be regarded by us as illegal or discriminatory.
• You comply with any additional restrictions that we may introduce from time to time as part of the OPG Terms.

(c) You may not syndicate or otherwise charge a fee for access to OPG Content.

That strikes me as eminently reasonable.

I’ve been looking forward to this launch for a while–unfortunately, my inside knowledge meant that I couldn’t be entirely open myself! But today I’m proud to see the Guardian continuing its tradition of leading the way in online media.

Categories
General

A New Kind of Marketing (NKM)

The blogosphere is abuzz with hype about Wolfram Alpha. Stephen Wolfram writes:

It’s going to be a website: www.wolframalpha.com. With one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms.

We’re all working very hard right now to get Wolfram|Alpha ready to go live.

I think it’s going to be pretty exciting. A new paradigm for using computers and the web.

That almost gets us to what people thought computers would be able to do 50 years ago!

And Nova Spivack shares his own excitement:

Stephen was kind enough to spend two hours with me last week to demo his new online service — Wolfram Alpha (scheduled to open in May)….

In a nutshell, Wolfram and his team have built what he calls a “computational knowledge engine” for the Web. OK, so what does that really mean? Basically it means that you can ask it factual questions and it computes answers for you….

Think about that for a minute. It computes the answers. Wolfram Alpha doesn’t simply contain huge amounts of manually entered pairs of questions and answers, nor does it search for answers in a database of facts. Instead, it understands and then computes answers to certain kinds of questions.

I haven’t seen this much excitement about a search-related product since the pre-launches of Cuil and Powerset, and we know how those played out. In fairness to Wolfram, however, he did bring us Mathematica, which is more than a legitimate claim to fame.

However, I’m not so persuaded by his more recent accomplishment of publishing A New Kind of Science, a best-seller and 1200-page coffee table book. Here’s what Wikipedia tells us about its critical reception:

NKS received extensive media publicity for a scientific book, generating scores of articles in such publications as The New York Times, Newsweek, Wired, and The Economist. It was a best-seller and won numerous awards. NKS was reviewed in a large range of scientific journals. Several themes emerged. On the positive, many reviewers enjoyed the quality of the book’s production, and the clear way Wolfram presented many ideas. Many reviewers, even those who engaged in other criticisms, found aspects of the book to be interesting and thought-provoking. On the negative, many reviewers criticized Wolfram for his lack of modesty, poor editing, lack of mathematical rigor, and the lack of immediate utility of his ideas. Concerning the ultimate importance of the book, a common attitude was that of either skepticism or “wait and see”.

If Wolfram has built a breakthrough tool to support information seeking, then he should let it prove itself by unveiling it and letting other people test it. We aren’t talking about some kind of esoteric science where only a few intellectuals can hope to understand it. Rather, his product purports to be some kind of search / answer / knowledge engine. It’s 2009, and we’re all used to the general vision. What we’re holding our breath for is execution.

I’m open to the possibility that Wolfram has built something that will change the world. But I’m extremely skeptical, and this hype campaign hardly instills confidence. Apparently he told Nova that the product will be launched in May. Two months: not so long to wait to see how well reality matches the hype.