Categories
General

The Guardian Gets Openness

Now that the Guardian Open Platform is live, I wanted to share some first impressions. Full disclosure: the Guardian is an Endeca customer. Still, my impressions are my own.

What the Guardian has released are a Content API and a Data Store: sets of public data offered for free. Here is the gem:

The APIs will feature ‘full fat’ feeds with full articles and other content including video, audio and photo galleries, some one million pieces of content published on guardian.co.uk from 1999-2008.

Of course, the Guardian’s decision to open up its APIs invites inevitable comparisons to the New York Times and its own recent opening up. But I think the Guardian is taking its effort a significant step further. The New York Times has only released its full archival content under non-commercial terms. Its article search and newswire APIs are nice, but they aren’t full fat feeds. Perhaps the closest comparison would be to Reuters Spotlight, but that is a non-commercial effort.

What the Guardian has done right is to offer openness in the context of commercial use. Here is the relevant section of their terms and conditions:

8. Advertising and Commercial Use

(a) If requested, you will as a condition of your licence to publish OPG Content, display on Your Website any advertisement that we supply to you together with the relevant OPG Content. You shall comply with our instructions regarding the position, form and size of such advertisements on Your Website. Such instructions may be notified to you directly or posted on the OPG Site.

(b) You may attach third party advertising to Your Website, which includes OPG Content, without accounting to us for any share in the revenue generated by such advertising, provided that:
• You do not associate OPG Content, directly or indirectly, with advertisements or advertisers that could be regarded by us as illegal or discriminatory.
• You comply with any additional restrictions that we may introduce from time to time as part of the OPG Terms.

(c) You may not syndicate or otherwise charge a fee for access to OPG Content.

That strikes me as eminently reasonable.

I’ve been looking forward to this launch for a while–unfortunately, my inside knowledge meant that I couldn’t be entirely open myself! But today I’m proud to see the Guardian continuing its tradition of leading the way in online media.

Categories
Uncategorized

Find Out More about the Guardian Open Platform!

If you’re reading this, then it’s at least 9am GMT, and you should be able to learn more about the Guardian Open Platform at http://www.guardian.co.uk/open-platform. That’s a bit early on this side of the pond, but I promise to share more details and impressions once I’m awake and have a chance to gather them!

Categories
Uncategorized

Guardian Launching Open Platform

The Guardian, an internationally acclaimed newspaper (and a long-time Endeca client!) that has been a major force in the United Kingdom for 180 years, is launching an open platform tomorrow. The Guardian has led the media in openness, making the unprecedented decision last fall to offer the full text of its articles in its RSS feeds.

I’ll report more about the new platform when there are more details that I can publicly share.

Categories
General

A New Kind of Marketing (NKM)

The blogosphere is abuzz with hype about Wolfram Alpha. Stephen Wolfram writes:

It’s going to be a website: www.wolframalpha.com. With one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms.

We’re all working very hard right now to get Wolfram|Alpha ready to go live.

I think it’s going to be pretty exciting. A new paradigm for using computers and the web.

That almost gets us to what people thought computers would be able to do 50 years ago!

And Nova Spivack shares his own excitement:

Stephen was kind enough to spend two hours with me last week to demo his new online service — Wolfram Alpha (scheduled to open in May)….

In a nutshell, Wolfram and his team have built what he calls a “computational knowledge engine” for the Web. OK, so what does that really mean? Basically it means that you can ask it factual questions and it computes answers for you….

Think about that for a minute. It computes the answers. Wolfram Alpha doesn’t simply contain huge amounts of manually entered pairs of questions and answers, nor does it search for answers in a database of facts. Instead, it understands and then computes answers to certain kinds of questions.

I haven’t seen this much excitement about a search-related product since the pre-launches of Cuil and Powerset, and we know how those played out. In fairness to Wolfram, however, he did bring us Mathematica, which is more than a legitimate claim to fame.

However, I’m not so persuaded by his more recent accomplishment of publishing A New Kind of Science, a best-seller and 1200-page coffee table book. Here’s what Wikipedia tells us about its critical reception:

NKS received extensive media publicity for a scientific book, generating scores of articles in such publications as The New York Times, Newsweek, Wired, and The Economist. It was a best-seller and won numerous awards. NKS was reviewed in a large range of scientific journals. Several themes emerged. On the positive, many reviewers enjoyed the quality of the book’s production, and the clear way Wolfram presented many ideas. Many reviewers, even those who engaged in other criticisms, found aspects of the book to be interesting and thought-provoking. On the negative, many reviewers criticized Wolfram for his lack of modesty, poor editing, lack of mathematical rigor, and the lack of immediate utility of his ideas. Concerning the ultimate importance of the book, a common attitude was that of either skepticism or “wait and see”.

If Wolfram has built a breakthrough tool to support information seeking, then he should let it prove itself by unveiling it and letting other people test it. We aren’t talking about some kind of esoteric science where only a few intellectuals can hope to understand it. Rather, his product purports to be some kind of search / answer / knowledge engine. It’s 2009, and we’re all used to the general vision. What we’re holding our breath for is execution.

I’m open to the possibility that Wolfram has built something that will change the world. But I’m extremely skeptical, and this hype campaign hardly instills confidence. Apparently he told Nova that the product will be launched in May. Two months: not so long to wait to see how well reality matches the hype.

Categories
Uncategorized

Apologies to Google Reader Users

For some reason that I have still not diagnosed, readers who view the RSS feed for this blog using Google Reader are seeing a handful of bogus entries in the feed, something like this. I don’t know why those entries are showing up, let alone why they seem to congregate at the front of the feed. If anyone has a suggestion on how to diagnose or resolve the issue, I’d greatly appreciate it. In the meantime, I apologize for the inconvenience and annoyance.

Categories
General

Is Global the New Local?

I was just reading a nice article by Mike Elgan in Computerworld entitled “Why global is the new ‘local’”.

He starts off by talking about the transformations happening in radio:

“Local” radio stations are going national, and even international. That sounds like an opportunity for the stations — they can now reach a larger potential audience for advertisers. But in reality, it’s a problem. The whole radio business model is built around pandering to local community groups, small businesses, area schools and, above all, local listeners. So how do you pander to the old audience without alienating the new one?

He then goes on to explain how the same problem applies to newspapers:

Now you can get local news anywhere. Look, for example, at Lodi, Calif., a medium-size city of about 63,000 people. (You may recall the town from a 1969 Creedence Clearwater Revival song.)

Search Google News for “Lodi” and there it is: more than 4,000 news stories, organized roughly by importance. Getting Lodi news on Google is faster, cheaper, more comprehensive and, well, better than the local Lodi paper. You can get Lodi news even if you’re in Timbuktu. And, of course, you can get county, state, national and international news everywhere. Even if you’re stuck in Lodi.

And here is the money shot:

What’s really going on is that the Internet is punishing inefficiency.

His analysis strikes me as brutally accurate. As much as I criticize the ad-supported model in general and Google’s role in devaluing online content in particular, I think that Elgan does a great job of explaining what may be one of the news industry’s biggest contributions to its own malaise. Indeed, for all of the hype about hyperlocal news, I suspect that the winners in this market will be news providers or aggregators that don’t focus on local news but rather let users find whatever they want.

In an unsuccessful City Council run, Tip O’Neill received the famous advice from his father that “All politics is local.” That was surely true in the 1930s, but the world has changed a bit in seven decades.

Fittingly, Elgan concludes his article:

Nothing is local anymore. And it’s a huge opportunity. The new mantra should be: Cover local events exclusively, but for a global audience.

Categories
Uncategorized

Google’s Marissa Mayer on Privacy vs. Transparency

TechCrunch posted a transcript of Charlie Rose interviewing Google Vice President of Search Product and User Experience Marissa Mayer.

Here’s an excerpt I found particularly interesting:

Charlie Rose:
This is a broader philosophical question I want to talk about later. But I mean is there some point in which we know too much about people?

Marissa Mayer:
Well I think that in all cases it’s a tradeoff, right, where you will give you some of your privacy in order to gain some functionality, and so we really need to make those tradeoffs really clear to people, what information are we using and what’s the benefit to them? And then ultimately leave it to user choice so the user can decide. And you have to be very transparent about what information you have about that user and how it’s being used.

Charlie Rose:
But it’s also seems to me clearly a product of age and generation, how willing you are to give up privacy and to allow transparency, clearly.

Marissa Mayer:
Sure, absolutely…

That’s a great attitude. I only wish Charlie Rose had fact-checked Google’s actual policy when it comes to transparency. Indeed, Google’s lack of transparency with advertisers, who are its bread and butter, recently cost them $761 and a bunch of bad press. While I’m sure Google can afford the judgment (less than 2.5 shares of GOOG stock at the time of this writing), I hope they see this experience as an opportunity to review their principles.

And, of course, don’t get me started on the lack of transparency in their approach to relevance! For those who haven’t been regular readers, here are two of my recent posts about Google:

Categories
Uncategorized

Can You Digg It?

According to an article in the LA Times, USocial CEO and founder Leon Hill is bragging that they are “gaming Digg” by letting advertisers buy votes. Sound familiar? When will people figure out that anonymous social voting schemes that don’t offer users control over the social lens are just begging to be gamed?

It’s as Ben Franklin said, “Experience is the best teacher, but a fool will learn from no other.” Emphasis mine.

Categories
Uncategorized

Jason Adams Explains TunkRank

Jason Adams, who recently won the TunkRank implementation challenge, explains on his blog how he implemented TunkRank.com. He implemented the algorithm in Ruby using Merb, MySQL, Capistrano, nginx, and ActiveRecord. For more details, check out his blog!

Note: he just added a follow-up post: The Road Ahead for TunkRank.

Categories
Uncategorized

Craig’s Dissertation on People Search

Craig Macdonald (now Dr. Craig Macdonald!) just announced that his thesis, The Voting Model for People Search, is available online.

Here is a teaser from the abstract:

The thesis investigates how persons in an enterprise organisation can be ranked in response to a query, so that those persons with relevant expertise to the query topic are ranked first. The expertise areas of the persons are represented by documentary evidence of expertise, known as candidate profiles. The statement of this research work is that the expert search task in an enterprise setting can be successfully and effectively modelled using a voting paradigm. In the so-called Voting Model, when a document is retrieved for a query, this document represents a vote for every expert associated with the document to have relevant expertise to the query topic. This voting paradigm is manifested by the proposition of various voting techniques that aggregate the votes from documents to candidate experts. Moreover, the research work demonstrates that these voting techniques can be modelled in terms of a Bayesian belief network, providing probabilistic semantics for the proposed voting paradigm.
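To make the voting idea in the abstract concrete, here is a minimal sketch in Python. It is not Craig’s implementation: the function name, the input shapes, and the choice of aggregation (count the votes, then break ties by the summed retrieval scores of the voting documents) are my own assumptions for illustration. The thesis proposes a variety of voting techniques, and this shows only one simple way such an aggregation could look.

```python
from collections import defaultdict


def rank_experts(ranked_docs, doc_to_candidates):
    """Rank candidate experts by the votes they receive from retrieved documents.

    ranked_docs: list of (doc_id, retrieval_score) pairs returned for a query.
    doc_to_candidates: dict mapping doc_id -> candidates associated with that
        document (a hypothetical stand-in for the candidate profiles).
    """
    votes = defaultdict(int)        # number of retrieved documents per candidate
    score_sum = defaultdict(float)  # summed retrieval scores per candidate

    for doc_id, score in ranked_docs:
        # Each retrieved document casts one vote for every associated candidate.
        for candidate in doc_to_candidates.get(doc_id, ()):
            votes[candidate] += 1
            score_sum[candidate] += score

    # Most votes first; ties broken by summed score, then by name for stability.
    return sorted(votes, key=lambda c: (-votes[c], -score_sum[c], c))


# Toy data, invented purely for this example.
retrieved = [("d1", 2.0), ("d2", 1.5), ("d3", 0.5)]
profiles = {"d1": ["alice"], "d2": ["alice", "bob"], "d3": ["bob", "carol"]}
print(rank_experts(retrieved, profiles))
```

In the toy data, alice and bob each collect two votes, but alice’s voting documents scored higher, so she ranks first; carol, with a single low-scoring vote, ranks last.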