Categories
General

Danny Sullivan vs. Newspapers

Danny Sullivan has a delightful rant on Daggle, his personal blog, entitled “Google’s Love For Newspapers & How Little They Appreciate It“.  It’s a fun read–though like all rants it could use an editor–and there’s even a fair amount I agree with.

Still, as I noted in a comment, he does steal a few bases. Specifically, he seems to see Google’s engagement with the newspapers as a big favor Google is bestowing on them, when it’s quite clear that Google benefits financially from aggregating brand-name content.

My take: Google and the newspaper industry are in a dysfunctional, co-dependent relationship. The newspaper industry is crying out that the relationship is abusive, but is afraid of breaking up because it no longer knows if it can survive on its own without Google bringing home the traffic.

I agree with Sullivan that the newspapers need to quit whining and take responsibility for their fate. But it would be nice if the blogosphere didn’t mock them en masse when they’re finally showing signs of trying to do so.

Categories
Uncategorized

What Exactly is the Associated Press Announcing?

The blogosphere is in a tizzy over a press release from the Associated Press that begins as follows:

The Associated Press Board of Directors today announced it would launch an industry initiative to protect news content from misappropriation online.

AP Chairman Dean Singleton said the news cooperative would work with portals and other partners who properly license content – and would pursue legal and legislative actions against those who don’t.

“We can no longer stand by and watch others walk off with our work under misguided legal theories,” Singleton said at the AP annual meeting, in San Diego.

As part of the initiative, AP will develop a system to track content distributed online to determine if it is being legally used.

The rest of the press release is about rate reductions and new “Limited” service–none of which are attracting much attention. Rather, everyone from the New York Times to Gawker is treating this press release like a declaration of war.

While the AP’s tone is angry, it seems premature to comment on the substance of their tactics until we know more. Saying they’ll use legal means to fight illegal activity is not only vague, but hardly objectionable in principle. Why don’t we wait to find out what they’re actually planning to do before going medieval on them?

Of course, if I get sued for copying over 100 words of their press release without licensing it, then I suppose I might change my tune.

Categories
General

Wolfram Talks About Wolfram Alpha

I felt pretty lucky to get an early preview of Wolfram Alpha, but real insiders like Rudy Rucker get personal demos from Wolfram himself. Rucker not only published a post about his conversation but also posted a “slightly condensed” hour-long podcast of the interview.

I listened to the whole hour. My biggest surprise is at how much Wolfram emphasizes the natural language interface, which I’d thought from my own preview was less of a focus–and even the system’s weakest link. In fact, when  Rucker asks if there will be a manual offering advanced users a list of expressions akin to those in Mathematica, Wolfram says no, that users are lazy, ontologies never line up, and that the system will figure out what the user means. Of course, I now question my assumption that Wolfram Alpha’s real goal or potential is to act as a service to use in other applications.

At the same time, Wolfram says that he envisions “a new field of knowledge-based computing. Imagine a spreadsheet that can pull in knowledge about the entries.” I heard that and then expected him to explain a vision around an API, but he continued to explain that the interface from a spreadsheet would use natural language. I don’t know whether Wolfram has actually thought this through, or if he can appreciate the perspective of an application developer depending on consistency from a software service. Maybe I underestimate him. Needless to say, I’m a lot closer to my initial skepticism again.

What is interesting from a non-technical perspective is that Wolfram sees Wolfram Alpha as doing for NKS what word processors and databases did for Turing’s theory of computation, i.e., proving the value of his grand opus through its consumerization. I give him credit for putting his credibility on the line this way, but I think he’s taking a big risk here.

The podcast is a bit rambling, but it’s primary source material–and I found it enlightening enough to devote an hour to it (while writing this blog post).  I suggest skimming Rucker’s write-up first, and then deciding whether you have the patience to listen to the whole podcast.

Categories
General

Guy Kawasaki, I’ll Say It

I just saw this post from a week ago by Andrew Goodman on Traffick asking “Is Guy Kawasaki Singlehandedly Ruining Twitter?“. Some context: Guy Kawasaki gave a keynote at the New York Search Engine Strategies conference last week in which he discussed the tactics he uses to “use Twitter as a twool“.

Of course, what galls me, at least if Goodman is reporting his speech accurately, is this:

he castigates people who don’t follow everyone back because they’re arrogant. By not “reciprocating,” non-followers are showing they “don’t care about their followers.”

Well, Kawasaki follows over 100,000 users, so he practices what he preaches. But, as Goodman points out:

The thing about Kawasaki’s follow-back habit is: it’s fake reciprocity. He isn’t actually following. Following everyone back is like the old idea of exchanging links with everyone and anyone, in the hopes of gaming Google. You don’t actually have any hope of really following 100,000 people, so instead, you hide behind TweetDeck and other apps. As Kawasaki points out, he does read all @replies and Direct Messages. But don’t believe that the “purpose of following everyone back is so people can direct message me.” The purpose is to get people used to the idea that a follow should be reciprocated with a follow. That way, folks who go out and follow 200,000 people have a greater chance of being followed by, say, 160,000.

Can you say “attention Ponzi scheme”? I sure can. I may have criticized A-list blogger Loic Le Meur in the past for suggesting that follower count implies authority, but at least he doesn’t play this fake reciprocity game–the 500 people he follows may be a bit more than Dunbar recommends, but are at least within the bounds of plausibility.

According to Goodman, Kawasaki kept trying to ingratiate himself by saying “well someone out there is going to say I’m a dick for saying this, but…”. Well, Guy, I’ll be the blowhard and say it, you’re being a dick. Every Ponzi scheme has its winners, and you’ve clearly cashed in on this one. I don’t begrudge you the attention you’ve accumulated. But please have the decency not to give advice that, as Goodman puts it, would turn Twitter into a “digital trailer park”.

Categories
General

Google Already Knows What You’re Thinking

An unsubstantiated assertion I’ve seen repeatedly over the last months is that Google needs to acquire Twitter because Twitter knows what is happening (or what we’re thinking about) now, while Google can only look backwards. The latest version I’ve seen of this argument is from Jeff Jarvis’s post today, entitled “Why Google should want Twitter: Currency“:

Google isn’t good at currency. It needs content to ferment; it needs links and clicks to collect so PageRank can determine its value.

I grant that PageRank isn’t good at currency. But Google doesn’t need to perform link analysis to know what people are thinking about in real time. Google can simply look at its logs to determine what people are searching for–and, in particular, which search terms and phrases are appearing with statistically significant frequency. And Google’s search volume is much higher (and more representative of the online population) than Twitter’s update and search traffic combined.
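To make the log-analysis idea concrete, here is a minimal sketch of how one might flag search terms that are appearing with unusual frequency. It is not a description of anything Google actually does: the function name, the input format (two lists of query strings, one from a baseline window and one from the current window), the add-one smoothing, and the z-score threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def trending_terms(baseline_queries, current_queries, z_threshold=3.0):
    """Return (term, z) pairs whose share of current queries is unusually high.

    Hypothetical sketch: each term's count in the current window is compared
    with the count expected from its (smoothed) share of the baseline window,
    using a one-sided z-score under a binomial model.
    """
    base_counts = Counter(baseline_queries)
    curr_counts = Counter(current_queries)
    n_base = len(baseline_queries)
    n_curr = len(current_queries)
    if n_base == 0 or n_curr == 0:
        return []

    vocab_size = len(set(base_counts) | set(curr_counts))
    results = []
    for term, count in curr_counts.items():
        # Add-one smoothing so terms unseen in the baseline get a small rate.
        p_base = (base_counts[term] + 1) / (n_base + vocab_size)
        expected = n_curr * p_base
        std_dev = math.sqrt(n_curr * p_base * (1 - p_base))
        z = (count - expected) / std_dev
        if z >= z_threshold:
            results.append((term, z))
    # Most surprising terms first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Feeding it yesterday's queries as the baseline and the last hour's queries as the current window would surface terms that have suddenly spiked, which is all "knowing what people are thinking about now" requires.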

To be clear, you and I can’t perform that analysis using the tools Google makes available to the general public. But Google can–and I don’t see any reason, other than the fear of raising public concerns about privacy, that Google can’t exploit this data themselves.

What is different about Twitter is that it *does* make the data available to the general public. Twitter exposes Trends as part of its own offering, but it also enables services like Tweetmeme to perform their own analyses to track the hot stories in near-real time. But Google could do something similar and probably better if it wanted to.

I’ve said this before: Twitter is a community (a social network if you prefer), not a search engine. And, if there’s a good reason for Google to entertain acquiring Twitter, it’s probably that Google has a less than stellar track record when it comes to community. But let’s not delude ourselves into thinking that Google needs Twitter to know what’s on our minds now. They already know.

Categories
General

I’m No Google Fan Boy, But…

I may not be a Google fan boy (start with this post if you’re new here), but the recent column in the Guardian (which, by the way, is one of my favorite Endeca customers) entitled “Google is just an amoral menace” is over the top. In fairness to the Guardian, the column is an opinion piece written by Henry Porter of The Observer, and is hardly representative of the fare I expect from the United Kingdom’s leading liberal voice.

What does Porter tell us? Before he vilifies Google, he goes after Scribd, a popular document sharing website. I’m partial to SlideShare myself, but Scribd is significantly more popular. Porter excoriates Scribd for not doing enough to combat unauthorized reproduction of copyrighted materials:

The point is that even if Scribd removes books, it still allows individuals to advertise services for delivering pirated books by email, which must make it the enemy of every writer and publisher in the world. In effect it has turned copyright law on its head: instead of asking publishers for permission, it requires them to object if and when they become aware of a breach.

I understand how publishers resent file-sharing sites that facilitate digital piracy. Clearly Porter doesn’t feel that laws like the World Intellectual Property Organization Copyright Treaty (implemented in the United States as the DMCA) go far enough–he objects to the “safe harbor” provisions that indemnify an ISP, as long as the ISP responds promptly to infringement allegations. He would like ISPs to be responsible for not publishing unauthorized reproductions, and not just for removing them when publishers complain. He’d probably get along well with the Italian prosecutors who want to throw some Google executives in jail because of a YouTube video.

Indeed, I generally prefer opt-in provisions to opt-out–for example, I’m among the skeptics of Google’s book search settlement. And no, I’m not a Microsoft fan boy either!

But there’s a difference between being a publisher and being an ISP. The safe harbor provision for ISPs is there because ISPs are supposed to be common carriers that provide service to the general public without discrimination. Telephone companies are not liable for slander; snail mail and email providers are not liable for illegal activity conducted through the offline or online post; etc.

Yes, there is contributory copyright infringement. But contributory infringement means that the service provider actually knew or should have known of the infringing activity. It is a doctrine of reactive, not proactive, enforcement.

The point of the safe harbor provision is to ensure that there will be common carriers. Remove it, and there would be a chilling effect on ISPs. You might as well shut down the internet. The only middle ground I can espouse is to eliminate anonymity–but that would have a chilling effect where it matters most, on dissidents in repressive regimes. I do think we overuse and abuse online anonymity, but it has its place.

Back to Porter and Google. He tells us:

Google presents a far greater threat to the livelihood of individuals and the future of commercial institutions important to the community. One case emerged last week when a letter from Billy Bragg, Robin Gibb and other songwriters was published in the Times explaining that Google was playing very rough with those who appeared on its subsidiary, YouTube. When the Performing Rights Society demanded more money for music videos streamed from the website, Google reacted by refusing to pay the requested 0.22p per play and took down the videos of the artists concerned.

Huh? Google walked away from commercial terms it found unfavorable, and that makes Google a bully? I actually grant that Google exercises monopolistic power in some arenas, such as the book search settlement, or in its negotiation with advertisers, but in this case the performers are just whining that Google won’t buy at the price they demand. Unless I’m missing some critical part of the story, it’s the artists who should be mocked for their sense of entitlement. I know that Billy Bragg is a left-wing activist, and perhaps he sees Google as some sort of fascist overlord. But Google surely does not have a monopoly on the distribution of music or music videos, and it’s absurd for artists to feel entitled that Google distribute their wares and pay them a price that the market is unlikely to bear. Unless the idea is to fix the price of music for the general public–in which case, who is being the fascist?

Porter does make some points that I agree with. His characterization that Google is “a parasite that creates nothing, merely offering little aggregation, lists and the ordering of information generated by people who have invested their capital, skill and time” is a caricature, but not entirely off base. What he’s missing, of course, is that this “creating nothing” is a significant technical feat. But I agree that Google’s relationship to content creators is often parasitic.

And his point about the newspaper industry is spot-on:

One of the chief casualties of the web revolution is the newspaper business, which now finds itself laden with debt (not Google’s fault) and having to give its content free to the search engine in order to survive. Newspapers can of course remove their content but then their own advertising revenues and profiles decline. In effect they are being held captive and tormented by their executioner, who has the gall to insist that the relationship is mutually beneficial. Were newspapers to combine to take on Google they would be almost certainly in breach of competition law.

Of course, I blame the newspapers a bit more for getting themselves into this mess–they didn’t have to give their content away for free. But now that they have, they’re trapped in a Catch-22: sustaining the relationship devalues their content, while ending the relationship only works if the industry acts in concert.

In summary, Google has its faults, and it’s important to hold those faults up to the light. But Google is not an “amoral menace”, and attacks like these only reinforce the perception that Google critics are intransigent Luddites. Criticism is most effective when it is informed and even-handed.

Note: Ian Betteridge offers more measured (and briefer) analysis at Technovia in “Some quick thoughts about Google versus the newspapers“.

Categories
General

API for TunkRank Scores

I hope that most readers here have had a chance to try out TunkRank. TunkRank is an application Jason Adams built, in response to a challenge to implement a measure that takes a PageRank-like approach to measuring influence on Twitter.

To my delight:

  • TunkRank has become an influential user on Twitter, with 47 followers, a Twitter Grader score in the 80th percentile, and a TunkRank score in the 83rd percentile.
  • The TunkRank page has a Google PageRank of 4–impressive for such a new site! For perspective, this blog has a PageRank of 5.
  • TunkRank has become more than a stand-alone site. It now offers an API so that people can use TunkRank scores in their own applications. Note that the raw TunkRank score (which is what the API gives you) is meaningful without the percentiles, since it models the expected number of users who will view a tweet by that user.

I’ve observed anecdotally that, when two users have similar numbers of followers, TunkRank favors the user who follows fewer users. That is particularly interesting, since the TunkRank measure only looks at the users who follow you, not the users whom you follow.
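To see why a measure that only looks at your followers can still reward restraint, it helps to sketch the computation. This post doesn't spell out the formula, so the code below assumes the commonly cited TunkRank recurrence: each follower Y contributes (1 + p × Influence(Y)) / |Following(Y)| to your score, where p is an assumed constant probability that a follower passes your tweet along. The function name, the value of p, and the fixed iteration count are illustrative choices, not Jason's implementation or the API's behavior.

```python
def tunkrank(following, p=0.05, iterations=50):
    """Approximate TunkRank-style scores by fixed-point iteration.

    following: dict mapping each user to the set of users they follow.
    p: assumed probability that a follower retweets (hypothetical value).
    """
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    # Invert the graph: who follows each user?
    followers = {user: set() for user in users}
    for user, followed in following.items():
        for target in followed:
            followers[target].add(user)

    scores = {user: 0.0 for user in users}
    for _ in range(iterations):
        # A follower's attention is split evenly across everyone they follow.
        scores = {
            user: sum(
                (1 + p * scores[f]) / len(following[f])
                for f in followers[user]
            )
            for user in users
        }
    return scores
```

Under this recurrence, a follower who follows ten accounts is worth a tenth as much as one who follows only you. If users who follow few people tend to attract followers who are similarly selective, that would produce the pattern described above.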

This hypothesis is consistent with my claim that users who follow a lot of other users generally participate in a culture of reciprocity (or, to put it less gently, an attention Ponzi scheme) that leads to their obtaining followers who themselves follow a lot of other users. A user’s follower-to-following ratio signals how likely that user is to reciprocate if you follow him or her.

I suspect that the expectation of reciprocation correlates negatively with a user’s TunkRank (and, in my view, influence), and that the best test for this hypothesis is to see if, holding follower count constant, the follower-to-following ratio correlates positively with TunkRank.
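One rough way to run that test, sketched under assumptions: approximate "holding follower count constant" by grouping users into buckets of similar follower count, then compute the Pearson correlation between follower-to-following ratio and TunkRank within each bucket. The input format, the power-of-two bucketing, and the minimum bucket size are hypothetical choices for illustration.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation, or None if either variable is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def ratio_correlation_by_bucket(users, min_bucket_size=3):
    """Correlate follower-to-following ratio with score per follower bucket.

    users: iterable of (followers, following, tunkrank_score) tuples.
    Users are bucketed by floor(log2(followers)), so each bucket spans at
    most a factor of two in follower count.
    """
    buckets = defaultdict(list)
    for followers, following, score in users:
        if followers <= 0 or following <= 0:
            continue  # the ratio is undefined or uninformative here
        bucket = int(math.log2(followers))
        buckets[bucket].append((followers / following, score))

    correlations = {}
    for bucket, rows in buckets.items():
        if len(rows) < min_bucket_size:
            continue
        r = pearson([ratio for ratio, _ in rows], [s for _, s in rows])
        if r is not None:
            correlations[bucket] = r
    return correlations
```

Consistently positive correlations across buckets would support the hypothesis; a mix of signs would suggest that follower count, not reciprocity, is doing most of the work.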

In any case, I’m excited about the progress, and again congratulate Jason for making this a reality.

Categories
Uncategorized

Google Preferred Sites

I just read over at Micro Persuasion that Google’s experimental “preferred sites” feature is now available to all users via Google Labs. The feature allows users to specify web sites that receive preferential treatment in their personal search results.

I like the feature in theory–it moves some of the control from Google’s black-box relevance ranking algorithm to the user, which is where I think it belongs. I would like to know exactly what “preferred” means–I still am wary of opaque personalization algorithms.

In practice? Well, I can’t tell you, because I couldn’t get the “My preferred site” links to show up, either on Firefox or Internet Explorer. Would be curious to hear from others who have.

Here’s a screen shot from Google:

It debuted in January but at the time was not available to all users. Now anyone can sign up for the feature via Google Labs.

Categories
General

Usability Begins At Home

As some of you may know, I have a multi-faceted identity: I have an awesome day job; I’m a prolific blogger (really?); I’m writing a book; and I have a wonderful family. Of course, it’s easy to forget that all of these facets co-exist, and that not everything stays put within its original context.

Well, the other night, I was talking to my wife (who works in the softwear industry) about the chapter in my faceted search book that addresses user interface design challenges. Specifically, one of the concerns is how to most effectively implement a search box on a site that uses faceted search. There are, of course, a variety of decisions about default behavior and options for configuring alternative behavior. But I did make a strong admonition: including more than one search box on the site will confuse users.

The following day, she sent me an email:

Hey smarty pants.

Am I mistaken, or do I see TWO search boxes on The Noisy Channel?

Ouch. After I’d gotten over the shock that my wife not only pays attention to my rambling but also reads my blog, it took me a moment to realize what she meant–the PostRank widget has its own search box. And it actually has different search behavior than the search box that is built in to WordPress. For example, a search for don’t know using the WordPress-supplied search box returns over six pages of results, while entering these words into the widget (at least prior to this post) leads to “No posts found.”  Moreover, while the result ranking from the WordPress search is strictly by recency, the result ranking on the widget is based on engagement metrics (see the recent discussion on Twitter).

Since I can’t afford to run a formal usability test, I offer the question to you, my readers: are the two search boxes confusing? Have you ever even noticed them before? And now that you’re aware of them, would you recommend I make any changes to the site? Please bear in mind that I probably cannot customize either widget–I’m just a lazy blogger and use the best parts I can get off the shelf.

Also, my wife has graciously offered a discount on her employer’s wares (wears?) to Noisy Channel readers. Go to the Carole Hochman site and use noisy as a promo code to get 40% off of any non-sale items. The discount expires on May 10th. For the calendar-impaired, that’s Mother’s Day. 🙂

Categories
Uncategorized

Data Mining Case Studies Workshop and Practice Prize

I was recently alerted to the Data Mining Case Studies Workshop and Practice Prize:

The Data Mining Case Studies Workshop and Practice Prize was established to showcase the very best in data mining case deployments. Data Mining Case Studies continues into its third year, to be held at KDD2009. Data Mining Case Studies will highlight data mining implementations that have been responsible for a significant and measurable improvement in business operations, or an equally important scientific discovery, or some other benefit to humanity.

The site states a final submission deadline of April 8th, but one of the organizers told me that they are willing to offer extensions. Further details and contact information are available at http://www.dataminingcasestudies.com/.