Categories
Uncategorized

Kevin McDonald on Endeca on Freebase

This post from Kevin McDonald triggered all of my web alerts, so I thought I’d share it. It’s an interesting thought.

Categories
General

NRC Report: Data Mining won’t find the Terrorists

According to Declan McCullagh, a just-released U.S. National Research Council report entitled Protecting Individual Privacy in the Struggle Against Terrorists: A Framework for Assessment concludes that automated identification of terrorists through data mining or any other mechanism “is neither feasible as an objective nor desirable as a goal of technology development efforts.”

I haven’t had the time to read through the 352-page report. The committee that wrote the report includes Stanford professor William Perry, former MIT president Charles Vest, and Microsoft researcher Cynthia Dwork. Such a crew undoubtedly realizes that any data mining technique yields false positives. The big questions are whether the data mining techniques are more effective than the alternatives, and whether using them is consistent with law and policy.

Based on McCullagh’s summary, the report seems to mainly call for oversight and objective evaluation. Nothing controversial there. And, as he wryly notes, Americans may have watched too many episodes of 24 to have a realistic sense of what data mining can and can’t do.

Still, I think we’d be naive to give up entirely on machine learning approaches to fight crime and improve national security. As with all science, we need to subject hypotheses to rigorous, objective testing. But remember, low-tech approaches have false positives too. There is no moral superiority in being a Luddite.
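To make the false-positive point concrete, here is a minimal sketch of the base-rate arithmetic. Every number in it is a hypothetical I picked for illustration; none comes from the NRC report or from McCullagh’s article, and the function name `precision` is my own. It applies Bayes’ rule to ask what share of the people flagged by a screening method are actually the people being sought.

```python
def precision(base_rate, sensitivity, false_positive_rate):
    """Fraction of flagged people who are true positives, via Bayes' rule."""
    true_pos = base_rate * sensitivity
    false_pos = (1.0 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
population = 300_000_000       # people screened
targets = 1_000                # people we are actually looking for
sensitivity = 0.99             # chance a real target gets flagged
false_positive_rate = 0.001    # chance an innocent person gets flagged

p = precision(targets / population, sensitivity, false_positive_rate)
innocent_flagged = (population - targets) * false_positive_rate

print(f"Innocent people flagged: {innocent_flagged:,.0f}")
print(f"Share of flagged people who are real targets: {p:.2%}")
```

Under these made-up assumptions, well under one percent of the flagged people are real targets, even though the method looks very accurate on paper. The same function works for any screening method, low-tech or high-tech, which is the comparison that matters: plug in the rates for each alternative and see which does better.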

Categories
General

The Data Cloud?

As Paul Miller notes, “the Cloud” is increasingly prevalent in tech conversation these days. As if “cloud computing” weren’t a fuzzy enough term, now we have the “data cloud” which, if I understand Paul correctly, may just be a rebranding of the “semantic web” (itself a bit fuzzy for my tastes). It’s not clear to me from the article, though, to what extent the “data cloud” represents a commodified data repository vs. a common framework to link everyone’s data using open standards.

I suppose I’ve been in technology long enough that I shouldn’t be making fun of buzzwords, especially when the movement to the cloud represents a real and positive phenomenon. But the semantic web needs more than rebranding. A quick search turned up this post from last year that lists what Nova Spivack identified as barriers to the adoption of the semantic web:

  1. A lack of tools
  2. Scaling challenges (what if you want to store a trillion+ triples?)
  3. Vision issues (how can we define a practical vision, for the low-hanging fruit?)
  4. Inadequate Content (not enough semantic data available)
  5. No killer apps
  6. Market education

One year later, I’m not sure we’re that much farther along.

Categories
Uncategorized

Enabled Permalinks

On a friend’s advice, I enabled permalinks for posts here. The good news is that the links will be more SEO-friendly and attract oodles of traffic. The bad news is that all posts may appear in your reader as unread. Sorry.

Categories
General

Why I Don’t Worry about the Link Economy

I’ve been seeing an increasing number of mentions in the tech press about the link economy and how it is broken. A few representative quotes:

Jeff Jarvis on the imperatives of the link economy:

  1. All content must be transparent: open on the web with permanent links so it can receive links.
  2. The recipient of links is the party responsible for monetizing the audience they bring.
  3. Links are a key to efficiency.
  4. There are opportunities to add value atop the link layer.

Allen Stern: “it’s clear the link economy is broken.”

Tim O’Reilly on the perils of sites primarily linking to themselves: “The web is a great example of a system that works because most sites create more value than they capture.”

Charles Cooper on how to fix the broken link economy: “link etiquette is basic to the integrity of the ongoing conversation in the blogosphere.”

In fact, there is an entire blog devoted to “Google juice”, although its PageRank of 2 suggests to me that there may be more expert sources on the subject.

Search engine optimization (SEO) has been around for over a decade, playing a key role in the adversarial struggle for higher ranking on Google or its predecessors. But SEO now goes far beyond editing and organizing a site’s content. In a world of blogs, tweeting, and aggregation, the increasingly popular approach is linkbaiting, which means what it sounds like: doing whatever it takes to generate incoming links to a website or blog from other sites.

As a blogger, I understand the desire to attract traffic. Even though I don’t make money from this blog, I write in order to be read, and I’m not averse to spreading some bait to attract readers. I also link generously to other sites, mainly to provide value to my readers, but also to give credit where credit is due. And I’m fully aware that some of the folks I link to see those links as a favor worthy of reciprocity. I don’t complain.

But I can’t help laughing when I hear pronouncements about the link economy and how miserly sites are breaking it by excessive internal linking. Especially when there’s a real economy that is really broken!

Attention will always be a scarce, highly contested resource. Many people will use whatever means they have at their disposal to obtain and in many cases monetize it, ranging from the straightforward (e.g., publishing good content) to the blatantly unethical (e.g., browser hijacking) to the absurdly humorous (remember the subservient chicken?). Some people will try to create “sticky” sites that emphasize internal linking, while others will create sites that serve primarily as guideposts, sending people away to other destinations as quickly as possible.

Are authors responsible for cultivating a global link economy? Do we need social pressure or even regulation in order to ensure the optimal allocation of attention? In short, I don’t think so. While we need to combat strategies that clearly cross the line into the unethical (and in many cases criminal), I’d be wary of going beyond that. The financial economy may be in need of more effective regulation, but social media seem to be doing just fine.

Besides, the wonderful thing about attention is that there is no switching cost. Give democracy a chance!

Categories
Uncategorized

Information Sharing: A U.S. Government Mandate

Thanks to David Fauth for calling my attention to this news release from the U.S. Office of the Director of National Intelligence: New Policy Makes Information Sharing a Factor in Employees’ Performance Reviews.

It’s certainly a step in the right direction. But what will determine the success of this effort is the implementation of the Information Sharing Environment to connect federal, state, local, and tribal governments, the private sector, and foreign allies. The security challenges alone are daunting, but I’m just as concerned about the challenges of getting the right information to the right people. Anyone who thinks that search is 90% solved should think about problems like these!

Categories
Uncategorized

Ask, But Shall Thou Receive Traffic?

I read this morning that Ask had revamped their search engine. I tried it out, since I’ve appreciated how Ask has been more willing than its larger brethren to experiment with interfaces beyond conventional ranked lists. Unfortunately, I didn’t see anything to write home about. But perhaps I’m just not asking the right questions. I’d appreciate commentary from the readership.

Categories
Uncategorized

Blogging at the Greater IBM Connection

Among the various facets of my identity is that I spent a fair amount of my student days at IBM Research. While I’ve kept in touch with IBM colleagues over the years, I’ve recently reconnected with IBM as an institution through their Greater IBM Connection initiative. I’ll be occasionally blogging there, starting with this post entitled “Friending vs. Making Friends”.

Categories
General

Why Do I Blog?

Steve Hodson wrote a fun post today entitled “So You Want To Be A Rich And Famous Blogger Eh” in which he tries to classify bloggers who write in order to be read beyond their immediate family and friends. I often forget that most people who blog aim to make money from it–an aim in which I suspect few people succeed. Most writers didn’t make much money (if any!) before there were blogs, and blogs didn’t change the basic rules of attention economics.

If I read Hodson correctly, I’m a Louis Gray kind of blogger: my only “economic” gain from blogging is reputation capital. But my real motivation is that it’s fun. The blogosphere is the Usenet of my school years, all grown up. It also provides a way to share ideas in a far more immediate and permissive forum than peer-reviewed publications. Who wouldn’t want to be a blogger?

What blogging has done is dramatically lower the cost of publication and raise the efficiency of reader feedback, the latter including readership statistics and comments. Sure, readers don’t pay money to subscribe to blogs like they (used to) pay for print media, but the scarce quantity has always been attention rather than money. As far back as I can remember, writers write to be read and generally count themselves lucky if they can translate readership into income. Personally, I am fortunate to have a great day job!

So readers, have no fear, you’ll never see ads here. The Noisy Channel is a labor of love.

Categories
Uncategorized

Spam: The Price of Success

An unfortunate indicator of this blog’s success is that I’m seeing an increased volume of spam comments. I’ve installed the Akismet plug-in; hopefully that blocks spam without too many false positives. Please let me know if you experience any problems.