Categories
General

The Data Cloud?

As Paul Miller notes, “the Cloud” is increasingly prevalent in tech conversation these days. As if “cloud computing” weren’t a fuzzy enough term, now we have the “data cloud” which, if I understand Paul correctly, may just be a rebranding of the “semantic web” (itself a bit fuzzy for my tastes). It’s not clear to me from the article, though, to what extent the “data cloud” represents a commodified data repository vs. a common framework to link everyone’s data using open standards.

I suppose I’ve been in technology long enough that I shouldn’t be making fun of buzzwords, especially when the movement to the cloud represents a real and positive phenomenon. But the semantic web needs more than rebranding. A quick search turned up this post from last year that lists what Nova Spivack identified as barriers to the adoption of the semantic web:

  1. A lack of tools
  2. Scaling challenges (what if you want to store a trillion+ triples?)
  3. Vision issues (how can we define a practical vision, for the low-hanging fruit?)
  4. Inadequate Content (not enough semantic data available)
  5. No killer apps
  6. Market education

One year later, I’m not sure we’re that much farther along.

Categories
General

Why I Don’t Worry about the Link Economy

I’ve been seeing an increasing number of mentions in the tech press about the link economy and how it is broken. A few representative quotes:

Jeff Jarvis on the imperatives of the link economy:

  1. All content must be transparent: open on the web with permanent links so it can receive links.
  2. The recipient of links is the party responsible for monetizing the audience they bring.
  3. Links are a key to efficiency.
  4. There are opportunities to add value atop the link layer.

Allen Stern: “it’s clear the link economy is broken.”

Tim O’Reilly on the perils of sites primarily linking to themselves: “The web is a great example of a system that works because most sites create more value than they capture.”

Charles Cooper on how to fix the broken link economy: “link etiquette is basic to the integrity of the ongoing conversation in the blogosphere.”

In fact, there is an entire blog devoted to “Google juice”, although its PageRank of 2 suggests to me that there may be more expert sources on the subject.

Search engine optimization (SEO) has been around for over a decade, playing a key role in the adversarial struggle for higher ranking on Google or its predecessors. But SEO now goes far beyond editing and organizing a site’s content. In a world of blogs, tweeting, and aggregation, the increasingly popular approach is linkbaiting, which means what it sounds like: doing whatever it takes to generate incoming links to a website or blog from other sites.

As a blogger, I understand the desire to attract traffic. Even though I don’t make money from this blog, I write in order to be read, and I’m not averse to spreading some bait to attract readers. I also link generously to other sites, mainly to provide value to my readers, but also to give credit where credit is due. And I’m fully aware that some of the folks I link to see those links as a favor worthy of reciprocity. I don’t complain.

But I can’t help laughing when I hear pronouncements about the link economy and how miserly sites are breaking it by excessive internal linking. Especially when there’s a real economy that is really broken!

Attention will always be a scarce, highly contested resource. Many people will use whatever means they have at their disposal to obtain and in many cases monetize it, ranging from the straightforward (e.g., publishing good content) to the blatantly unethical (e.g., browser hijacking) to the absurdly humorous (remember the subservient chicken?). Some people will try to create “sticky” sites that emphasize internal linking, while others will create sites that serve primarily as guideposts, sending people away to other destinations as quickly as possible.

Are authors responsible for cultivating a global link economy? Do we need social pressure or even regulation in order to ensure the optimal allocation of attention? In short, I don’t think so. While we need to combat strategies that clearly cross the line into the unethical (and in many cases criminal), I’d be wary of going beyond that. The financial economy may be in need of more effective regulation, but social media seem to be doing just fine.

Besides, the wonderful thing about attention is that there is no switching cost. Give democracy a chance!

Categories
General

Why Do I Blog?

Steve Hodson wrote a fun post today entitled “So You Want To Be A Rich And Famous Blogger Eh” in which he tries to classify bloggers who write in order to be read beyond their immediate family and friends. I often forget that most people who blog aim to make money from it–an aim in which I suspect few people succeed. Most writers didn’t make much money (if any!) before there were blogs, and blogs didn’t change the basic rules of attention economics.

If I read Hodson correctly, I’m a Louis Gray kind of blogger: my only “economic” gain from blogging is reputation capital. But my real motivation is that it’s fun. The blogosphere is the Usenet of my school years, all grown up. It also provides a way to share ideas in a far more immediate and permissive forum than peer-reviewed publications. Who wouldn’t want to be a blogger?

What blogging has done is dramatically lower the cost of publication and raise the efficiency of reader feedback–the latter including readership statistics and comments. Sure, readers don’t pay money to subscribe to blogs like they (used to) pay for print media, but the scarce quantity has always been attention rather than money. As far back as I can remember, writers write to be read and generally count themselves lucky if they can translate readership into income. Personally, I am fortunate to have a great day job!

So readers, have no fear, you’ll never see ads here. The Noisy Channel is a labor of love.

Categories
General

Information Seeking for the Political Process

Yesterday, I participated in a discussion about technology issues facing the next United States administration. The New York CTO Club, which is non-partisan, invited both presidential campaigns to participate, but unfortunately we only had representation from one of them. Still, it was an earnest, informed discussion that excited me despite my deep skepticism about the political process.

One of the issues we discussed was the challenge of communication to inform policy, whether from government to the population at large or vice versa. In particular, we discussed the problem of distilling countless emails to ensure that politicians, whose time is extremely scarce, are aware of the best ideas coming from citizen activists.

The conversation could have been about search and relevance ranking. Concerns ranged from managing near-duplicate documents (people often copy and paste letters from organizations) to anonymous authorship and reputation systems. Indeed, the issue of communicating about policy amounts to a collection of information seeking problems for politicians, non-political staffers, activists, and the general population.
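As a rough illustration of the near-duplicate problem mentioned above, here is a minimal sketch that flags copy-and-paste letters by comparing overlapping word sequences ("shingles") with Jaccard similarity. This is one textbook approach rather than anything proposed at the discussion; the function names, the shingle size `k`, and the `threshold` value are all assumptions chosen for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return index pairs (i, j), i < j, whose shingle sets overlap at least `threshold`.

    The threshold is a hypothetical default, not a recommended value.
    This compares every pair, so it only suits small collections.
    """
    sets = [shingles(d, k) for d in docs]
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

A form letter with a sign-off appended would share almost all of its shingles with the original and be grouped with it, while an independently written letter would not. At the scale of a real constituent mailbox, the all-pairs comparison would need to be replaced by something like hashing-based candidate selection.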

I was happy to see some other folks agreeing that any relevance measure would be suspect, given the adversarial nature of the political process. What may be good enough for casual web search is surely inadequate when policy decisions are at stake. I see the implication as a need for transparent information seeking support systems that offer users control and guidance. Moreover, what is good for policy-driven information seeking seems broadly applicable to information seeking in general.

To be clear, any improvements to our current process of communicating between government and citizenry would be welcome. But we should not cut corners in our aspirations.

Categories
General

Google Blog Search: Not Different Enough

A couple of weeks ago, I was commenting about a recent position paper by Marti Hearst, Matt Hurst, and Sue Dumais and asking whether blog search was fundamentally different from other information seeking tasks on the web.

Well, I read today on ReadWriteWeb that Google launched a new home page for blog search. Of course, I tried it immediately. Indeed, the home page was enticing, a portal style reminiscent of their news home page. But then I quickly realized that all they’d really done was copy their news design and apply it to blogs. Once you search, you’re back to what is essentially a ranked list of results.

It’s a clean, well-engineered implementation, but I was hoping for something different. I know that Google isn’t big on innovative search interfaces, but I had somehow imagined that they’d recognize that blog search really calls for innovation. So do news search and web search, but blogs, as commenters here have pointed out, make the clearest case.

Oh well, an opportunity wasted for Google, an opportunity preserved for someone else. Faceted, exploratory blog search, anyone?

Categories
General

This Conversation Will Be Recorded

Much as traditional journalism has given way to a world where everyone can be a publisher, traditional journalistic notions of “off the record” conversations have given way to a norm of unbridled exposure. As Mark Evans writes in “Is Anything Off the Record?”, “everything you say/write is public, even casual conversations over a coffee, is on the record.”

We have finally realized a perfect storm where anything can be published and everything can be found. Privacy through difficulty has given way to unintentional broadcasting.

For those of you who think this vision is melodramatic, let me share the following examples from recent personal experience:

  • Someone from a company that competes with my employer wrote me an email that, published verbatim, might have embarrassed his employer. How was he to know I would not publish it?
  • A professional group in which I participate had a heated discussion about whether it was appropriate to blog about topics discussed at our meetings and on our mailing list. We ultimately concluded that all of our discussions should be considered off the record.
  • Someone attending an Endeca sales presentation made a less than favorable comment about it on Twitter. It showed up on my RSS feed, and I reached out to him, only to find that he had liked the presentation on the whole and had simply been posting in a bad-tempered moment.
  • I removed my “relationship status” from my Facebook profile in order to protect what little was left of my privacy–only to be inundated within minutes by concerned colleagues who received a message that my status was no longer “in a relationship.” Needless to say, their concern was unfounded. Clay Shirky tells a more dramatic story along the same lines in his recent Web 2.0 keynote on filter failure.

What does this all mean? I think we need to get used to the more efficient flow of information. The propagation is neither total nor instantaneous, but it is still sufficient to overturn many assumptions that held only a few years ago. It will be fascinating to see how our social norms adapt to this new reality.

Categories
General

PageRank: Get Over It

John Battelle posted a plea to Google today to increase the granularity of PageRank, which he calls “the unofficial, and official, and semi-official, arbiter of value on the web.” Alternatively, he proposes that Google “just go dark and don’t tell us anything.”

I have another suggestion: stop worrying about PageRank. It’s not even clear how much Google cares about it anymore. Recently this blog’s PageRank has been even more volatile than the stock market, going up from 0 to 5 and back down to 3 over a period of a few weeks. I haven’t seen any correlation between PageRank and traffic, and I find that this blog often appears in the top search results for appropriate queries. It seems to me that PageRank only matters to the ego of the page’s author–and surely we can find other ways to inflate our egos.

I know that competition is human nature, and I’ll confess to grade grubbing in my early school years. But reducing all pages to a static authority score is literally one-dimensional. Even Google seems to have de-emphasized static authority in favor of query-dependent relevance.

A decade ago, PageRank was a revolutionary measure, a secret weapon against spammers who were gaming the traditional information retrieval measures that most search engines used to rank results. But the past ten years have reminded us that relevance has many facets. PageRank is still valuable as a static indicator. But, if anything, we should ask for it to be coarser-grained rather than finer-grained, since static authority is more useful as a spam filter than as a total ordering on the billions of web pages.
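To make the "coarse spam filter" idea concrete, here is a minimal sketch of the textbook power-iteration form of PageRank, followed by a deliberately coarse cut that only separates low-authority pages from everything else. It is not Google's production system; the damping factor of 0.85 is the commonly cited value, and the `low_authority` helper and its `fraction` cutoff are hypothetical names and values introduced purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score, with scores summing to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[node] / n
                for t in nodes:
                    new[t] += share
        rank = new
    return rank


def low_authority(rank, fraction=0.5):
    """Return pages scoring below `fraction` of the uniform share 1/n.

    The cutoff is an assumed, illustrative value: the point is a
    two-bucket filter, not a fine-grained ordering of pages.
    """
    cutoff = fraction / len(rank)
    return {node for node, score in rank.items() if score < cutoff}
```

In a toy graph where two pages link to each other and a third page links in but receives no links, only the third page falls below the cutoff. The two well-linked pages end up with nearly identical scores, which is the sense in which a fine-grained ordering between them tells you little.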

In any case, it’s unseemly to grub for grades. Let’s show some dignity.

Categories
General

Not Marching to Google’s Vision

I try to avoid partisan posts about my employer on this blog, but a recent blog post I read was so out of line that I feel the need to respond here personally.

In a post entitled “The Future of Search is Simpler”, the Enterprise Search blog states that

Google provided a clear vision: you can be up and operational in one day and search everything….While other companies are marching to that vision. Dieselpoint’s OpenPipeline, Endeca’s simple administrative controls, Fast’s navigators, Autonomy’s categorization, Google is providing the vision.

I’ll let other companies speak for themselves, but I can state with certainty that Endeca is not marching to Google’s vision. As I’ve discussed here repeatedly, enterprise search is not a problem that can be solved by just plugging a box into your intranet.

Google is entitled to its vision of enterprise search, and I wish them the best of luck in their efforts. But please don’t accuse Endeca of following that vision.

Categories
General

Periodic Table of Visualization Methods

An oldie but goodie that one of my colleagues at Endeca just reminded me of is the Periodic Table of Visualization Methods by Ralph Lengler and Martin Eppler at Visual-Literacy.org.

Click through the picture to the real page, which shows you an example of each visualization when you hover over it in the table.

Interestingly, tag clouds don’t show up in the table. Is this because the authors hate them, or because tag clouds aren’t sufficiently graphical to qualify as a visualization? I know many people who despise tag clouds–and I used to be one of them. But I think they have their place, and I’ll return to the subject some time soon.

Categories
General

Vint Cerf on the next Internet

Vint Cerf published an Official Google Blog post about the next Internet in which he predicts “that mobile devices will become a major component of the Internet,” that “video will become an interactive medium in which the choice of content and advertising will be under consumer control,” and that “a box of washing machine soap will become part of a service as Internet-enabled washing machines are managed by Web-based services that can configure and activate your washing machine.” And, he concludes, “Google will be there, helping to make sense of it all, helping to organize and make everything accessible and useful.”

I hesitate to accuse someone as accomplished as Vint Cerf of lacking imagination, but I found his post uninspiring. As he should know better than most, mobile devices are already a major component of the Internet. Greater control of video would be nice, but how about greater control of the world’s information that Google aspires to organize?

Perhaps posting on the Official Google Blog has a dampening effect. I’d like to believe that readers here at The Noisy Channel have wild imaginations and are uninhibited about exercising their creativity. Please use this space to do so!

After all, I do agree with Cerf about Alan Kay’s observation that the best way to predict the future is to invent it. Or, to quote him in full:

“Don’t worry about what anybody else is going to do… The best way to predict the future is to invent it. Really smart people with reasonable funding can do just about anything that doesn’t violate too many of Newton’s Laws!”