Thanks to Greg Linden for alerting me to a talk by Google Fellow Jeff Dean on Research Challenges Inspired by Large-Scale Computing at Google. Click through the link to stream or download the talk in your favorite media format.
Month: October 2008
Today’s news: newspapers’ web revenue is stalling.
No wonder: Google is mixing up search, advertising, and publication, while newspapers are responding to this competitive pressure by sliding down the slippery slope into becoming aggregators.
This is a tough game, and I’m not sure how it plays out. I understand how media players fear Google commodifying their content, but I don’t think the best strategy for them is to accelerate this process.
On one hand, the increasing dependence on Google for traffic degrades brand loyalty, since it leads to hit-and-run users. On the other hand, the increasing dependence on ad networks means that, as Paul Iaffaldano of the TWC Media Solutions Group suggests, “the publishers commoditize their own inventory”.
I use Google News and Techmeme to get an overview of general news and technology headlines, but I am still loyal to several sites and feeds. I’m probably more of a news junkie than most. Even so, if publishers sacrifice their differentiation as a short-term survival tactic, they will ultimately lose everything.
It’s a bit late, but I think publishers need to figure out how to renegotiate their balance of power with aggregators and even search engines. If the economics of publishing on the web reduce to an SEO war, it ain’t gonna be pretty.
The Link Economy goes Mainstream
I just read an article in the New York Times by Brian Stelter describing how mainstream news outlets like NBC and the New York Times itself are starting to link to other sites. This is a pretty radical change, since these sites have historically aimed to be sticky and thus maximize their customers’ exposure to their content and their ads.
The article quotes Scott Karp, chief of the Web-based newswire Publish2, justifying this “link journalism” approach by relating it to Google’s success: “It’s all about sending people away, and it does such a good job of it that people keep coming back for more.”
Blogger Jeff Jarvis (who is also involved with the Daylife news aggregator) offers a “golden rule” of links: “Link unto others’ good stuff as you would have them link unto your good stuff.”
As a blogger, I find a lot to agree with in the above. But I’m operating a niche site aimed at a highly targeted audience. And, while I aspire to have hordes of readers, I am not counting on them for my livelihood. I’m not even monetizing my readership by selling their attention to advertisers!
But I’m not sure how well this approach will work for broad media outlets. As the article states, these news organizations are acting in effect like aggregators. So much for “content is king”. I exaggerate–I assume that none of these media companies are planning to dump their own content and reduce themselves to branded aggregators. Still, it is a slippery slope, and it’s hard to resist the lure of free content.
I’m curious to see where this all goes. As a user, I’ve moved from media loyalty (I grew up receiving the New York Times on my doorstep) to using media-commodifying aggregators (Google News) to pulling together RSS feeds into my own reader. I suppose most people lack the patience, inclination, or technical sophistication to put together personalized newsfeeds. Still, I’m not convinced that it’s a good idea for media players who have valuable content to turn themselves into aggregators.
Rather, I think they should follow the advice of Dan Farber, vice-president of editorial at CNET Networks and editor in chief of ZDNet:
At CNET we link to our stories and to others. Generally if it is a standard news item that everyone has, we link to our version. If someone has the seed of a story or a take that helps to carry a story forward or deeper, we link to whatever. A challenge for all of us is finding and linking to content that we should point our readers at…often we don’t have the time to go figure who has the best take or where a story came from before it got refactored by the blogosphere…so we continue to improve on it every day.
I think this advice confirms Jarvis’s “golden rule”, but doesn’t go as far as Karp’s “link journalism”. If you are a media outlet, you should send your readers away if you don’t have what they want. But you should try to do a good job of having what your readers want. After all, you are a media outlet, not an aggregator or search engine.
In response to my various calls to action here at The Noisy Channel, I’ve gotten a fair number of requests from readers asking me for specifics on how they can help. I’d like to offer some concrete suggestions. I’m hoping to make them bite-sized enough that we can make a task queue that volunteers will pick up.
Proposed projects:
- Add a History section to the Enterprise Search entry. Some suggested sources:
– “Challenges in Enterprise Search” by David Hawking
– “Enterprise Search: Tough Stuff” by Rajat Mukherjee and Jianchang Mao
– Enterprise Search Sourcebook 2008
– The list of Enterprise Search Vendors (but please don’t revert this page back to a vendor list!)
– Enterprise Track at TREC
- Add a “Definition” section to the Enterprise Search entry that subsumes the current entry and explains the various competing definitions of “enterprise search”. Posts on this blog that mention enterprise search, as well as the material they reference, would be a good starting point. This task might also include eliminating the Enterprise Information Access stub and pointing it to the Enterprise Search entry.
- Create a Faceted Search entry. Amazingly, Wikipedia doesn’t have one, and the Faceted Classification entry is not (and should not be) a substitute. Marti Hearst’s HCIR ’08 paper would be a great starting point.
- Go through the Information Retrieval category and propose at least incremental steps to organize the 91 entries in it. For example, should vendors be on this list? Open source software packages? I don’t know what is standard for a Wikipedia category, but it is ironic that the Information Retrieval category is so chaotic!
I also encourage people here to add to this list, though I’d ask those who do to consider contributing more than just work for others. And, to be clear, some of the entries in this category are excellent, e.g., the entry on stemming. We should aspire to raise all of the entries to this level, or at least to promote high-quality entries so that they are not buried by their lower-quality brethren.
Visualizing Political Bias
Just saw a post from waxy.org by Andy Baio entitled “Memeorandum Colors: Visualizing Political Bias with Greasemonkey”. Here’s a quick excerpt:
With the help of del.icio.us founder Joshua Schachter, we used a recommendation algorithm to score every blog on Memeorandum based on their linking activity in the last three months. Then I wrote a Greasemonkey script to pull that information out of Google Spreadsheets, and colorize Memeorandum on-the-fly. Left-leaning blogs are blue and right-leaning blogs are red, with darker colors representing strong biases.
To install it, you’ll need Firefox (not a problem for 61% of you, according to my analytics) and optionally the Greasemonkey extension:
- Greasemonkey users: memeorandum_colors.user.js
- Standalone Firefox Extension: memeorandumcolors.xpi
After it’s installed, go to any page on Memeorandum and wait a second for the coloring to appear. For details of how they used Singular Value Decomposition (SVD) to score the blogs, check out the post.
It’s a nice application, and it reminds me of a 2004 CIKM paper by Miles Efron entitled “The Liberal Media and Right-Wing Conspiracies: Using Cocitation Information to Estimate Political Orientation in Web Documents”.
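To give a feel for how SVD can turn linking activity into a one-dimensional score, here is a minimal sketch. It is not the code from Baio’s post: the blog names, the tiny link matrix, and the `leading_axis` helper are all made up for illustration, and it finds only the first singular direction by power iteration rather than calling a full SVD routine.

```python
import math

def leading_axis(matrix, iterations=200):
    """Score each row by its coordinate along the first singular direction."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Center each column so the axis reflects differences between blogs,
    # not how popular a link target is overall.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

    # Power iteration on A^T A converges to the top right singular vector.
    v = [1.0 / (j + 1) for j in range(n_cols)]  # arbitrary non-zero start
    for _ in range(iterations):
        u = [sum(a * b for a, b in zip(row, v)) for row in centered]  # A v
        w = [sum(centered[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]                                  # A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # degenerate input: every blog links identically
        v = [x / norm for x in w]

    # Project each blog onto that direction to get its score.
    return [sum(a * b for a, b in zip(row, v)) for row in centered]

# Hypothetical data: rows are blogs, columns are stories they linked to.
blogs = ["left_a", "left_b", "right_a", "right_b"]
links = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]

scores = leading_axis(links)
for name, score in zip(blogs, scores):
    print(f"{name}: {score:+.2f}")
```

Blogs with similar linking habits end up on the same side of zero. The sign of the axis is arbitrary, so deciding which side to color blue and which red would require at least one blog whose leaning is already known.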
Twitter’s Twist on the Attention Economy
I am a long-time LinkedIn user, and over time I’ve accumulated over 1,000 connections. Most of them are people I actually know or at least have interacted with online beyond “connecting”.
You might think that’s a large number of people to have as connections, and that I could afford to have a more selective velvet rope. And, as you may have noted, I know only most of my connections; some of them are link spammers whose connection requests I nonetheless accepted.
But, you see, there’s no incentive for an individual to reject a spammy connection request. Link spammers do reduce the relative value of legitimate links, and as a result devalue the LinkedIn network as a whole. But it’s a classic tragedy of the commons. Why should I personally sacrifice the reach of my network if I gain nothing? As far as I can tell, this problem applies just as much to Facebook and other social networking platforms.
Twitter is a different beast. Granted, Twitter and LinkedIn may not even see each other as competitors, but that is beside the point. They are competing for people’s social networking cycles, and all of today’s social networking platforms / applications are surely keeping their options open as to what positions they will ultimately stake out.
In any case, what most differentiates Twitter from LinkedIn is their attention economics. On LinkedIn, you gain a benefit–at no apparent cost–from the size of your network, up to degree 3. In contrast, all that matters in the Twitter “social graph” are your immediate links. You don’t get any direct benefit from connections at distance greater than 1. Moreover, the connections are asymmetric, as are their costs and benefits. Following people is an investment of your attention, where the return is access to information (in a broad sense). Being followed is an investment of their attention, and hence an opportunity to exert influence. The asymmetry of Twitter connections is most evident for celebrity influencers, who have far more followers than followees.
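For readers who think in graphs, the contrast can be sketched in a few lines of Python. Everything here is a toy assumption: the names, the two small graphs, and the helper functions are invented for illustration and say nothing about how either service actually computes anything.

```python
from collections import deque

def reach_within(graph, start, max_hops):
    """Count the nodes reachable from start in at most max_hops steps."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return len(seen) - 1  # exclude the start node itself

def followees(follows, user):
    """How many accounts this user pays attention to."""
    return len(follows.get(user, ()))

def followers(follows, user):
    """How many accounts pay attention to this user."""
    return sum(user in targets for targets in follows.values())

# LinkedIn-style: symmetric connections, value counted out to degree 3.
linkedin = {
    "me": {"ann", "spammer"},
    "ann": {"me", "bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "dan"},
    "dan": {"cat"},
    "spammer": {"me", "s1", "s2"},
    "s1": {"spammer"},
    "s2": {"spammer"},
}

# Twitter-style: directed edges, only immediate links matter.
twitter = {
    "me": {"ann"},
    "ann": {"celebrity"},
    "bob": {"celebrity", "me"},
    "celebrity": set(),
}

print(reach_within(linkedin, "me", 3))
print(followees(twitter, "celebrity"), followers(twitter, "celebrity"))
```

In the first graph, accepting the spammer enlarges my three-hop reach at no cost to me, which is exactly the incentive problem described above. In the second, each followee is a stream I have to read, and follower and followee counts can diverge as far as the celebrity’s do.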
While Twitter, at least in my view, is a work in progress, I think they have done well to align their model with attention scarcity. I’m most keenly aware of this scarcity as I decide whom to follow. Accepting a connection from a LinkedIn spammer costs me nothing, while following someone on Twitter who updates on every inhale and exhale would render the service completely worthless.
As a result, connections in Twitter reflect real value. They correspond to investments of attention. Someone with many followers is much like an author with many readers. While I’m sure this metric can be gamed (e.g., by creating bogus Twitter accounts and having them follow you), at least Twitter has the model right in principle.
Speaking of which, if you’re interested in following my tweets, you can find them here.
People Ask Lousy Questions
I just saw Michelle Manafy’s notes in EContent about the recent Enterprise Search Summit West.
A great quote from IDC analyst Sue Feldman: “One of the problems we have with search is that people ask such lousy questions…anytime tools hand people clues, it helps.” Sue has been pushing conversational interfaces for a while, and I agree with her that, as an industry, we need to keep working on the tools to support query elaboration and interaction in general.
I do take issue with Stephen Arnold’s advice at the same conference to vendors to get on the Google-enhancement gravy train and “build solutions that sit on top of Google and make it work better.” Dare I say that the writer of Beyond Search is being a bit reactive?
Search is Not Advertising
Thanks to Greg Linden (who in turn thanks John Battelle) for calling my attention to a post by Google VP of Product Management Susan Wojcicki entitled “Ad Perfect“.
We can distill Wojcicki’s post to three principles, each a direct quote:
- “advertising should deliver the right information to the right person at the right time”
- “help you learn about something you didn’t know you wanted”
- “it needs to be very easy and quick for anyone to create good ads, to show them only to people for whom they are useful, and to measure how effective they are”
While Wojcicki does call out the similarity between Google’s mission in advertising and its mission in search, she fails to see a key difference–a difference that exposes a fundamental problem with web search today.
Search is all about the user. If you can help me, the user, find what I’m looking for, or to find something I didn’t know I wanted, then I’m all ears (or eyes). Of course, I’d like to understand your motives if you’re offering to help me make decisions, especially if they involve my money or even my health.
Advertising is about selling the user’s attention to the highest bidder. Google has done more than anyone to make that bidding process economically efficient. But any utility that advertising provides to users is a means to an end. Advertising is all about the advertisers, and the advertisers only care about providing value to users in so far as their interests are aligned. Absent alignment, advertisers naturally look out for themselves.
This dynamic is hardly unique to search; it applies to any situation where we allow someone or something to influence our decisions. Indeed, persuasion and critical thinking have been locked in an arms race for millennia. The use of advertising to subsidize content dates back to the early 1800s. Wikipedia offers a nice history of the subject.
But supporting search through advertising is a tricky business. Google insists that it maintains a wall between its search and advertising businesses. But Wojcicki’s post–which is on Google’s official blog–suggests otherwise, at least in spirit. If Google believes that both search and advertising aim to “offer relevant content” and “deliver the right information to the right person at the right time”, then why put up a wall at all?
In any case, it is at best misguided and at worst intellectually dishonest to claim that the main goal of advertising is to inform or help the user. The goal of advertising is to influence the user, a goal whose achievement requires delivering a message to which the user is receptive. But influencing is not the same as informing. I hope we all have the critical thinking skills to appreciate the difference.
Sales Pitch for the Semantic Web
Thanks to Marco Neumann, who runs the New York Semantic Web Meetup, for alerting me to this presentation by Nova Spivack, whom Marco aptly describes as Chief Director of Sales of the Semantic Web. Enjoy!
http://vimeo.com/moogaloop.swf?clip_id=1062481&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1
Nova Spivack at The Next Web Conference 2008
For the benefit of readers using RSS, I just wanted to point people to the great discussion going on in the comment thread for this post.