Categories
General

Curt Monash Analyzes the Text Analytics Market

Curt Monash recently shared his views of the text analytics market through his blog and a slide presentation that he’s made available online. The presentation is refreshingly hype-free, and I recommend you take a look.

His observations about the web search market are spot-on: the current attention is on transactional queries (see Andrei Broder’s classic paper on the taxonomy of web search for an explanation of navigational, informational, and transactional queries), and web search generally is dominated by the dynamics of adversarial information retrieval. Depressing (to me, at least), but accurate. He does see a potential future with better interfaces, but he asks, “how good does the technology have to get before people care?” My sentiments exactly.

On to the enterprise market, which is more interesting. Here it’s harder to summarize Monash’s thoughts, except to say that he sees the current landscape of enterprise search offerings as hopelessly confused.

Monash divides the enterprise market into public-facing site search, which he further divides between e-commerce and “general”; “true” enterprise search, which seems mostly to denote intranet search; and custom publishing. While I’m not entirely comfortable with his taxonomy of the space, I do give him credit for laying one out.

He then goes on to explain how “one-size-fits-all” approaches have failed and how the enterprise search market landscape is “bollixed”. He lists a number of technical challenges, all of which I agree with.

But I’d add one: the need for content enrichment techniques and interfaces that support interaction, exploration, and discovery. Yes, we’ve seen these as buzzwords in vendor hype, but that doesn’t make them any less real. There’s been too much emphasis on best-first, known-item search, and not enough on the other use cases that comprise enterprise search and information access.

I think that exploratory search will eventually be important for web search too, but the complacency with current approaches kills any sense of urgency. There is no imminent threat to Google’s reign.

In the enterprise search market, however, there is a justified dissatisfaction with the status quo. And, in my belief and experience, that is because too many people (vendors and enterprises) are trying to treat the enterprise like a microcosm of the web, where the only major differences are the connectors to acquire content and the ranking algorithm to sort results. Getting these right is necessary but not sufficient. Interaction, exploration, and discovery–in short, HCIR–are not just nice-to-have features, but rather are essential to making search work in the enterprise.

Categories
General

Control of attention is the ultimate individual power

I must have been asleep at the RSS reader a couple of weeks ago, because I missed this gem in “Lost in the Crowd”, David Brooks’s review of Malcolm Gladwell’s new book Outliers in the New York Times:

Control of attention is the ultimate individual power. People who can do that are not prisoners of the stimuli around them. They can choose from the patterns in the world and lengthen their time horizons. This individual power leads to others. It leads to self-control, the ability to formulate strategies in order to resist impulses. If forced to choose, we would all rather our children be poor with self-control than rich without it.

Fortunately, Mike Elgan called it to my attention in a column entitled “Work Ethic 2.0: Attention Control”. And, like me, Elgan reacts much more strongly to Brooks’s comment about control of attention than to anything he actually says about Gladwell’s book.

But Elgan’s concern is with what he sees as the “distraction virus” of the internet in general, and of social media in particular. I don’t dispute his observations, but I have a different take on Brooks’s point.

It’s true that self-control gives us individual power, and we knew all this long before we had to contend with the online demands on our attention. For example, Hermann Hesse wrote in Siddhartha:

“I can think. I can wait. I can fast.”

“That’s everything?”

“I believe, that’s everything!”

What’s new is that attention is becoming a currency to rival the tangible goods we’ve usually thought of as scarce and valuable. But, as Brooks and Elgan note, it differs from money in its far more subjective nature.

A growing economy revolving around something money can’t buy, and which is subject to the vicissitudes of individual control. That certainly makes things interesting.

Categories
General

It’s the Attention, Stupid

One of the mistakes we often make in our quest for economic reductionism is to assume that all value can be monetized. But this model often breaks down in the context of online communities. As Manila Austin, a psychologist who heads up research at Communispace, put it, “People want the validation that they are being heard.”

In a BusinessWeek article entitled “Will Work for Praise: The Web’s Free-Labor Economy”, Stephen Baker tells some stories about the unpaid volunteers who invest their energy in helping others online, and the companies who try to monetize their efforts. Of course, the volunteers are only unpaid in financial terms; they are very much incented by one thing money can’t always buy: attention.

These days, there’s a lot of concern about what business models will sustain social media–particularly blogs and Twitter. It’s clear from stories like Baker’s that many participants in online communities are sufficiently motivated to invest their own time–and possibly even their own money–in order to reap the non-financial reward of attention.

As it is, most bloggers and tweeters are unpaid for their efforts. Perhaps this model will ultimately sustain the blogosphere, and attention will trump money as the currency of communication.

Categories
General

Selling Out: It Sells

A few days ago, Jon Pareles wrote a New York Times article entitled “Songs From the Heart of a Marketing Plan”, describing the increased licensing of new music for commercials, video games and soundtracks.

He comments:

Selling recordings to consumers as inexpensive artworks to be appreciated for their own sake is a much-diminished enterprise now that free copies multiply across the Web.

While people still love music enough to track it down, collect it, argue over it and judge their Facebook friends by it, many see no reason to pay for it. The emerging practical solution is to let music sell something else: a concert, a T-shirt, Web-site pop-up ads or a brand.

Licensing music to marketers is hardly new, but there is an urgent pressure from the decline of CD sales and the industry’s inability to make up the difference in digital music sales. The music industry isn’t about to go down without a fight, and licensing music to marketers is immune to digital piracy.

It does, however, tie the fate of the music industry, at least in part, to that of the advertising industry. While no industry is recession-proof, advertising seems better placed than most.

But consider the irony. Historically, people either have bought music or have listened to it for free on ad-supported media like radio. Is the future one where music is distributed for free and embedded in the very ads that historically subsidized its distribution?

Moreover, if this change is successful, will we see it extend to other digital media–like movies, books, or even news? Could the entire world of publication revolve around advertising?

I hope this is just a paranoid fantasy. It wasn’t that long ago that we were proclaiming “content is king”. Perhaps Andrew Odlyzko was right to challenge this assertion in his article “Content is not king”–though he argued the primacy of communication over content. I just hope that communication does not devolve to advertising.

Categories
General

Loic Le Meur Misses the Point of Twitter

Loic Le Meur wrote a post today arguing that we need search by authority for Twitter.

His argument:

Comments about your brand or yourself coming from @techcrunch with 36000 followers are not equal than someone with 100 followers. Most people use Twitter with a few friends, but when someone who has thousands, if not tens of thousands of followers starts to speak, you have to pay attention.

I think he’s missing the point of Twitter, or perhaps viewing Twitter narrowly through the lens of a viral marketing evangelist. Twitter is a communication platform, not a marketing platform, and there’s a subtle difference. Much as I wouldn’t want my email or phone prioritizing people based on their number of friends, I wouldn’t want Twitter to apply some global “authority” filter when I’m perfectly capable of deciding whom I want to listen to.

It’s easy to speculate that Le Meur’s argument is self-serving, since he has over 15,000 followers. He also follows over 15,000 people, which shows how little value he actually places on following someone (unless he’s the world’s fastest speed reader).

I don’t dismiss his idea entirely; I can see some value in getting an aggregate view of online punditry. In fact, I’m responding to his argument myself, precisely because his opinion carries weight in the online community and deserves a rebuttal.

But I suspect that Twitter, with its design for immediate, personal communication, isn’t the best vehicle for assembling this view. Note that I’m responding by blogging, not by tweeting.

Categories
General

Not By Links Alone

Dan Farber recently shared this observation about the future of journalism:

While the Internet is growing as the place where people go for news, the revenue simply isn’t catching up fast enough. The less obvious part of the Internet overtaking newspapers as the main source for national and international news is that much of the seed content–the original reporting that breaks national and international news and is subsequently refactored by legions of bloggers–comes from the reporters and editors working at the financially strapped newspapers and national and local television outlets.

Matt Asay, wondering whether we’re headed towards a model that looks like “More front page, op-ed, and nothing in between?”, sums it up eloquently:

blogging helps to destroy the business models powering its original source material

I abhor waste, and I’m always amazed that, a decade into the mainstream use of the web, we still have so much inefficiency in the duplication of content.

In retail, there is still a surprisingly high variance in the pricing of the same product among competing sellers, even though price comparison services have been available for years.

In news, much of the content is syndicated from a handful of wire services. Perhaps that commodification of content is part of the malaise in the news industry, but I doubt it; after all, much of the commodification predates the growth in online news. Rather, the problem seems to be that the gains from online advertising revenue aren’t compensating for the offline losses.

I would love to see a world in which original contributions of all sorts are highly valued and rewarded. We see the profit from innovation in physical goods, most notably from Apple’s success in consumer goods. But digital content is different, and I worry about the tension between the high cost of producing it and the low cost of reproducing it.

I spend more time reading blogs than reading news, but I realize that bloggers, myself included, assume an ecosystem in which old-school news organizations do much of the heavy lifting. I play by the rules of fair use and the link economy, giving credit to my sources and linking to them.

But is that enough? Are we slowly nibbling on the hand that feeds us? Is it reasonable to expect journalists, as Jeff Jarvis seems to suggest, to live by links alone? As the title of this post indicates, I don’t think so, but I wish I could offer more constructive suggestions.

Categories
General

Putting the Social back in Social Networks

Merry Christmas and Happy Newton Day to all! I hope all of you are spending some time offline for the holidays.

I couldn’t kick my daily blogging habit, especially after I saw an article in the Wall Street Journal about the dreadful controversy of unfriending people on social networks:

Now, people who have accumulated hundreds, or in some cases more than a thousand, friends are cutting loose some of the ones they have lost touch with or who were little more than acquaintances from the start. It’s a shift from the days when users, eager to boast about their online popularity, added new friends with abandon, whether or not they really knew them.

Even Michael Arrington has chimed in with a post about the meaning of friendship. It’s one of his more soberly written pieces; perhaps the holiday spirit is getting to him. His argument in a nutshell:

It’s clear that the more friends you have on any given service, the more noise you have to wade through to find the golden signal. In the real world when you don’t want to be friends with someone, you just find ways not to spend time with them. But online, you click that friend button because it seems so easy, and it’s considered insulting if you don’t. And then you pay.

When I was a child, I remember the importance placed on the notion of a “best friend”. The key, of course, was scarcity. You could only have one best friend, and public declaration of who was your best friend enforced this constraint.

If online social networks are going to claim the same validity as their offline counterparts, they need to reflect the real-world scarcity of attention. Otherwise, the notion of an online social connection becomes a sham.

For example, we know that no one can possibly maintain thousands of meaningful social relationships. Hence, if you are one among the thousands of people that someone is following on Twitter, then you should assume that your relationship with that person isn’t worth the bits it’s printed on.

Hopefully we’re smart enough as human beings to figure this out. But it would be nice for the online social networks to actually reflect attention scarcity constraints. Then we might be able to leverage them to build far more useful applications.

On that note, I’m going offline to spend the day with my most important connections.

Categories
Uncategorized

Blogs I Read: Peter Turney’s Apperceptual

The other day, Daniel Lemire posted a comment extolling Peter Turney as someone who does a great job blogging about his research. His blog, Apperceptual, is one of the highest-quality blogs I’ve seen in the information retrieval community.

Turney is a Research Officer at Canada’s National Research Council (NRC) Institute for Information Technology. His two decades of research cover a broad spectrum of topics in machine learning, information retrieval, and computational linguistics. Moreover, the practical orientation of the NRC helps ensure that Turney’s scholarly work is grounded in the real world.

The best way to get a feeling for Turney’s blog is to read it. Here are a few posts I’d suggest:

This last post, published today, offers a promising approach towards establishing analogies as the central problem in a theory of semantics. Or, as Turney quotes Douglas Hofstadter, that “all meaning comes from analogies”.

Turney’s writing isn’t always so heavy. In fact, two of his most popular posts are “Open Problems” and “How to Maximize Citations”, both of which I’d recommend to aspiring researchers.

Turney doesn’t crank out blog posts daily or even weekly–he sometimes goes for over a month between posts. But what he does write is well worth reading.

Categories
Uncategorized

The Future of Measurement

Over the past few days, Kate Niederhoffer put together a collection of thoughts about the future of measurement in social media. Contributors include:

I enjoyed being part of the collective writing process, and I hope you enjoy reading the results.

Categories
Uncategorized

If you can’t stand the links, get off the web

I don’t always agree with Jeff Jarvis, but he nailed it in “A danger to journalism”, a post in which he discusses the “GateHouseGate” controversy:

GateHouse has sued The New York Times Co., arguing that the Boston Globe’s new YourTown hyperlocal site for Newton is violating copyright laws by copying headlines and first sentences verbatim from GateHouse sites in Massachusetts and–horrors!–linking to the stories themselves on GateHouse’s pages.

As Jarvis put it, “If you can’t stand the links, Gatehouse, get off the web.” I am sympathetic to authors whose work is being unfairly used, as I discussed in my recent post on fair use and SEO. But suing people for copying two sentences and linking? I thought we were past that by now.