Categories
Uncategorized

The Noisy Channel: Now Better Than Sex!

Well, I’ll admit the evidence is a bit shaky. But an online survey commissioned by Intel reports that about half of women and a third of men would rather go without sex for two weeks than give up the Internet for that long. I’m not quite sure what to make of the survey, or the premise that the survey sought to prove “how essential the Internet has become to people–even during tough economic times.”

All I can say is that, if you are spending your time on the Internet, I hope you are enjoying The Noisy Channel.

Upgraded to WordPress 2.7

Just wanted to let readers know that I’ve upgraded to the latest version of WordPress, 2.7 (“Coltrane”). Please let me know if you experience any technical difficulties.

Freemium is the new black

An article by Claire Cain Miller in today’s New York Times proclaims: “A Web Start-Up Counting on Ad Sales? Good Luck”. The article isn’t kind to the ad-supported model in general, but its particular concern is for startups. The article quotes David Weiden from Khosla Ventures:

“The ad model is somewhat worse but not radically worse,” he said. “What’s worse is getting funded that way.” If a company approaches investors with a plan to lose money for three or four years while building an audience, it will encounter many closed doors, he said. “It’s gone from plausible to almost implausible.”

The preferred alternative is the “freemium” model: offer basic services for free, and upsell advanced or special features. I marvel that it’s controversial to talk about charging for services. But that’s what happens in a world where people have come to expect information to be free–an expectation that the current economic conditions will surely reinforce.

Why the SEO stakes are so high

According to an article published today in IT Business Canada:

The typical Web site gets 61 per cent of its traffic from organic (nonpaid) search engine results, and 41 per cent of all traffic from Google alone.

In part, these numbers reflect Google’s dominance in web search–41/61 is a whopping 67%, which is within epsilon of Google’s reported share of the web search market. But the larger point is that most sites depend on web search for the majority of their traffic, which makes search engine optimization (SEO) a matter of life or death for commercial sites in general, and especially for online retailers and publishers.
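For anyone who wants to check the arithmetic, here is a minimal sketch. The function and variable names are my own inventions for illustration; the only inputs are the two percentages quoted from the article.

```python
def share_of_organic_search(google_pct_of_all: float, organic_pct_of_all: float) -> float:
    """Fraction of organic search traffic that comes from Google.

    Both arguments are percentages of a site's *total* traffic,
    as quoted in the article (hypothetical parameter names).
    """
    return google_pct_of_all / organic_pct_of_all


# Figures quoted above: 41% of all traffic from Google,
# 61% of all traffic from organic search overall.
google_share = share_of_organic_search(41, 61)
print(f"Google's share of organic search traffic: {google_share:.0%}")
```

The ratio works out to roughly two thirds, which is the figure compared against Google's reported market share.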

So it’s not surprising that SEO is a multi-billion dollar industry, comparable in size with the pay-per-click (PPC) advertising industry. And, to the extent that the SEO industry is helping to organize the world’s information, it’s earning its keep. But it’s hard to know how much SEO improves the efficiency of the information market vs. how much it simply fuels an arms race. Again, full disclosure: I am one of the arms dealers.

Noisy Channel, Back on Manual

As the Captain says in WALL-E, “AUTO, you are relieved of duty!” It’s good to be back in the blogger’s seat, so stay tuned for fresh content coming up this week.

Humans and Machines: Collaborators or Competitors?

Last week, Hal Daume wrote a nice post entitled “Supplanting vs Augmenting Human Language Capabilities”. Drawing an analogy between natural language processing (NLP) and robotics, he says:

I would say that most NLP research aims to supplant humans. Machine translation puts translators out of work. Summarization puts summarizers out of work (though there aren’t as many of these). Information extraction puts (one form of) information analysts out of work. Parsing puts, well… hrm…

There seems actually to be quite little in the way of trying to augment human capabilities.

He then offers possible ways that NLP might be used to augment, rather than supplant, human capabilities:

  • Tools for language learning.
  • Interactive information retrieval.
  • Adaptive tutorials.

The main tenet of human-computer information retrieval (HCIR) is that information retrieval systems should be working with users, rather than trying to do all of the work on their own. It’s great to see a kindred spirit thinking about machine learning and NLP in the same light.

If You Like The Noisy Channel, …

Already missing your Noisy Channel fix? Why don’t you check out some of the blogs I read:

Going on Auto-Pilot

I’m spending a week in Akumal without network connectivity. Yes, a real family vacation. No working, no blogging, no reading Techmeme.

But have no fear. I’ve scheduled daily posts in my absence. The Noisy Channel will not go silent! Obviously I won’t be able to participate in the comment threads, and I can only hope that the evil comment spammers won’t use this opportune moment to attack. Meanwhile, I urge you all to take this opportunity to have the last word–at least until I get back!

If you do need to contact the authorities in my absence, I suggest you send a message to Claude Shannon.

SearchWiki: A Platform for Steganography?

Lauren Weinstein wrote an interesting post today, suggesting that Google’s new SearchWiki feature “provides an interesting platform for the global distribution of secret messages”.

This practice, known as steganography, has been a concern for centuries, but most recently has come up in the context of alleged use by terrorists.

No, I don’t think Google is trying to be evil. Moreover, there are lots of other ways to broadcast steganographically hidden messages on the web, such as posting comments on unmoderated blogs. But it’s interesting that this is the first “useful” application I’ve seen proposed for SearchWiki.
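To make the idea concrete, here is a toy sketch of one of the oldest steganographic tricks, the acrostic: the secret is spelled out by the first letter of each line of an innocent-looking comment. This is purely illustrative. The function name and the sample comment are made up, and nothing here reflects how SearchWiki works or what Weinstein proposed.

```python
def reveal_acrostic(lines: list[str]) -> str:
    """Recover a hidden message from the first character of each non-empty line."""
    return "".join(line.strip()[0] for line in lines if line.strip())


# A hypothetical blog comment that looks harmless to a casual reader.
comment_lines = [
    "Have you considered the enterprise angle?",
    "I think this applies there too.",
]

print(reveal_acrostic(comment_lines))
```

Real steganography is far subtler than this, but the principle is the same: the message hides in plain sight, and only someone who knows where to look will find it.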

Semantic Search Wikipedia Entry: Needs Help

I haven’t written a community post in a while, but I thought that, with everyone getting into the Thanksgiving spirit, perhaps someone might be inspired to give to a Wikipedia entry in need. I’m talking about the semantic search entry, which–as the talk page notes–needs work.

As I told Ron Miller in my recent one-on-one with him:

Semantic search means different things to different people, but broadly falls into two categories: using linguistic and statistical approaches to derive meaning from unstructured text, and using semantic web approaches to represent meaning in content and query structure.

Perhaps someone here could reorganize the Wikipedia entry along these lines?

Or, if you don’t feel sufficiently expert on semantic search to rework the content, perhaps you could help despam the entry, following the example of what I did for the enterprise search entry. I moved the vendors to a separate entry and culled vendors that didn’t have their own Wikipedia entries (which is the accepted “notability” standard).

I know that editing Wikipedia entries is a thankless job. But someone has to do it. And if folks like us don’t, these pages are often overrun by spammers. Think of this as a small contribution to global knowledge management. At the very least, you’ll have my thanks.