The Noisy Channel


The Efficiency of Social Tagging

April 23rd, 2008 · 3 Comments · General

Credit to Kevin Duh, by way of the natural language processing blog, for highlighting recent work from PARC on understanding the efficiency of social tagging systems using information theory. The authors use information theory to establish a framework for measuring the efficiency of social tagging systems, and then observe empirically that the efficiency of tagging has been decreasing over time. They conclude by suggesting that current tagging interfaces may be at fault, since encouraging popular tags sets up a positive feedback loop.
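To make the idea concrete, here is a minimal sketch of one information-theoretic way to score a tagging system: the conditional entropy H(D|T), i.e. how uncertain you remain about which document is meant once you know the tag. This is my own illustration and not necessarily the exact measure the PARC authors use; the function names and the toy (document, tag) pairs are made up for the example.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy, in bits, of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def conditional_entropy_docs_given_tags(pairs):
    """H(D|T) for a list of (document, tag) pairs.

    Lower values mean a tag narrows down the document more effectively.
    """
    tag_counts = Counter(tag for _, tag in pairs)
    total = len(pairs)
    h = 0.0
    for tag, n in tag_counts.items():
        # Distribution of documents among the pairs carrying this tag.
        doc_counts = Counter(doc for doc, t in pairs if t == tag)
        h += (n / total) * entropy(doc_counts.values())
    return h


# Hypothetical toy data: four documents, each with its own distinctive tag...
specific = [("d1", "olap"), ("d2", "entropy"), ("d3", "maps"), ("d4", "crawl")]
# ...versus the same four documents all given one popular tag.
popular = [("d1", "web"), ("d2", "web"), ("d3", "web"), ("d4", "web")]

print(conditional_entropy_docs_given_tags(specific))
print(conditional_entropy_docs_given_tags(popular))
```

With distinctive tags, knowing the tag pins down the document, so H(D|T) comes out to zero. When everyone piles onto a single popular tag, the tag tells you nothing and H(D|T) is the full log2(4) = 2 bits. That is the flavor of the feedback problem described above, in miniature.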

After seeing this and the TagMaps work at Yahoo Research Berkeley, I feel that the IR and HCI communities should join forces to understand social tagging in general terms that relate information, knowledge representation, and human beings. These concerns are hardly specific to the web or to what is now called “social media”–after all, media is social by definition. Indeed, there is no reason to confine this approach to human-tagged collections–why not consider automated tagging systems on the same playing field?

3 responses so far ↓

  • 1 Daniel Lemire // Apr 23, 2008 at 7:01 pm

    Tagging, like all metadata, does not help IR very much (Zobel has a paper on this, circa 2005).

    However, tagging did give me some ideas. Here are a couple of plugs:

    Kamel Aouiche, Daniel Lemire, Robert Godin, Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation, WEBIST 2008, 2008.

    Owen Kaser and Daniel Lemire, Tag-Cloud Drawing: Algorithms for Cloud Visualization. In proceedings of Tagging and Metadata for Social Information Organization (WWW 2007), 2007.

  • 2 Daniel Tunkelang // Apr 24, 2008 at 1:02 pm

    Daniel, thanks for the links! Funny how graph drawing has kept coming up this week. I thought I’d left that life behind a decade ago!

  • 3 Daniel Tunkelang // Apr 26, 2008 at 1:08 pm

    Paul Heymann and colleagues conclude that “URLs produced by social bookmarking are unlikely to be numerous enough to impact the crawl ordering of a major [web] search engine, and the tags produced are unlikely to be much more useful than a full text search emphasizing page titles.”
