Credit to Kevin Duh by way of the natural language processing blog for highlighting recent work from PARC on understanding the efficiency of social tagging systems using information theory. The authors apply information theory to establish a framework for measuring the efficiency of social tagging systems, and then observe empirically that the efficiency of tagging on del.icio.us has been decreasing over time. They conclude by suggesting that current tagging interfaces may be at fault, since encouraging popular tags sets up a positive feedback loop.
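To make the information-theoretic framing concrete, here is a minimal sketch, assuming "efficiency" is read as how much knowing a tag narrows down the set of documents. Under that assumption, the conditional entropy H(D|T) of documents given tags goes up as tags become less discriminating, and the mutual information I(D;T) goes down. The function names and the toy bookmark data are my own illustration; they are not taken from the PARC paper, whose exact measures may differ.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def conditional_entropy_docs_given_tags(pairs):
    """H(D|T) = H(D,T) - H(T) for a list of (document, tag) bookmarks."""
    joint = Counter(pairs)                      # counts of (doc, tag) pairs
    tags = Counter(tag for _, tag in pairs)     # marginal counts of tags
    return entropy(joint) - entropy(tags)


def mutual_information(pairs):
    """I(D;T) = H(D) - H(D|T): bits a tag tells us about the document."""
    docs = Counter(doc for doc, _ in pairs)     # marginal counts of documents
    return entropy(docs) - conditional_entropy_docs_given_tags(pairs)


# Hypothetical toy data: every tag picks out exactly one document...
specific = [("a", "x"), ("b", "y")]
# ...versus one popular tag applied to every document.
popular = [("a", "web"), ("b", "web"), ("c", "web"), ("d", "web")]

print(conditional_entropy_docs_given_tags(specific), mutual_information(specific))
print(conditional_entropy_docs_given_tags(popular), mutual_information(popular))
```

In the first toy collection a tag identifies its document completely, so H(D|T) is 0 bits. In the second, the single popular tag says nothing about which of the four documents is meant, so H(D|T) is 2 bits and the mutual information is 0. This is the kind of drift a popularity-driven interface could plausibly produce.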
After seeing this and the TagMaps work at Yahoo Research Berkeley, I feel that the IR and HCI communities should join forces to understand social tagging in general terms that relate information, knowledge representation, and human beings. These concerns are hardly specific to the web or to what is now called “social media”–after all, media is social by definition. Indeed, there is no reason to confine this approach to human-tagged collections–why not consider automated tagging systems on the same playing field?
3 replies on “The Efficiency of Social Tagging”
Tagging, like all metadata, does not help IR very much (Zobel has a paper on this, circa 2005). However, tagging did inspire some ideas of mine; here are a couple of plugs:

Kamel Aouiche, Daniel Lemire, Robert Godin, Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation, WEBIST 2008, 2008. http://arxiv.org/abs/0710.2156

Owen Kaser and Daniel Lemire, Tag-Cloud Drawing: Algorithms for Cloud Visualization. In proceedings of Tagging and Metadata for Social Information Organization (WWW 2007), 2007. http://arxiv.org/abs/cs.DS/0703109
Daniel, thanks for the links! Funny how graph drawing has kept coming up this week. I thought I’d left that life behind a decade ago!
Paul Heymann and colleagues conclude that “URLs produced by social bookmarking are unlikely to be numerous enough to impact the crawl ordering of a major [web] search engine, and the tags produced are unlikely to be much more useful than a full text search emphasizing page titles.”