Disincenting Spam

Greg called my attention today to news that Digg is shifting from popularity-based aggregation to personalized news. I can’t say I’m thrilled at the prospect of a system that “would make guesses about what [users] like based on information mined from the giant demographic veins of social networks”. I don’t suppose the results are necessarily worse than showing users stories based solely on their popularity, but at least the latter offers me some transparency.

But it was an older post Greg pointed to that caught my attention: “Combating web spam with personalization”. Here is his argument in a nutshell:

Personalized search shows different search results to different people based on their history and their interests. Not only does this increase the relevance of the search results, but also it makes the search results harder to spam.

In this 2006 post, Greg is specifically referring to the personalized search that Google was beta testing back in 2004. Google has since implemented personalized search, but without sharing much detail about how it works.

Nonetheless, Greg’s argument reminds me of one of the first posts I wrote on this blog. I was criticizing Google’s practice of keeping its relevance approach secret, and particularly the argument that Amit Singhal has advanced to justify it: that the subjectivity of relevance makes it harder to develop an open approach to relevance. My response: “the subjectivity of relevance should make the adversarial problem easier rather than harder, as has been observed in the security industry”.

I suppose personalization can help fight spam even if it is not coupled with transparency to the user. But what a great opportunity to do both by giving users more control over the information-seeking process.

By Daniel Tunkelang

High-Class Consultant.