Privacy through Difficulty

I had lunch today with Harr Chen, a graduate student at MIT, and we were talking about the consequences of information efficiency for privacy.

A nice example is the company pages on LinkedIn. No company, to my knowledge, publishes statistics on:

  • the schools their employees attended.
  • the companies where their employees previously worked.
  • the companies where their ex-employees work next.

If a company maintains these statistics, it surely considers them to be sensitive and confidential. Nonetheless, by aggregating information from member profiles, LinkedIn computes best guesses at these statistics and makes them public.
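The aggregation itself is trivially easy once the profiles are machine-readable. Here is a minimal sketch of the idea, using made-up profile data and a hypothetical `school_counts` helper (none of this reflects LinkedIn's actual implementation):

```python
from collections import Counter

# Hypothetical public profile data: each profile lists an employer and
# the school its owner attended. Names are invented for illustration.
profiles = [
    {"employer": "Acme", "school": "MIT"},
    {"employer": "Acme", "school": "MIT"},
    {"employer": "Acme", "school": "Stanford"},
    {"employer": "Globex", "school": "MIT"},
]

def school_counts(profiles, employer):
    """Aggregate individually public profiles into a per-company
    statistic that the company itself never published."""
    return Counter(p["school"] for p in profiles if p["employer"] == employer)

print(school_counts(profiles, "Acme"))  # Counter({'MIT': 2, 'Stanford': 1})
```

Each individual profile is public; the sensitive artifact only emerges from the aggregate.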

Arguably, information like this was never truly private; it was simply so difficult to aggregate that nobody bothered. As Harr aptly put it, such companies practiced “privacy through difficulty,” a privacy analog to security through obscurity.

Some people are terrified by the increasing efficiency of the information market and look to legal remedies as a last-ditch attempt to protect their privacy. I am inclined towards the other extreme (see my previous post on privacy and information theory): let’s assume that information flow is efficient and confront the consequences honestly. Then we can have an informed conversation about information privacy.

By Daniel Tunkelang

High-Class Consultant.

10 replies on “Privacy through Difficulty”

It occurred to me that some might see a contradiction between this post and the previous week’s post on Accessibility in Information Retrieval. Here, I’m suggesting that difficult-to-access content shouldn’t be considered secure; there I’m suggesting that difficult-to-access content shouldn’t be considered accessible. Of course, these are different use cases. Still, it’s worth keeping in mind that different users have different motives. What prevents a casual user from accessing information won’t stop a sufficiently determined one.


[…] What I particularly like in his “filter failure” characterization is that it really exposes the human-computer interaction challenges in managing information flow (in both directions). It also reminds me of Danah Boyd’s Master’s Thesis on managing identity in a digital world, and of some earlier discussion here about privacy through difficulty. […]


The true danger lies in federation of multiple sources, with hard-to-predict consequences to the consumer. Can we come up with a technological solution that would give the user control (before or after the fact) about what data about that person can be aggregated? If this aggregation is of value to some, is there a way to monetize that, to have the consumer derive a revenue stream from the reuse of their data?


Sure, but I don’t see how anyone can control that. I can’t imagine even a theoretical framework where X and Y are both public, but the aggregation of X and Y is not. I think that we as a society have to start expecting that the information we disclose will be combined, so that those consequences become easier to predict. The recent “please rob me” story highlights that we have a ways to go before we can have rational conversations about privacy.


Comments are closed.