A Twitter Analog to PageRank

A few weeks ago, there was a flame war about Twitter authority, and I was all too eager to throw fuel on the pyre. But now that the blogosphere has calmed down a bit, I’d like to propose a ranking measure that I think might work. My apologies if it isn’t original. In fact, if you’ve seen it elsewhere, please point me to it.

Let me start with the assumptions about the model:

  • Influence(X) = Expected number of people who will read a tweet that X tweets, including all retweets of that tweet. For simplicity, we assume that, if a person reads the same message twice (because of retweets), both readings count.
  • If X is a member of Followers(Y), then there is a 1/||Following(X)|| probability that X will read a tweet posted by Y, where Following(X) is the set of people that X follows.
  • If X reads a tweet from Y, there’s a constant probability p that X will retweet it.

This model is obviously simplistic in all three assumptions. But I think it’s a reasonable first cut. In particular, it accounts for the inflation that occurs from people who follow in the hopes of reciprocity. There’s less value in being followed by someone who follows a lot of people, because that person is less likely to read your messages or retweet them.

Of course, there’s room for adding more realism to this model, but I hope it is at least close enough to the truth to be interesting.

From this model, it’s easy to measure someone’s influence recursively, assuming that we know the constant retweet probability p:

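$$\mathrm{Influence}(X) = \sum_{Y \in \mathrm{Followers}(X)} \frac{1 + p \cdot \mathrm{Influence}(Y)}{\|\mathrm{Following}(Y)\|}$$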

Over a graph with directed cycles the recursion never terminates, but it converges rapidly: each additional round of retweets contributes another factor of p, and high powers of p approach zero. I would expect this measure to be easy to compute to a reasonable accuracy.
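As a minimal sketch of how the computation might look, here is a fixed-point iteration in Python. It assumes the recursion implied by the three assumptions above: each follower Y of X contributes (1 + p · Influence(Y)) / ||Following(Y)|| to Influence(X). The function name `estimate_influence`, the dictionary input format, and the default values for `p` and `iterations` are all illustrative choices, not part of the original proposal.

```python
def estimate_influence(following, p=0.05, iterations=50):
    """Approximate each user's influence by repeated substitution.

    `following` maps a user to the set of users they follow
    (a hypothetical input format chosen for this sketch).
    `p` is the assumed constant retweet probability.
    """
    # Collect every user who appears as a follower or as someone followed.
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    # Invert the graph: followers[y] is the set of users who follow y.
    followers = {user: set() for user in users}
    for x, followed in following.items():
        for y in followed:
            followers[y].add(x)

    # Start from zero and apply the recursion a fixed number of times.
    scores = {user: 0.0 for user in users}
    for _ in range(iterations):
        scores = {
            x: sum(
                (1 + p * scores[y]) / len(following[y])
                for y in followers[x]
            )
            for x in users
        }
    return scores


if __name__ == "__main__":
    # Toy graph: alice and bob follow each other; carol follows both.
    graph = {
        "alice": {"bob"},
        "bob": {"alice"},
        "carol": {"alice", "bob"},
    }
    for user, score in sorted(estimate_influence(graph).items()):
        print(user, round(score, 4))
```

In this toy graph carol has no followers, so her influence is zero, while alice and bob each converge to 1.5 / (1 - p). A fixed iteration count keeps the sketch short; a real implementation would more likely stop once the scores change by less than some tolerance.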

This measure strikes me as a PageRank for Twitter or any system with similar properties. There’s more room for nuance, but I at least find this approach more plausible than the ones I’ve seen. It also strikes me as hard to game, since it isn’t counting retweets, and it’s hard to add much influence through followers who don’t have any influence themselves.

What do folks think? Has anyone tried this? If not, is there anyone who’d like to try hacking an application to compute it? Either way, please let me know!

By Daniel Tunkelang

High-Class Consultant.

77 replies on “A Twitter Analog to PageRank”

[…] with modeling authority and influence in social networks, a problem in which I take a deep personal interest. Another inferred attributes of social network users based on those of other users in their […]

[…] To me, it’s Google’s responsibility to intervene.  The company that expresses algorithmic prowess on so many complex patterns should have no trouble in doing so with blog engagement.  The raw numbers displayed in feedburner chicklets are no more reliable than the 1990s hit counters which allowed unscrupulous webmasters to “start off” with high numbers in order to mislead the readers that a site was popular.  Perhaps we need a pagerank for Twitter/Friendfeed followers. […]

Hi Dan. I know this is quite a belated follow-up to your post, but I was wondering: as I(X) is the expected number of people who read something by X, how does the method cope with cases where, after convergence, I(X) is greater than N, where N is the number of nodes in the network? In your implementation, do you impose a threshold of N on each element of I per iteration?
Thanks and kudos for Tunkrank!

Yannis, it doesn’t try to. While it’s theoretically possible for the model to exceed this threshold, it won’t happen with realistic parameter values. Indeed, the model doesn’t even try to prevent cycles, nor does it avoid double-counting if a person reads the same message twice (because of retweets).

Lots of room to make the model more complex / realistic. But I figured it was best to keep it simple and understandable, at least to start off.
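
One way to see how extreme the parameters would have to be is a hypothetical worst case of two users who follow only each other, so N = 2. Assuming the recursion implied by the model (each follower Y contributes (1 + p · Influence(Y)) / ||Following(Y)||), both users have the same influence I:

```latex
I = 1 + pI
\quad\Longrightarrow\quad
I = \frac{1}{1 - p},
\qquad
I > N = 2 \iff p > \tfrac{1}{2}
```

Under these assumptions, influence exceeds the number of nodes only if more than half of all reads lead to a retweet, which seems far above any plausible retweet rate.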

I find this to be a very interesting experiment, though I do not tweet myself. Apologies for the delayed comment. My question is whether “influence” is really captured by simply reading something (or calculating readers). For someone who read and retweeted, it seems like your tweet may have “influenced” them a bit more than a reader. Perhaps calculating readership is difficult enough that there is no benefit in going beyond that. Would like to hear your thoughts.

Tony, that’s a fair point. Influence is a fuzzy concept, but I’m comfortable with any definition that involves getting people to spend a scarce economic good, like their attention.
