The Noisy Channel

 

Claudia Perlich: Tech Talk on Real-Time Bidding Optimization

March 22nd, 2012 · 2 Comments · General

Conventional wisdom holds that physical compliments are counter-productive as pick-up lines. Indeed, a dating site did some analysis showing a negative correlation between such compliments and the probability of a positive response.

But, as m6d Chief Scientist and 3-time KDD Cup winner Claudia Perlich explained in her recent talk at LinkedIn, we have to watch out for confounding variables. In the dating scenario above, beauty is a confounding variable: it affects both the probability of getting a positive response and the probability of a suitor offering physical compliments. Hence, we need to control for the recipient's actual beauty; otherwise, it can appear that making compliments is a bad idea even if the compliments themselves do no harm.
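To make the confounding argument concrete, here is a minimal sketch in Python. The counts are invented for illustration and do not come from the dating site or from the talk. Within each beauty group the response rate is the same with or without a compliment, yet the pooled numbers make compliments look harmful, simply because compliments go mostly to the group that replies less often.

```python
# Hypothetical counts, invented for illustration only:
# (beauty, complimented) -> (messages sent, positive replies)
counts = {
    ("high", True): (800, 80),
    ("high", False): (200, 20),
    ("low", True): (200, 80),
    ("low", False): (800, 320),
}


def response_rate(complimented, beauty=None):
    """Positive-reply rate, optionally restricted to one beauty group."""
    messages = replies = 0
    for (group, flag), (sent, positive) in counts.items():
        if flag == complimented and (beauty is None or group == beauty):
            messages += sent
            replies += positive
    return replies / messages


# Pooled rates ignore beauty; stratified rates control for it.
pooled = {flag: response_rate(flag) for flag in (True, False)}
stratified = {
    group: {flag: response_rate(flag, group) for flag in (True, False)}
    for group in ("high", "low")
}

print("pooled:", pooled)
print("stratified:", stratified)
```

Comparing the pooled and stratified rates side by side is the simplest form of controlling for a confounder; with real data the groups would rarely be this clean.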

Perlich does not work on online dating, but rather in the data-driven world of online advertising. Specifically, she and her team work on real-time bidding optimization.

Perlich described a variety of design choices that have general applicability to data science problems. For example, her team used hashed tokens of previously visited URLs, rather than the URLs themselves, as features for their machine learning models. They avoided the use of personally identifying information (PII) or even demographic information about their users. These decisions were counterintuitive — typically, more data leads to better results. But Perlich found that these restrictions did not sacrifice accuracy, and had the further benefit of keeping their approach general rather than application- or customer-specific.
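As a rough illustration of hashed URL tokens as features, consider the sketch below. This summary does not record how Perlich's team tokenized URLs, which hash function they used, or how large their feature space was, so the tokenizer, the use of SHA-256, and `NUM_BUCKETS` are all assumptions made for the example.

```python
import hashlib
import re
from collections import Counter

NUM_BUCKETS = 2 ** 20  # assumed size of the hashed feature space


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (an assumed scheme)."""
    return [token for token in re.split(r"[^a-z0-9]+", url.lower()) if token]


def hash_token(token, num_buckets=NUM_BUCKETS):
    """Map a token to a stable bucket index; the raw token is not kept."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hashed_features(urls, num_buckets=NUM_BUCKETS):
    """Build a sparse {bucket index: count} vector from a browsing history."""
    features = Counter()
    for url in urls:
        for token in tokenize(url):
            features[hash_token(token, num_buckets)] += 1
    return dict(features)


# Hypothetical browsing history for one anonymous user.
history = ["http://example.com/shoes?id=42", "http://example.org/running/shoes"]
print(hashed_features(history))
```

The model only ever sees bucket indices, so the same code works for any site or customer without a hand-built vocabulary, which fits the generality the talk emphasized.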

Perlich also described several technical challenges that her team had to overcome. For example, they found they could not sample users, so they instead sampled events — that is, visits, impressions, and conversions. They also found that their linear models tended to suffer from overfitting in their top predictions — a problem they resolved by introducing a spline model.
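This summary does not say what form the spline took or where in the pipeline it was applied. As one hypothetical reading, a linear spline can map a raw model score to a recalibrated probability, so that over-confident scores at the top of the ranking are pulled back toward observed rates. The knot values below are invented, and `linear_spline` is an illustrative helper, not the team's implementation.

```python
from bisect import bisect_right

# Hypothetical knots: (raw model score, observed conversion rate).
# The values are invented; in practice they would be estimated from held-out data.
KNOTS = [(0.0, 0.0), (0.5, 0.25), (0.9, 0.5), (1.0, 0.5)]


def linear_spline(score, knots=KNOTS):
    """Piecewise-linear interpolation through the knots, clamped at both ends."""
    xs = [x for x, _ in knots]
    ys = [y for _, y in knots]
    if score <= xs[0]:
        return ys[0]
    if score >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, score)  # index of the first knot strictly right of score
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (score - x0) / (x1 - x0)


for raw in (0.25, 0.7, 0.95):
    print(raw, linear_spline(raw))
```

With these knots the mapping flattens above a raw score of 0.9, which is one simple way to keep the highest predictions from running ahead of what the data supports.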

The talk was deeply technical and yet very relevant and accessible to a broad audience of data scientists and engineers. There’s much more content than fits in this small summary, so I encourage you to watch the video! And you can watch more LinkedIn tech talks here.

2 responses so far ↓

  • 1 Yuval // Jun 4, 2012 at 11:39 pm

    Well, I have finally gotten round to seeing this. The topic is very interesting, but Dr. Perlich’s slides are semi-invisible in the video, which makes it a challenge to understand. Is there a publicly accessible version of the slides?
    TIA,
    Yuval

  • 2 Daniel Tunkelang // Jun 5, 2012 at 9:22 am

    Check out http://strataconf.com/strata2012/public/schedule/detail/22604 for her slides from the O’Reilly Strata Conference.
