<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Google&#8217;s Chief Economist Hal Varian Talks Stats 101</title>
	<atom:link href="http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/feed/" rel="self" type="application/rss+xml" />
	<link>http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/</link>
	<description></description>
	<lastBuildDate>Mon, 21 May 2012 05:21:00 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/comment-page-1/#comment-4278</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Sat, 22 Aug 2009 17:00:53 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2440#comment-4278</guid>
		<description>Panos, I think we&#039;re on the same page. That&#039;s what I meant by saying that the ability to run twice as many simultaneous tests without noticeably disrupting overall user experience is a major competitive advantage. But, as you point out, that&#039;s a somewhat inconvenient truth if you&#039;re trying to argue that scale isn&#039;t that big an advantage.

BTW, I don&#039;t think this is about Google playing down the Microsoft-Yahoo agreement. Rather, it&#039;s to dismiss the argument that Google has a monopolistic advantage because of its scale. I.e., he&#039;s arguing that Microsoft doesn&#039;t &lt;i&gt;need&lt;/i&gt; Google&#039;s scale to be competitive.</description>
		<content:encoded><![CDATA[<p>Panos, I think we&#8217;re on the same page. That&#8217;s what I meant by saying that the ability to run twice as many simultaneous tests without noticeably disrupting overall user experience is a major competitive advantage. But, as you point out, that&#8217;s a somewhat inconvenient truth if you&#8217;re trying to argue that scale isn&#8217;t that big an advantage.</p>
<p>BTW, I don&#8217;t think this is about Google playing down the Microsoft-Yahoo agreement. Rather, it&#8217;s to dismiss the argument that Google has a monopolistic advantage because of its scale. I.e., he&#8217;s arguing that Microsoft doesn&#8217;t <i>need</i> Google&#8217;s scale to be competitive.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Panos Ipeirotis</title>
		<link>http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/comment-page-1/#comment-4277</link>
		<dc:creator>Panos Ipeirotis</dc:creator>
		<pubDate>Sat, 22 Aug 2009 16:48:06 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2440#comment-4277</guid>
		<description>What Varian is saying here is that for tuning you do not need to run the experiment on all users. That is correct if you tune elements on a wide scale (e.g., the infamous &quot;what shade of blue to use for our interface element&quot;).

However, there are many aspects of tuning in which you do run into sparse data problems. How do you estimate, say, clickthrough for queries that are typed only a few hundred times a day? Or how do you estimate clickthrough for an ad-query pair? In such cases, you *do* have to deal with sparse data, and doubling or tripling the number of users can indeed be beneficial.

I will not even mention the network effects for the bipartite ad network (advertisers will choose a network with many content nodes, content nodes will put ads from a network with many advertisers). I found it rather ironic that Hal Varian, of all people, chose to ignore that aspect.

I understand Google&#039;s need to play down the Microsoft-Yahoo agreement, but sometimes it feels like listening to propaganda...</description>
		<content:encoded><![CDATA[<p>What Varian is saying here is that for tuning you do not need to run the experiment on all users. That is correct if you tune elements on a wide scale (e.g., the infamous &#8220;what shade of blue to use for our interface element&#8221;).</p>
<p>However, there are many aspects of tuning in which you do run into sparse data problems. How do you estimate, say, clickthrough for queries that are typed only a few hundred times a day? Or how do you estimate clickthrough for an ad-query pair? In such cases, you *do* have to deal with sparse data, and doubling or tripling the number of users can indeed be beneficial.</p>
<p>I will not even mention the network effects for the bipartite ad network (advertisers will choose a network with many content nodes, content nodes will put ads from a network with many advertisers). I found it rather ironic that Hal Varian, of all people, chose to ignore that aspect.</p>
<p>I understand Google&#8217;s need to play down the Microsoft-Yahoo agreement, but sometimes it feels like listening to propaganda&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Cuando la escala de datos es irrelevante hasta en Internet &#124; Denken Über</title>
		<link>http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/comment-page-1/#comment-4190</link>
		<dc:creator>Cuando la escala de datos es irrelevante hasta en Internet &#124; Denken Über</dc:creator>
		<pubDate>Fri, 14 Aug 2009 21:32:18 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2440#comment-4190</guid>
		<description>[...] There is a lot of interesting data on Nick Carr&#039;s blog, data at Search Engine Land, and an excellent post by Daniel Tunkelang [...]</description>
		<content:encoded><![CDATA[<p>[...] There is a lot of interesting data on Nick Carr&#8217;s blog, data at Search Engine Land, and an excellent post by Daniel Tunkelang [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/comment-page-1/#comment-4184</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Fri, 14 Aug 2009 20:03:25 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2440#comment-4184</guid>
		<description>Oops! Corrected now.

And I agree, what you measure and thus optimize for is critical. I still suspect that the bottleneck is creativity, not volume or scale.</description>
		<content:encoded><![CDATA[<p>Oops! Corrected now.</p>
<p>And I agree, what you measure and thus optimize for is critical. I still suspect that the bottleneck is creativity, not volume or scale.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2009/08/14/googles-chief-economist-hal-varian-talks-stats-101/comment-page-1/#comment-4183</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Fri, 14 Aug 2009 19:39:45 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2440#comment-4183</guid>
		<description>&lt;i&gt;But even there quality trumps quality&lt;/i&gt;

Do you mean &quot;quality trumps quantity&quot;?

&lt;i&gt;how you choose what to test matters a lot more than how many tests you run.&lt;/i&gt;

And how you choose to measure what you test matters a lot more than either of those two factors. It&#039;s a point that we discussed before in your comments, with Max Wilson and others. If someone runs a query and then doesn&#039;t click anything, how can you tell the difference between that search being unsuccessful, because they didn&#039;t find the information that they wanted, and that search being successful, because they found that the information that they wanted didn&#039;t appear?

For example, someone queries for his or her name, and finds that the embarrassing drunk photo doesn&#039;t show up in the first page of results.  That&#039;s a success!  Or someone writing a patent does a bunch of searches for related work, and finds that there is nothing related.  Success!

How does the search engine measure those scenarios?  Does a 0.5%, 1% or even 2% experiment really give you enough data to tease out the difference in these two types of searches?  Does a 100% experiment even give you enough data?</description>
		<content:encoded><![CDATA[<p><i>But even there quality trumps quality</i></p>
<p>Do you mean &#8220;quality trumps quantity&#8221;?</p>
<p><i>how you choose what to test matters a lot more than how many tests you run.</i></p>
<p>And how you choose to measure what you test matters a lot more than either of those two factors. It&#8217;s a point that we discussed before in your comments, with Max Wilson and others. If someone runs a query and then doesn&#8217;t click anything, how can you tell the difference between that search being unsuccessful, because they didn&#8217;t find the information that they wanted, and that search being successful, because they found that the information that they wanted didn&#8217;t appear?</p>
<p>For example, someone queries for his or her name, and finds that the embarrassing drunk photo doesn&#8217;t show up in the first page of results.  That&#8217;s a success!  Or someone writing a patent does a bunch of searches for related work, and finds that there is nothing related.  Success!</p>
<p>How does the search engine measure those scenarios?  Does a 0.5%, 1% or even 2% experiment really give you enough data to tease out the difference in these two types of searches?  Does a 100% experiment even give you enough data?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

