<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Google and Transparency</title>
	<atom:link href="http://thenoisychannel.com/2010/03/07/google-and-transparency/feed/" rel="self" type="application/rss+xml" />
	<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/</link>
	<description></description>
	<lastBuildDate>Sat, 11 Feb 2012 00:39:47 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Reflecting on 2010: Searching for Answers</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-9138</link>
		<dc:creator>Reflecting on 2010: Searching for Answers</dc:creator>
		<pubDate>Fri, 31 Dec 2010 01:30:43 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-9138</guid>
		<description>[...] they&#8217;ve made an impressive attempt to increase the transparency of relevance ranking. But, as I blogged earlier this year, I think that, at least for the time being, Google is making the right decision to keep some of its [...]</description>
		<content:encoded><![CDATA[<p>[...] they&#8217;ve made an impressive attempt to increase the transparency of relevance ranking. But, as I blogged earlier this year, I think that, at least for the time being, Google is making the right decision to keep some of its [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Weekly Suche &#171; hemju &#8211; beste seo und suchmaschinenoptimierung</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5516</link>
		<dc:creator>Weekly Suche &#171; hemju &#8211; beste seo und suchmaschinenoptimierung</dc:creator>
		<pubDate>Wed, 17 Mar 2010 11:17:41 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5516</guid>
		<description>[...] Google and Transparency &#8211; Noisy Channel [...]</description>
		<content:encoded><![CDATA[<p>[...] Google and Transparency &#8211; Noisy Channel [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Weekly Suche &#171; hemju Linz SEO</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5515</link>
		<dc:creator>Weekly Suche &#171; hemju Linz SEO</dc:creator>
		<pubDate>Wed, 17 Mar 2010 11:02:33 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5515</guid>
		<description>[...] Google and Transparency &#8211; Noisy Channel [...]</description>
		<content:encoded><![CDATA[<p>[...] Google and Transparency &#8211; Noisy Channel [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Weekly Search &#38; Social News: 03/16/2010 &#124; Search Engine Journal</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5510</link>
		<dc:creator>Weekly Search &#38; Social News: 03/16/2010 &#124; Search Engine Journal</dc:creator>
		<pubDate>Tue, 16 Mar 2010 13:41:52 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5510</guid>
		<description>[...] Google and Transparency  - Noisy Channel [...]</description>
		<content:encoded><![CDATA[<p>[...] Google and Transparency  - Noisy Channel [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5479</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Wed, 10 Mar 2010 00:19:38 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5479</guid>
		<description>&quot;I do think it would be interesting to have an independent search engine that was completely transparent and to see if users who tried it stuck with it.&quot;

Agreed.  Caveat, however: transparency alone wouldn&#039;t be enough.  Because not all queries (e.g. navigation) need transparency.  So the independent search engine that we&#039;d need to build to run this experiment has to be equal in quality to Google for navigational queries, with the ability to switch to transparent interaction for informational/exploratory queries.  Given that requirement, Wikia Search does not count as such an experiment.

&quot;But the success of Hadoop that I alluded to earlier is evidence that this transparency isn’t completely obscured.&quot;

Yup, I agreed to the Hadoop/Mapreduce benefits in my comment #8, above.  Still, that&#039;s not a part of the system (other than query response time) that the user sees.  I&#039;m interested in HCIR.

&quot;I’m glad we agree that we’re debating shades of gray.&quot;

Of course.  We&#039;re not republicans, now, are we? :-)

&quot;I agree with colleagues that there’s a real risk to users in over-disclosing the details of ranking. I hope to see the day where it’s possible to have relevance without obscurity.&quot;

One day when you can finally, legally talk about it, I&#039;d like to know what it is you&#039;ve learned since joining the Big G that seems to have moved you more toward this risk-averse state.  Maybe that&#039;ll have to be in 20 years.  But I have sheer academic curiosity about the thinking process that is occurring.</description>
		<content:encoded><![CDATA[<p>&#8220;I do think it would be interesting to have an independent search engine that was completely transparent and to see if users who tried it stuck with it.&#8221;</p>
<p>Agreed.  Caveat, however: transparency alone wouldn&#8217;t be enough.  Because not all queries (e.g. navigation) need transparency.  So the independent search engine that we&#8217;d need to build to run this experiment has to be equal in quality to Google for navigational queries, with the ability to switch to transparent interaction for informational/exploratory queries.  Given that requirement, Wikia Search does not count as such an experiment.</p>
<p>&#8220;But the success of Hadoop that I alluded to earlier is evidence that this transparency isn’t completely obscured.&#8221;</p>
<p>Yup, I agreed to the Hadoop/Mapreduce benefits in my comment #8, above.  Still, that&#8217;s not a part of the system (other than query response time) that the user sees.  I&#8217;m interested in HCIR.</p>
<p>&#8220;I’m glad we agree that we’re debating shades of gray.&#8221;</p>
<p>Of course.  We&#8217;re not republicans, now, are we? <img src='http://thenoisychannel.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>&#8220;I agree with colleagues that there’s a real risk to users in over-disclosing the details of ranking. I hope to see the day where it’s possible to have relevance without obscurity.&#8221;</p>
<p>One day when you can finally, legally talk about it, I&#8217;d like to know what it is you&#8217;ve learned since joining the Big G that seems to have moved you more toward this risk-averse state.  Maybe that&#8217;ll have to be in 20 years.  But I have sheer academic curiosity about the thinking process that is occurring.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5477</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Tue, 09 Mar 2010 22:29:33 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5477</guid>
		<description>Fair point re: user education. But it&#039;s not just a question of what users see as possible--it&#039;s also a question of what search engines can do, and at what cost. I do think it would be interesting to have an independent search engine that was completely transparent and to see if users who tried it stuck with it. Does &lt;a href=&quot;http://en.wikipedia.org/wiki/Wikia_Search&quot; rel=&quot;nofollow&quot;&gt;Wikia Search&lt;/a&gt; (R.I.P.) count as such an experiment?

As for the research papers, I think Google has made a point of promoting what it believes to be the most important ones--and they are infrastructure papers, not search papers per se. But the success of Hadoop that I alluded to earlier is evidence that this transparency isn&#039;t completely obscured.

I&#039;m glad we agree that we&#039;re debating shades of gray. I&#039;d like to strive toward a lighter shade than the current one, but I agree with colleagues that there&#039;s a real risk to users in over-disclosing the details of ranking. I hope to see the day where it&#039;s possible to have relevance without obscurity.</description>
		<content:encoded><![CDATA[<p>Fair point re: user education. But it&#8217;s not just a question of what users see as possible&#8211;it&#8217;s also a question of what search engines can do, and at what cost. I do think it would be interesting to have an independent search engine that was completely transparent and to see if users who tried it stuck with it. Does <a href="http://en.wikipedia.org/wiki/Wikia_Search" rel="nofollow">Wikia Search</a> (R.I.P.) count as such an experiment?</p>
<p>As for the research papers, I think Google has made a point of promoting what it believes to be the most important ones&#8211;and they are infrastructure papers, not search papers per se. But the success of Hadoop that I alluded to earlier is evidence that this transparency isn&#8217;t completely obscured.</p>
<p>I&#8217;m glad we agree that we&#8217;re debating shades of gray. I&#8217;d like to strive toward a lighter shade than the current one, but I agree with colleagues that there&#8217;s a real risk to users in over-disclosing the details of ranking. I hope to see the day where it&#8217;s possible to have relevance without obscurity.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5474</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Tue, 09 Mar 2010 16:44:58 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5474</guid>
		<description>&quot;But it still is a disclosure that sheds light on the black box, and I think that supports Matt’s general argument.&quot;

Matt&#039;s general argument isn&#039;t that there are things about Google that are open/known.  His argument is that Google the company actively has an agenda to make more things transparent,  carries through with that agenda, and that Google should get more credit for making things transparent (the active followthrough) than its critics give it.

Because PageRank was disclosed by NSF-funded grad students doing their job of publishing, and not by Google, it doesn&#039;t support Matt&#039;s argument at all.  

I&#039;m sorry that I&#039;m being such a stickler about this point, but I see it quite strongly.  What we&#039;re talking about is what  Google as a company is actively doing. 

For that matter, Cutts writes: &quot;Google has continued to publish literally hundreds of research papers over the years. Those papers reveal many of the &quot;secret formulas&quot; for how Google works&quot;

I&#039;ll ask the question again: &quot;And sure, there are hundreds of papers by Googlers. But how many of those papers were done (primarily) by summer interns, who already came to Google with a solid sense of a research idea? And of the ones that are Google-internal only, how many of those are on production-code systems, vs raw research? Microsoft publishes hundreds of papers as well, the vast majority of which are never incorporated into shipped products. Does that also make Microsoft transparent?&quot;

In other words, how much of this is &quot;transparency through obscurity&quot;?  I.e., if there are hundreds of papers with secret formulas in them, and only 7 of those formulas actually make it into production code, and it is never made known which 7 those are, then how can this reasonably be called transparency?  If there is so much flak thrown up, then any real disclosure is essentially so obfuscated as to not really be transparency at all.

I do agree with you, Daniel, that it&#039;s not a black and white issue.  Google doesn&#039;t get an A+ on transparency, nor do they get an F-.  But the picture is also not quite as clean as Cutts paints it, either.</description>
		<content:encoded><![CDATA[<p>&#8220;But it still is a disclosure that sheds light on the black box, and I think that supports Matt’s general argument.&#8221;</p>
<p>Matt&#8217;s general argument isn&#8217;t that there are things about Google that are open/known.  His argument is that Google the company actively has an agenda to make more things transparent,  carries through with that agenda, and that Google should get more credit for making things transparent (the active followthrough) than its critics give it.</p>
<p>Because PageRank was disclosed by NSF-funded grad students doing their job of publishing, and not by Google, it doesn&#8217;t support Matt&#8217;s argument at all.  </p>
<p>I&#8217;m sorry that I&#8217;m being such a stickler about this point, but I see it quite strongly.  What we&#8217;re talking about is what  Google as a company is actively doing. </p>
<p>For that matter, Cutts writes: &#8220;Google has continued to publish literally hundreds of research papers over the years. Those papers reveal many of the &#8220;secret formulas&#8221; for how Google works&#8221;</p>
<p>I&#8217;ll ask the question again: &#8220;And sure, there are hundreds of papers by Googlers. But how many of those papers were done (primarily) by summer interns, who already came to Google with a solid sense of a research idea? And of the ones that are Google-internal only, how many of those are on production-code systems, vs raw research? Microsoft publishes hundreds of papers as well, the vast majority of which are never incorporated into shipped products. Does that also make Microsoft transparent?&#8221;</p>
<p>In other words, how much of this is &#8220;transparency through obscurity&#8221;?  I.e., if there are hundreds of papers with secret formulas in them, and only 7 of those formulas actually make it into production code, and it is never made known which 7 those are, then how can this reasonably be called transparency?  If there is so much flak thrown up, then any real disclosure is essentially so obfuscated as to not really be transparency at all.</p>
<p>I do agree with you, Daniel, that it&#8217;s not a black and white issue.  Google doesn&#8217;t get an A+ on transparency, nor do they get an F-.  But the picture is also not quite as clean as Cutts paints it, either.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5472</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Tue, 09 Mar 2010 16:24:13 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5472</guid>
		<description>&quot;I believe this is already happening in markets like enterprise search, but I don’t see the evidence for it in web search.&quot;

How much of this is chicken/egg?  I remember Steve Jobs claiming that there was no evidence that users wanted to watch video on an extremely small screen.  Six months later Apple released the video iPod, and now lots of people do.

If users don&#039;t know or don&#039;t understand what is possible, they&#039;ll never ask for it.  There has to be some education, first.  How much education is Google doing?</description>
		<content:encoded><![CDATA[<p>&#8220;I believe this is already happening in markets like enterprise search, but I don’t see the evidence for it in web search.&#8221;</p>
<p>How much of this is chicken/egg?  I remember Steve Jobs claiming that there was no evidence that users wanted to watch video on an extremely small screen.  Six months later Apple released the video iPod, and now lots of people do.</p>
<p>If users don&#8217;t know or don&#8217;t understand what is possible, they&#8217;ll never ask for it.  There has to be some education, first.  How much education is Google doing?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5470</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Tue, 09 Mar 2010 14:04:45 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5470</guid>
		<description>Fair enough, I&#039;ll concede that Google as a company doesn&#039;t get credit for publishing the paper. But it still is a disclosure that sheds light on the black box, and I think that supports Matt&#039;s general argument.

But you raise an interesting point. I&#039;m with you on the HCIR perspective of focusing on end-user penetrability. And Gregor is a refreshing example of a user who cares about the details of how the search engine works.

But I see far greater concern about the ranking details by site owners than by users, e.g. there is a lot more literature about how site owners can improve their site ranking than about how users can search more effectively.

According to Christensen&#039;s theory of disruptive technology (at least &lt;a href=&quot;http://en.wikipedia.org/wiki/Disruptive_technology#The_theory&quot; rel=&quot;nofollow&quot;&gt;Wikipedia&#039;s explanation&lt;/a&gt;), &quot;a disruptive technology may enter the market and provide a product which has lower performance than the incumbent but which exceeds the requirements of certain segments, thereby gaining a foothold in the market.&quot; If users value transparency, then there should be room for someone to offer a product that lags the incumbent on most measures but offers greater transparency. I believe this is already happening in markets like enterprise search, but I don&#039;t see the evidence for it in web search.

And, to be clear, I&#039;ve &lt;a href=&quot;http://thenoisychannel.com/2009/01/08/google-tech-talk-reconsidering-relevance/&quot; rel=&quot;nofollow&quot;&gt;advocated&lt;/a&gt; for it! But that doesn&#039;t make it so.</description>
		<content:encoded><![CDATA[<p>Fair enough, I&#8217;ll concede that Google as a company doesn&#8217;t get credit for publishing the paper. But it still is a disclosure that sheds light on the black box, and I think that supports Matt&#8217;s general argument.</p>
<p>But you raise an interesting point. I&#8217;m with you on the HCIR perspective of focusing on end-user penetrability. And Gregor is a refreshing example of a user who cares about the details of how the search engine works.</p>
<p>But I see far greater concern about the ranking details by site owners than by users, e.g. there is a lot more literature about how site owners can improve their site ranking than about how users can search more effectively.</p>
<p>According to Christensen&#8217;s theory of disruptive technology (at least <a href="http://en.wikipedia.org/wiki/Disruptive_technology#The_theory" rel="nofollow">Wikipedia&#8217;s explanation</a>), &#8220;a disruptive technology may enter the market and provide a product which has lower performance than the incumbent but which exceeds the requirements of certain segments, thereby gaining a foothold in the market.&#8221; If users value transparency, then there should be room for someone to offer a product that lags the incumbent on most measures but offers greater transparency. I believe this is already happening in markets like enterprise search, but I don&#8217;t see the evidence for it in web search.</p>
<p>And, to be clear, I&#8217;ve <a href="http://thenoisychannel.com/2009/01/08/google-tech-talk-reconsidering-relevance/" rel="nofollow">advocated</a> for it! But that doesn&#8217;t make it so.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5468</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Mon, 08 Mar 2010 18:14:33 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5468</guid>
		<description>&lt;i&gt;Given that Larry and Sergey are founders and major shareholders, I think Google should get at least a little credit for their actions.&lt;/i&gt;

With all respect, I still strongly disagree about giving even a little credit -- not to L&amp;S -- but to Google.  You have to look at the motivation of Larry and Sergey, and what was going on at the time.  At the time that they actually submitted the paper for publication, they were researchers at a university and funded by NSF money.  Their job, their primary product, their deliverable, was publications.  So when they published PageRank they did so before they ever started Google the company, and did so because it was the mandate of their academic position and government funding to do so.   Not because of any attempt to be more or less transparent.   Had they not published the paper, they would have not been properly doing their jobs at that time.  Transparency had nothing to do with it.

I hear what you&#039;re saying; they are indeed the founders of Google.  But I still do not see how you can give Google credit for something that was a requirement of a pre-Google job.  Doesn&#039;t make any sense.

But I think Gregor gets to the heart of this issue, above.  What Gregor is saying is that it doesn&#039;t really matter what the exact mathematics of the Google ranking algorithm is.  What matters is the end-user penetrability of that algorithm.  Is Google transparent enough so that a searcher using the Google system can give Google the right signals, in the right way, at the right time, so as to find the information that is needed?  And I think the answer to that is still no.  It&#039;s changing, and I do see small HCIR signs here and there.  But at the end of the day, that&#039;s the question that really matters.  

Of course, making Google more HCIR-transparent will most likely involve exposing, in some fashion or other, more of the Google ranking algorithm.  It might not be a raw exposure, but there will still have to be an exposure of some sort in order to give the users more control over their experience.  

So the question: How transparent is Google in that regard? 

From that perspective, all those papers on MapReduce (which I do give transparency credit for) and PageRank (which I don&#039;t give transparency credit for -- it was their job at the time!) don&#039;t matter one way or the other.  Because the technical details contained in those papers are still not exposed to the end user in a transparent way.  There is no way for the end user to say &quot;Stop using the PageRank popularity signal for my query, because I know that what I am looking for is in the tail!&quot;  

Transparency is therefore still at a minimum for what really matters: End user experience.</description>
		<content:encoded><![CDATA[<p><i>Given that Larry and Sergey are founders and major shareholders, I think Google should get at least a little credit for their actions.</i></p>
<p>With all respect, I still strongly disagree about giving even a little credit &#8212; not to L&amp;S &#8212; but to Google.  You have to look at the motivation of Larry and Sergey, and what was going on at the time.  At the time that they actually submitted the paper for publication, they were researchers at a university and funded by NSF money.  Their job, their primary product, their deliverable, was publications.  So when they published PageRank they did so before they ever started Google the company, and did so because it was the mandate of their academic position and government funding to do so.   Not because of any attempt to be more or less transparent.   Had they not published the paper, they would have not been properly doing their jobs at that time.  Transparency had nothing to do with it.</p>
<p>I hear what you&#8217;re saying; they are indeed the founders of Google.  But I still do not see how you can give Google credit for something that was a requirement of a pre-Google job.  Doesn&#8217;t make any sense.</p>
<p>But I think Gregor gets to the heart of this issue, above.  What Gregor is saying is that it doesn&#8217;t really matter what the exact mathematics of the Google ranking algorithm is.  What matters is the end-user penetrability of that algorithm.  Is Google transparent enough so that a searcher using the Google system can give Google the right signals, in the right way, at the right time, so as to find the information that is needed?  And I think the answer to that is still no.  It&#8217;s changing, and I do see small HCIR signs here and there.  But at the end of the day, that&#8217;s the question that really matters.  </p>
<p>Of course, making Google more HCIR-transparent will most likely involve exposing, in some fashion or other, more of the Google ranking algorithm.  It might not be a raw exposure, but there will still have to be an exposure of some sort in order to give the users more control over their experience.  </p>
<p>So the question: How transparent is Google in that regard? </p>
<p>From that perspective, all those papers on MapReduce (which I do give transparency credit for) and PageRank (which I don&#8217;t give transparency credit for &#8212; it was their job at the time!) don&#8217;t matter one way or the other.  Because the technical details contained in those papers are still not exposed to the end user in a transparent way.  There is no way for the end user to say &#8220;Stop using the PageRank popularity signal for my query, because I know that what I am looking for is in the tail!&#8221;  </p>
<p>Transparency is therefore still at a minimum for what really matters: End user experience.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil Simon</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5466</link>
		<dc:creator>Phil Simon</dc:creator>
		<pubDate>Mon, 08 Mar 2010 15:24:23 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5466</guid>
		<description>I agree that this is a really massive subject for a blog post. However, I agree that Google can&#039;t give away the store. I have no problem with not knowing every detail in how Google ranks pages. I&#039;d be silly to presume that I would be able to understand it all anyway.</description>
		<content:encoded><![CDATA[<p>I agree that this is a really massive subject for a blog post. However, I agree that Google can&#8217;t give away the store. I have no problem with not knowing every detail in how Google ranks pages. I&#8217;d be silly to presume that I would be able to understand it all anyway.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregor Erbach</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5465</link>
		<dc:creator>Gregor Erbach</dc:creator>
		<pubDate>Mon, 08 Mar 2010 14:18:00 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5465</guid>
		<description>Well, I liked the dash because I am lazy - it was just so easy to replace a blank with a dash -- phrase search without any extra keystrokes.  But I am not complaining - Google has done a lot to make searching easy and effective.</description>
		<content:encoded><![CDATA[<p>Well, I liked the dash because I am lazy &#8211; it was just so easy to replace a blank with a dash &#8212; phrase search without any extra keystrokes.  But I am not complaining &#8211; Google has done a lot to make searching easy and effective.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5464</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Mon, 08 Mar 2010 13:33:14 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5464</guid>
		<description>Gregor, I understand changes in search behavior over time can be confusing, but I think Google has tried to explain many of them. &lt;a href=&quot;http://googleblog.blogspot.com/2010/01/helping-computers-understand-language.html&quot; rel=&quot;nofollow&quot;&gt;Contextual synonym inference&lt;/a&gt; may explain how adding a term makes the result set larger--though I can&#039;t know for sure without seeing your particular example. Localization of results is something Google has &lt;a href=&quot;http://googleblog.blogspot.com/2008/07/technologies-behind-google-ranking.html&quot; rel=&quot;nofollow&quot;&gt;publicized&lt;/a&gt; with pride. And I&#039;m not sure why you&#039;re using a dash rather than quotation marks to specify phrases--you might want to look at this &lt;a href=&quot;http://www.google.com/support/websearch/bin/answer.py?hl=en&amp;answer=136861&quot; rel=&quot;nofollow&quot;&gt;help page&lt;/a&gt;.

In any case, I agree that it would be great for researchers to run different algorithms and approaches on a Google-size index and with real-life queries. I think researchers are in a position to build a sufficiently large index--in part because of the publications I describe above. But what researchers really want--at least from what I have heard--is access to query logs and actual user traffic. That raises major privacy concerns for users, as well as concern about abuse by spammers. Don&#039;t forget &lt;a href=&quot;http://en.wikipedia.org/wiki/AOL_search_data_scandal&quot; rel=&quot;nofollow&quot;&gt;what happened a few years ago&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>Gregor, I understand changes in search behavior over time can be confusing, but I think Google has tried to explain many of them. <a href="http://googleblog.blogspot.com/2010/01/helping-computers-understand-language.html" rel="nofollow">Contextual synonym inference</a> may explain how adding a term makes the result set larger&#8211;though I can&#8217;t know for sure without seeing your particular example. Localization of results is something Google has <a href="http://googleblog.blogspot.com/2008/07/technologies-behind-google-ranking.html" rel="nofollow">publicized</a> with pride. And I&#8217;m not sure why you&#8217;re using a dash rather than quotation marks to specify phrases&#8211;you might want to look at this <a href="http://www.google.com/support/websearch/bin/answer.py?hl=en&#038;answer=136861" rel="nofollow">help page</a>.</p>
<p>In any case, I agree that it would be great for researchers to run different algorithms and approaches on a Google-size index and with real-life queries. I think researchers are in a position to build a sufficiently large index&#8211;in part because of the publications I describe above. But what researchers really want&#8211;at least from what I have heard&#8211;is access to query logs and actual user traffic. That raises major privacy concerns for users, as well as concern about abuse by spammers. Don&#8217;t forget <a href="http://en.wikipedia.org/wiki/AOL_search_data_scandal" rel="nofollow">what happened a few years ago</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gregor Erbach</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5462</link>
		<dc:creator>Gregor Erbach</dc:creator>
		<pubDate>Mon, 08 Mar 2010 06:01:35 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5462</guid>
		<description>It is no wonder that Google&#039;s algorithms get a reputation for being non-transparent. For a long time, adding a search term to a Google query would give a smaller result set; recently I added a search term and the result set got larger! A long, long time ago, a Google query would give the same results all over the world; now I get wildly different results depending on where I am (being in Belgium, I get mostly Dutch-language pages).  And for about a decade connecting search terms with a dash operator meant that the terms should occur as a phrase in the document -- not any longer.  No wonder I use Google with a feeling that I don&#039;t really know what&#039;s going on under the hood. 

Anyway, there are good reasons for obscurity for the commercial and search quality reasons you mentioned. However, in order to advance scientific understanding of search quality, there is also a need for transparency. I mean that it should be possible for IR and HCIR researchers to run different algorithms and approaches on a Google-size index and with real-life queries, and publish the algorithms and results.  It would be nice to see Google contribute to such an endeavour, in order to prevent a situation where the knowledge about how to deliver a great search experience is confined to only a handful of big companies.</description>
		<content:encoded><![CDATA[<p>It is no wonder that Google&#8217;s algorithms get a reputation for being intransparent. For a long time, adding a search term to a Google query would give a smaller result set; recently I added a search term and the result set got larger! Long, long time ago, a Google query would give the same results all over the world, now I get wildly different results depending on where I am (being in Belgium, I get mostly Dutch-language pages).  And for about a decade connecting search terms with a dash operator meant that the terms should occur as a phrase in the document &#8212; not any longer.  No wonder I use Google with a feeling that I don&#8217;t really know what&#8217;s going on under the hood. </p>
<p>Anyway, there are good reasons for obscurity for the commercial and search quality reasons you mentioned. However, in order to advance scientific understanding of search quality, there is also a need for transparency. I mean that it should be possible for IR and HCIR researchers to run different algorithms and approaches on a Google-size index and with real-life queries, and publish the algorithms and results.  It would be nice to see Google contribute to such an endeavour, in order to prevent a situation where the knowledge about how to deliver a great search experience is confined to only a handful of big companies.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5459</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Mon, 08 Mar 2010 02:37:16 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5459</guid>
		<description>Given that Larry and Sergey are founders and major shareholders, I think Google should get at least a little credit for their actions. :-)

And yes, the measure doesn&#039;t conform precisely to the paper--and in any case it&#039;s only &lt;a href=&quot;http://searchengineland.com/googles-norvig-pagerank-is-overhyped-37282&quot; rel=&quot;nofollow&quot;&gt;one factor&lt;/a&gt;. But, to put it in the context of Matt&#039;s post: &quot;One of the most widely-discussed parts of Google&#039;s scoring has always been PageRank. That &#039;secret ingredient&#039; is hardly a secret.&quot;

As for the papers, MapReduce, the Google File System, Bigtable, and Protocol Buffers are essential production tools. I agree with Matt that Google deserves credit for enabling &lt;a href=&quot;http://en.wikipedia.org/wiki/Hadoop&quot; rel=&quot;nofollow&quot;&gt;Hadoop&lt;/a&gt;, thus &lt;a href=&quot;http://developer.yahoo.net/blogs/hadoop/2008/02/yahoo-worlds-largest-production-hadoop.html&quot; rel=&quot;nofollow&quot;&gt;supporting its largest web search competitor&lt;/a&gt;.

In any case, no one at Google denies keeping the precise details of ranking secret. But I&#039;d say that critics who extrapolate that the entire ranking algorithm is a black box overstate their case, given how much has been disclosed. Unless you feel that being transparent is like being pregnant--an all or nothing deal. I don&#039;t believe that.

Of course, the other issue is Google&#039;s motive for what secrecy it does keep. As per an official blog post by Google&#039;s counsel, Google stands accused of &lt;a href=&quot;http://googlepublicpolicy.blogspot.com/2010/02/committed-to-competing-fairly.html&quot; rel=&quot;nofollow&quot;&gt;demoting the positions of competitive sites&lt;/a&gt;. I don&#039;t personally believe this to be the case, nor have I even seen anyone present evidence for it. I do understand how some people--particularly site owners unhappy with their ranking--may feel Google is guilty unless it discloses everything in order to prove itself innocent. Given the stakes, I don&#039;t think it&#039;s reasonable to expect Google to make such a sacrifice--at the expense not only of its own competitive position but also of its users.

Jeremy, I know you have many points of disagreement with Google&#039;s approach, particularly with regard to transparency.  Perhaps the extent of Google&#039;s disclosure isn&#039;t enough to earn a passing score in your book. I&#039;m hardly giving it an A+. But I hope we can agree that it isn&#039;t black and white.</description>
		<content:encoded><![CDATA[<p>Given that Larry and Sergey are founders and major shareholders, I think Google should get at least a little credit for their actions. <img src='http://thenoisychannel.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>And yes, the measure doesn&#8217;t conform precisely to the paper&#8211;and in any case it&#8217;s only <a href="http://searchengineland.com/googles-norvig-pagerank-is-overhyped-37282" rel="nofollow">one factor</a>. But, to put it in the context of Matt&#8217;s post: &#8220;One of the most widely-discussed parts of Google&#8217;s scoring has always been PageRank. That &#8216;secret ingredient&#8217; is hardly a secret.&#8221;</p>
<p>As for the papers, MapReduce, the Google File System, Bigtable, and Protocol Buffers are essential production tools. I agree with Matt that Google deserves credit for enabling <a href="http://en.wikipedia.org/wiki/Hadoop" rel="nofollow">Hadoop</a>, thus <a href="http://developer.yahoo.net/blogs/hadoop/2008/02/yahoo-worlds-largest-production-hadoop.html" rel="nofollow">supporting its largest web search competitor</a>.</p>
<p>In any case, no one at Google denies keeping the precise details of ranking secret. But I&#8217;d say that critics who extrapolate that the entire ranking algorithm is a black box overstate their case, given how much has been disclosed. Unless you feel that being transparent is like being pregnant&#8211;an all or nothing deal. I don&#8217;t believe that.</p>
<p>Of course, the other issue is Google&#8217;s motive for what secrecy it does keep. As per an official blog post by Google&#8217;s counsel, Google stands accused of <a href="http://googlepublicpolicy.blogspot.com/2010/02/committed-to-competing-fairly.html" rel="nofollow">demoting the positions of competitive sites</a>. I don&#8217;t personally believe this to be the case, nor have I even seen anyone present evidence for it. I do understand how some people&#8211;particularly site owners unhappy with their ranking&#8211;may feel Google is guilty unless it discloses everything in order to prove itself innocent. Given the stakes, I don&#8217;t think it&#8217;s reasonable to expect Google to make such a sacrifice&#8211;at the expense not only of its own competitive position but also of its users.</p>
<p>Jeremy, I know you have many points of disagreement with Google&#8217;s approach, particularly with regard to transparency.  Perhaps the extent of Google&#8217;s disclosure isn&#8217;t enough to earn a passing score in your book. I&#8217;m hardly giving it an A+. But I hope we can agree that it isn&#8217;t black and white.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5457</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Mon, 08 Mar 2010 01:02:23 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5457</guid>
		<description>While this topic deserves a longer response, let me just quickly say that I don&#039;t think Google should be trying to claim any credit for publishing the &quot;Anatomy of a Large Scale Hypertextual Web Search Engine&quot; paper.  Why?  Because Google the company never published this paper.  Sergey and Larry, the grad students, published this paper.  Their affiliations on the paper are &quot;Stanford&quot;, not Google.  

So it is disingenuous, at the very least, to give Google credit for this.  This information was released (or at least submitted for publication with the intention of being released) before Google the company existed.  So whatever L&amp;S&#039;s attitudes about transparency were as grad students, I think they changed once they became incorporated as Google.  

And I cannot count the number of times I&#039;ve seen Google talks in which the Google speaker has intentionally gone out of his or her way to mention that the form of PageRank that they use in today&#039;s engine is nothing like the one published in the paper.  I don&#039;t know if that&#039;s the truth, or a deliberate attempt at obfuscation.  But the upshot is that this means that Google is even less transparent than claimed, because it has the feel of trying to throw people off track -- now one doesn&#039;t know what to believe anymore.. the published paper or the research talks.

So to go back now and try and claim &quot;transparency credit&quot; for this paper... makes me uneasy.

And sure, there are hundreds of papers by Googlers.  But how many of those papers were done (primarily) by summer interns, who already came to Google with a solid sense of a research idea?  And of the ones that are Google-internal only, how many of those are on production-code systems, vs raw research?  Microsoft publishes hundreds of papers as well, the vast majority of which are never incorporated into shipped products.  Does that also make Microsoft transparent?</description>
		<content:encoded><![CDATA[<p>While this topic deserves a longer response, let me just quickly say that I don&#8217;t think Google should be trying to claim any credit for publishing the &#8220;Anatomy of a Large Scale Hypertextual Web Search Engine&#8221; paper.  Why?  Because Google the company never published this paper.  Sergey and Larry, the grad students, published this paper.  Their affiliations on the paper are &#8220;Stanford&#8221;, not Google.  </p>
<p>So it is disingenuous, at the very least, to give Google credit for this.  This information was released (or at least submitted for publication with the intention of being released) before Google the company existed.  So whatever L&amp;S&#8217;s attitudes about transparency were as grad students, I think they changed once they became incorporated as Google.  </p>
<p>And I cannot count the number of times I&#8217;ve seen Google talks in which the Google speaker has intentionally gone out of his or her way to mention that the form of PageRank that they use in today&#8217;s engine is nothing like the one published in the paper.  I don&#8217;t know if that&#8217;s the truth, or a deliberate attempt at obfuscation.  But the upshot is that this means that Google is even less transparent than claimed, because it has the feel of trying to throw people off track &#8212; now one doesn&#8217;t know what to believe anymore.. the published paper or the research talks.</p>
<p>So to go back now and try and claim &#8220;transparency credit&#8221; for this paper&#8230; makes me uneasy.</p>
<p>And sure, there are hundreds of papers by Googlers.  But how many of those papers were done (primarily) by summer interns, who already came to Google with a solid sense of a research idea?  And of the ones that are Google-internal only, how many of those are on production-code systems, vs raw research?  Microsoft publishes hundreds of papers as well, the vast majority of which are never incorporated into shipped products.  Does that also make Microsoft transparent?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: renaissance chambara &#124; Ged Carroll - Links of the day</title>
		<link>http://thenoisychannel.com/2010/03/07/google-and-transparency/comment-page-1/#comment-5455</link>
		<dc:creator>renaissance chambara &#124; Ged Carroll - Links of the day</dc:creator>
		<pubDate>Mon, 08 Mar 2010 00:01:42 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2990#comment-5455</guid>
		<description>[...] Google and Transparency [...]</description>
		<content:encoded><![CDATA[<p>[...] Google and Transparency [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

