<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Blogs I Read: Chris Dixon (cdixon.org)</title>
	<atom:link href="http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/feed/" rel="self" type="application/rss+xml" />
	<link>http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/</link>
	<description></description>
	<lastBuildDate>Sat, 11 Feb 2012 00:39:47 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/comment-page-1/#comment-4309</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Mon, 31 Aug 2009 02:40:59 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2500#comment-4309</guid>
		<description>&lt;i&gt;Another way of looking at this is that it’s often more important to pick a good objective function (and particularly the right inputs) than to choose the best algorithm for optimizing relative to that objective function.&lt;/i&gt;

I... yes ok.  If we&#039;re just making that narrow of a claim, I think I might be able to get on board.  

I still have this nagging feeling though. Think about it this way: what is the difference between (a) choosing the right input to your learning algorithm, and (b) choosing the right constraint on that input in the learning algorithm?

Everyone else seems to be expressing a preference for (a).  But I think that they&#039;re equivalent, and that they have different advantages and disadvantages.

I think I&#039;ll blog about it tomorrow, using a music information retrieval example.  Will let you know when it&#039;s up.</description>
		<content:encoded><![CDATA[<p><i>Another way of looking at this is that it’s often more important to pick a good objective function (and particularly the right inputs) than to choose the best algorithm for optimizing relative to that objective function.</i></p>
<p>I&#8230; yes ok.  If we&#8217;re just making that narrow of a claim, I think I might be able to get on board.  </p>
<p>I still have this nagging feeling though. Think about it this way: what is the difference between (a) choosing the right input to your learning algorithm, and (b) choosing the right constraint on that input in the learning algorithm?</p>
<p>Everyone else seems to be expressing a preference for (a).  But I think that they&#8217;re equivalent, and that they have different advantages and disadvantages.</p>
<p>I think I&#8217;ll blog about it tomorrow, using a music information retrieval example.  Will let you know when it&#8217;s up.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/comment-page-1/#comment-4308</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Mon, 31 Aug 2009 00:11:15 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2500#comment-4308</guid>
		<description>Well, I obviously can&#039;t speak for Chris, but I think he means discovering data you haven&#039;t been using, rather than improving the feature selection on the data that you have been using.

In particular, I read his example of PageRank as arguing that Google&#039;s improvement over the dominant approaches it displaced came from introducing the use of links, rather than from simply assigning more weight to them relative to other factors.

Another way of looking at this is that it&#039;s often more important to pick a good objective function (and particularly the right inputs) than to choose the best algorithm for optimizing relative to that objective function.</description>
		<content:encoded><![CDATA[<p>Well, I obviously can&#8217;t speak for Chris, but I think he means discovering data you haven&#8217;t been using, rather than improving the feature selection on the data that you have been using.</p>
<p>In particular, I read his example of PageRank as arguing that Google&#8217;s improvement over the dominant approaches it displaced came from introducing the use of links, rather than from simply assigning more weight to them relative to other factors.</p>
<p>Another way of looking at this is that it&#8217;s often more important to pick a good objective function (and particularly the right inputs) than to choose the best algorithm for optimizing relative to that objective function.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/comment-page-1/#comment-4307</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Sun, 30 Aug 2009 23:56:18 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2500#comment-4307</guid>
		<description>&lt;i&gt;looking at the right data trumps picking the right data mining algorithm&lt;/i&gt;

Just to clarify: We&#039;re not talking about good feature selection, are we?  (i.e. just picking the right attributes out of whatever data we currently have.)  

And we&#039;re also not talking about intelligently-initialized machine learning algorithms (i.e. what some might call &quot;structured&quot; learning, or initializing your learning algorithm with domain-dependent knowledge so as to guide the learning algorithm into the best task-specific models.)

Instead, we&#039;re talking about simply identifying new sources of data?  Is this correct?</description>
		<content:encoded><![CDATA[<p><i>looking at the right data trumps picking the right data mining algorithm</i></p>
<p>Just to clarify: We&#8217;re not talking about good feature selection, are we?  (i.e. just picking the right attributes out of whatever data we currently have.)  </p>
<p>And we&#8217;re also not talking about intelligently-initialized machine learning algorithms (i.e. what some might call &#8220;structured&#8221; learning, or initializing your learning algorithm with domain-dependent knowledge so as to guide the learning algorithm into the best task-specific models.)</p>
<p>Instead, we&#8217;re talking about simply identifying new sources of data?  Is this correct?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/comment-page-1/#comment-4306</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Sun, 30 Aug 2009 23:04:53 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2500#comment-4306</guid>
		<description>I think it&#039;s worth quoting from the post: &quot;significant AI breakthroughs come from identifying or creating new sources of data, not inventing new algorithms.&quot; He&#039;s not arguing that data scale is everything, but rather that looking at the right data trumps picking the right data mining algorithm. And I&#039;m generally in agreement with him there.</description>
		<content:encoded><![CDATA[<p>I think it&#8217;s worth quoting from the post: &#8220;significant AI breakthroughs come from identifying or creating new sources of data, not inventing new algorithms.&#8221; He&#8217;s not arguing that data scale is everything, but rather that looking at the right data trumps picking the right data mining algorithm. And I&#8217;m generally in agreement with him there.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jeremy</title>
		<link>http://thenoisychannel.com/2009/08/30/blogs-i-read-chris-dixon-cdixon-org/comment-page-1/#comment-4305</link>
		<dc:creator>jeremy</dc:creator>
		<pubDate>Sun, 30 Aug 2009 22:40:42 +0000</pubDate>
		<guid isPermaLink="false">http://thenoisychannel.com/?p=2500#comment-4305</guid>
		<description>&lt;i&gt;To make smarter systems, it’s all about the data&lt;/i&gt;

Do you believe it?  

Personally, I think statements like this are true for certain limited domains or application types, such as home page finding and factoid lookup.  Those are areas where large data is very useful in web search.

But when your information need is exploratory, I have a difficult time seeing how large data will help.  By definition, you have an information need or task that is orthogonal to the direction that the large data is pointing.  Exploratory search runs against the large data grain, not with it.</description>
		<content:encoded><![CDATA[<p><i>To make smarter systems, it’s all about the data</i></p>
<p>Do you believe it?  </p>
<p>Personally, I think statements like this are true for certain limited domains or application types, such as home page finding and factoid lookup.  Those are areas where large data is very useful in web search.</p>
<p>But when your information need is exploratory, I have a difficult time seeing how large data will help.  By definition, you have an information need or task that is orthogonal to the direction that the large data is pointing.  Exploratory search runs against the large data grain, not with it.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

