The Noisy Channel


Playing With Wolfram Alpha

May 7th, 2009 · 18 Comments · General

Woo hoo, I have preview access to Wolfram Alpha! I’ve only had a short time to play with it, but I can already report that my experience confirms my previously expressed expectations: the NLP is very brittle, but there’s great potential for structured queries on quantitative data. Here is an example use case that, in my view, shows Wolfram Alpha’s strengths:

[Image: Wolfram Alpha]

This bit of analysis tells a great story: Microsoft has almost three times as much revenue as Google, but Google has about 50% higher revenue per employee. Meanwhile, Yahoo is in third place on revenue, number of employees, and revenue per employee. Ouch.
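For anyone who wants to run this kind of comparison by hand, here is a minimal Python sketch of the arithmetic. The three companies are labeled A, B, and C, and their figures are round placeholders I picked to reproduce the ratios described above; they are assumptions for illustration, not the values Wolfram Alpha returned and not actual financial data. The names Company, rank_by, and COMPANIES are hypothetical as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    name: str
    revenue_usd: float  # annual revenue in US dollars (placeholder value)
    employees: int      # headcount (placeholder value)

    @property
    def revenue_per_employee(self) -> float:
        return self.revenue_usd / self.employees


def rank_by(companies, key):
    """Return company names sorted from largest to smallest by `key`."""
    return [c.name for c in sorted(companies, key=key, reverse=True)]


# Placeholder figures, chosen only so the ratios match the story above.
COMPANIES = [
    Company("A", 60e9, 90_000),
    Company("B", 21e9, 21_000),
    Company("C", 7e9, 14_000),
]

a, b, c = COMPANIES
revenue_ratio = a.revenue_usd / b.revenue_usd                     # A vs. B revenue
per_head_ratio = b.revenue_per_employee / a.revenue_per_employee  # B vs. A per head

print(f"A has {revenue_ratio:.2f}x the revenue of B")
print(f"B has {per_head_ratio:.2f}x the revenue per employee of A")
for label, key in [
    ("revenue", lambda co: co.revenue_usd),
    ("employees", lambda co: co.employees),
    ("revenue per employee", lambda co: co.revenue_per_employee),
]:
    print(f"ranking by {label}: {rank_by(COMPANIES, key)}")
```

With these placeholders, A leads on revenue and headcount, B leads on revenue per employee, and C comes last on all three measures, which is the shape of the result described above.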

As I said, this query shows Wolfram Alpha favorably. What you don’t see are the false starts it took me to get this query to work. The NLP interface, in my view, is a really bad idea. Instead, Wolfram Alpha should be helping users generate good structured queries–and, better yet, helping other businesses build such queries through APIs. Wolfram Alpha could deliver an excellent plug-in for Excel, if they can expose a workable query API. I have no idea whether the company is able or willing to go down this path, but I hope someone there is listening to this free advice.

I can’t share my account, but I’m willing to take suggestions for queries through the comment thread, and I’ll try my best to share what I learn.

18 responses so far ↓

  • 1 Joe Mako // May 7, 2009 at 5:38 pm

    That looks great for the current year. How would the query need to be structured to see a line graph of those 3 numbers by year for the last 10 years?

  • 2 Rangachari Anand // May 7, 2009 at 6:29 pm

    Looks interesting. Try car accidents in the US correlated with population density.

  • 3 Gene Golovchinsky // May 7, 2009 at 9:01 pm

    Seems like a good application for YQL to make it easier to mash up the data and to treat them uniformly in a programmatic way.

  • 4 Daniel Tunkelang // May 7, 2009 at 11:22 pm

    Joe: no luck on the time series–I can only get that for simple queries like “microsoft revenue 1998-2008”.

    Anand: I can get “united states car accidents” and “united states population density”, but no correlation–or even a breakdown of either statistic by state.

    Gene: I agree, this screams for programmatic access.

  • 5 jeremy // May 8, 2009 at 12:47 am

    I suddenly feel like I should be relearning Prolog…

  • 6 Ramaseshan // May 8, 2009 at 12:53 am

    Is it possible to find results for queries such as “who is the lead researcher in faceted browsing”?

  • 7 Max L. Wilson // May 8, 2009 at 3:43 am

    Ramaseshan’s sounds good. I was pleasantly surprised last time I searched Google for ‘Improving search interfaces

    Have you also tried adding (Endeca revenue/Endeca employees), for interest’s sake? 😉

  • 8 Daniel Tunkelang // May 8, 2009 at 8:46 am

    Ramaseshan: no luck–and not surprising, as they specialize in objective data. They’d have to quantify “lead”. Plus I doubt their curated data has anywhere near the completeness to include impact factors for HCIR researchers!

    Max: Endeca is privately held, so we’re not in any of the databases they use. I know–but if I told you, I’d have to kill you. Bad for my readership numbers!

  • 9 Perry Hewitt // May 8, 2009 at 9:43 am

    This is fascinating stuff — agreed on the programmatic access. I know researchers who would love to use this with census information. And marketers, should a tool like this fall into their evil hands, might like it to talk to their dbs …

  • 10 Daniel Tunkelang // May 8, 2009 at 3:11 pm

    I hope they have the sense to deliver the programmatic access that researchers and marketers need. They seem to be torn between two incompatible directions. On one hand, their computational focus could really appeal to information professionals. On the other hand, their NLP interface seems aimed (misguidedly, in my view) at casual web searchers. I hope they go in a direction that plays to their strengths, and that they can package it up to be useful to folks who would appreciate a computational engine. But their lack of clarity of purpose thus far is a bit discouraging.

  • 11 Nate Treloar // May 8, 2009 at 4:01 pm

    I’d be surprised if the Wolfram folks are not thinking in terms of a platform with the APIs and programmatic access, as opposed to just a destination for casual users. That would be the ‘wikinomical’ thing to do.

    Daniel, what does Wolfram do with queries like “koala”, “titanic”, or “daniel tunkelang”?

  • 12 Daniel Tunkelang // May 8, 2009 at 4:13 pm

    Nate, great to see you at The Noisy Channel! I agree that they should be thinking this way; but they really seem to be pushing the NLP angle. And I haven’t heard a peep about plans for an API. From this distance, it feels like a science project that needs some adult supervision.

    Anyway, to your queries:

    Koala has two associated result pages: the species (default), which shows the complete taxonomy of Phascolarctos cinereus, and a see-also for the dictionary / thesaurus entry.

    Titanic offers a “coming soon” for the disaster event (default), and see-alsos to an IMDB-like entry for the movie and a dictionary / thesaurus entry.

    No luck with my vanity query. Some people are in their database, but I’m clearly not famous enough. 🙂

  • 13 Diamonds // May 11, 2009 at 2:02 pm

    Looks very interesting. When do they intend to release this?

  • 14 Daniel Tunkelang // May 11, 2009 at 3:36 pm

    In the next week, according to the New York Times.

  • 15 Christopher // May 11, 2009 at 10:41 pm

    Using a structured query with selective natural language support (synonyms for subjects, operators & predicates) they could do something like this for the above query.

    COMPARE REVENUE of Microsoft, Yahoo & Google by NUMBER of EMPLOYEES.

  • 16 Daniel Tunkelang // May 12, 2009 at 9:19 pm

    Indeed, I’d love it if the response to my query helped me learn how to better structure future queries to avoid the travails of trial and error. And, of course, I’d appreciate a documented user guide / API.

  • 17 Wolfram Alpha - Re-Inventing the Command Line - Things On Top // Jun 12, 2009 at 3:00 am

    […] overnight, especially since we’re already so accustomed to the quick and dirty web search. Wolfram Alpha can be both tedious and puzzling to interact with, and must therefore provide some serious extra value if it expects us to make the […]

  • 18 Wolfram Alpha Answers Its Own Questions… | Search Done Right // Feb 28, 2010 at 11:25 pm

    […] answer a small portion of the questions asked of it. A lot has been said about its limitations in NLP but I actually see WA’s limitation more related to the data – how it collects it and how it […]
