The Noisy Channel

 

Financial Times + Endeca = Newssift

March 18th, 2009 · 18 Comments · General

Now that the cat is out of the bag, I’m proud to tell readers here about an effort I’ve been involved with over the past few months. As reported in TechCrunch and Search Engine Land, the Financial Times just launched Newssift, a semantic search engine, powered by Endeca, that sifts through business news. Regular readers may recognize the application from an example I used in my presentation on exploring semantic means.

I’m very excited about the Financial Times embracing exploratory search, and I think Erik Schonfeld from TechCrunch gets it right when he notes that the site “employs several subtle navigational techniques that make it more of a discovery engine than a search engine.”

Sorry to keep you guys in the dark about this for so long, but I hope it was worth the wait.

Here is some of the news coverage of the launch (yes, I’m shamelessly plugging this):

18 responses so far ↓

  • 1 Christopher // Mar 18, 2009 at 9:03 pm

    Let me be the first to say congratulations. 🙂

    The UI is a great example of the power of exploratory search. For doing research on specific news topics it’s going to add a lot.

  • 2 Jon // Mar 18, 2009 at 11:51 pm

    Wow — it looks fantastic. Incredibly rich exploratory experience. Congratulations.

  • 3 Daniel Tunkelang // Mar 18, 2009 at 11:59 pm

    Thanks, guys! I wish I could take more credit for it–my role was fairly small. The site reflects a collaboration between Endeca and FT Search–which is basically a startup within the Financial Times. It’s been great working with them.

  • 4 Christopher // Mar 19, 2009 at 12:21 am

    Don’t be too modest, it’s your initial hard work at the centre of this thing. 🙂

    If FT Search agrees, I’d be curious to hear about the system flow: a simple architecture overview of the components used to create the tool.

  • 5 Joe Cardillo // Mar 19, 2009 at 1:39 am

    Quote from Erik Schonfeld puts it nicely. Congrats, it’s a pretty neat launch and I can see it being useful in a variety of ways.

    Any insight into how the sentiment measurement works? For example, looking at all of the news articles on Reuters about Obama in a fixed time range, they are overwhelmingly negative and neutral, rarely positive. What types of criteria are used to make that judgement?

  • 6 MarkH // Mar 19, 2009 at 5:26 am

    Nice work.
    Some initial thoughts from a brief look:

    * Referring back to a previous post of yours – Obama is your Coldplay. He’s everywhere, suggesting the algo could do with some tweaks.

    * Key themes are thrown off by non-content parts of pages, e.g. navigational text like “Related articles”. You need to work on distinguishing content from a site’s “chrome”.

    * I saw duplicate names in “Person” for the same person.

    * When you show matches it would be useful to show highlight snippets – i.e. exactly where in the articles was Obama associated with Woolworths?

    Don’t wish to seem too negative though. A nice job overall.

  • 7 Steve Gurney // Mar 19, 2009 at 7:22 am

    Hi Daniel

    It’s funny to see this, especially with the FT. I remember in 2004 Jesse etc building prototypes for FT.com to show them this type of functionality with Endeca and the potential it could bring. I’m glad they saw the light eventually.

    I’m surprised that TechCrunch didn’t pick up on the effect this would have on advertising revenues. Exploratory search has a dramatic impact on page views.

    Steve

  • 8 Daniel Tunkelang // Mar 19, 2009 at 8:47 am

    Thanks for the feedback, both flattering and critical.

    Joe, document-level sentiment is through a partner–we’re just aggregating across sets. As for the Obama coverage not being all that positive, it’s been getting increasingly negative over time. The sentiment analysis isn’t perfect–particularly in complex stories–but I think the trend reflects reality.

    Mark, I like the Obama as Coldplay analogy. It’s not clear how best to handle a topic that has media saturation. And there’s room for improvement in a bunch of places, including the ones you cite. I’m glad these guys understand iterative delivery.

    Steve, great to see you at The Noisy Channel! Yes, this was a long time coming, but worth the wait. And you’re right that the tech press did overlook the stickiness angle. But I can’t complain too much: it’s not every day that an enterprise software company like Endeca makes it into TechCrunch, and the coverage was not only positive but informative.

  • 9 ken // Mar 19, 2009 at 8:59 am

    Very nice, I’m impressed. Re Obama and negative sentiment, I had to laugh. I got some complaints from clients about Daylife’s sentiment algorithm and Obama coming out too negative. “How could this be? Everybody loves him. Something must be wrong with the algorithm”. Turns out, not everyone does. And with the economy in such a bad state, the negative sentiment tends to rub off on him, both for the algorithm and for humans.

  • 10 Daniel Tunkelang // Mar 19, 2009 at 9:14 am

    At least he didn’t rip off a song from Joe Satriani. 🙂

  • 11 Reverse Engineering NewsSift « NP-Harder // Mar 19, 2009 at 11:37 am

    […] sentiment, in a way entirely analogous to what we do at Daylife (although only through our API). According to Daniel Tunkelang over at the Noisy Channel, the analysis is done by another firm. There are a few out there, […]

  • 12 Daniel Lemire // Mar 19, 2009 at 2:11 pm

    Nice!

  • 13 Amy Grabowski // Mar 19, 2009 at 4:31 pm

    Thanks for the mention, and for being a terrific collaborative partner, Endeca! We’re encouraged by the reception of the beta and appreciate all feedback as we continue to refine the site (and work out the kinks) and develop Newssift into a useful business tool.

  • 14 Aaswath // Mar 20, 2009 at 1:05 am

    Oh hey, person, place and organizations! This reminds me of something 🙂 Congrats Daniel!

  • 15 Thomas Kjelsrud // Mar 20, 2009 at 7:38 am

    Although very cool technically and with cool features like sentiment, I did not find it that user friendly. The way it reloads the page and stacks criteria made it difficult for me to “reset” and start searches from scratch. It has strayed somewhat from the ease of pages like Google.com. Just some thoughts on otherwise great work.

  • 16 Newssift » Blog Archive » Newssift Beta Launch // Mar 23, 2009 at 9:35 am

    […] Noisy Channel Financial Times + Endeca = Newssift This entry was posted on Friday, March 20th, 2009 at 8:14 am and is filed under News + Events. […]

  • 17 Guy Valerio // Mar 25, 2009 at 8:18 am

    @Steve Gurney
    Hi Steve. Yes, Jesse did some excellent work for us in 2004 as did FAST and SOM. We understood faceted classification back then; in fact that is what we specified 😉

    Daniel – great blog,
    Guy

  • 18 Daniel Tunkelang // Mar 25, 2009 at 12:15 pm

    Guy, thanks! I’m delighted by what FT and Endeca built together in such a short time frame, and there’s a great backlog of ideas for next steps.
