The Noisy Channel

 

Real-Time But Not Ready For Prime Time

June 18th, 2009 · 12 Comments · General

Extra, extra, read all about it–two new real-time search engines debuted today: CrowdEye and Collecta.

I love the headlines from Techmeme:

Yes, folks, it’s really, really, real-time! Of course Twitter and Facebook have their own real-time search offerings. And apparently Google, Yahoo, and Microsoft are looking hard at real-time too.

I concede that there’s something in this real-time mania. I’ve live-tweeted events, and I’ve followed others who were doing so. I certainly read current news and blogs–as they say, today’s newspaper wraps tomorrow’s fish (someone will have to translate the expression for folks who’ve never read an analog newspaper). So yes, recency / freshness is certainly a concern in information seeking.

But it’s not the only one, and I doubt it’s the dominant one. Moreover, the dismissal of web search engines as if their index contents are ancient history is preposterous. Search for iran election on Google, Yahoo, or Bing, and you see a lot of current news. I suppose Twitter offers more recently generated bits, but the main virtue there is not the immediacy–rather, it’s the social nature of the content. For example, a number of people are following @persiankiwi for a personal perspective. I’ll let you decide for yourselves if Collecta or CrowdEye offer something new or valuable–I’m still waiting for the former to show me anything at all!

I know that the technology press likes new buzzwords, and “real-time” search is surely the buzzword du jour, even giving “semantic” search a run for its money. And I understand how many in the blogosphere feel it is their moral duty to cheer on any start-up that makes a go at disrupting the current regime. But I wish these folks would evaluate the new entrants on their merits, rather than simply on the drama of the David vs. Goliath story.

I understand what it’s like on the startup side–it wasn’t that long ago that few people outside the Boston-area technology scene had heard of Endeca. For a long time, I was jealous of people whose companies had generated more buzz. But, in retrospect, I’m at least glad that my colleagues and I had a chance to build a robust product before the press noticed us. Overenthusiastic press isn’t necessarily a good thing, as I’m sure a line-up of prematurely crowned Google killers can attest.

In that spirit, I hope that CrowdEye and Collecta bring something interesting to the market. But I doubt that “real-time” search will cut it, especially if it’s not ready for prime time.

12 responses so far ↓

  • 1 Arne van Elk // Jun 18, 2009 at 7:04 pm

    Wise and necessary observation. Just checked Collecta myself, and though it looks interesting, it’s not delivering yet.

  • 2 Christopher Rines // Jun 19, 2009 at 2:03 am

    Let me start by saying I wish these startups luck. I’m in the middle of it so I know how hard it is to build something and carve out a niche…

    However…

    The value of real-time streams decreases over time, making a real-time search engine not all that useful; a normal search engine will do perfectly well, possibly better, as it will theoretically group multiple data sources together by relevance rather than just the real-time stream(s).

    Now better real-time filtering, that’s something of use.

  • 3 jeremy // Jun 19, 2009 at 12:10 pm

    Yeah, that’s my question.. “search” vs. “routing and filtering”. Do people really need to search in real time? Or do they need real-time “channels” (routes/filters/whatever)?

    It seems to me that the latter would be useful. But the former?

  • 4 Daniel Lemire // Jun 19, 2009 at 2:44 pm

    Are you kidding me?

    I am in academia. For my colleagues, real-time collaboration means sending emails with attached Word documents.

    And many colleagues print out their emails so that they can read them on paper.

    We just recently started to archive (poorly) meeting notes on some web site. These documents are not even indexed! And I am pretty sure my colleagues do not even know that you can index them without exposing them to Google. OH? And it generally takes weeks if not months before the documents get posted.

    I don’t doubt some people need real-time, but my business (academia) does not need real time. I am quite happy to be a few weeks or months late on the latest buzz. Heck! I don’t even mind being a few years behind.

    BTW Facebook still does not offer RSS feeds or any kind of notification for its posting boards.

  • 5 Daniel Tunkelang // Jun 19, 2009 at 2:55 pm

    Facebook is a walled garden–I think that’s a separate problem.

    I do see value in news / blog alerts (routing / filtering if you prefer those terms), though I don’t really care if it’s “real time” or 24 hours delayed. RSS works well at the scale I care about, e.g., my vanity queries. I would hardly use RSS for alerts about Iran. There what I want is summarization and exploratory search. I do care about freshness, but today’s online news sites are fresh enough for me. I don’t see the value in reading the latest tweet–I’m happy to wait a little for some aggregation and de-noising.

  • 6 James Ostheimer // Jun 19, 2009 at 11:08 pm

    I have to say that I agree with this analysis: real-time alone is not useful in all contexts. It seems that most places are going with the model of supplementing real-time results with crawled shared URLs. I’ve been trying it the other way, using regular search results supplemented with real-time information, at http://www.re-searchr.com. Not sure how well it works, but I like this concept better than the status quo.

    James

  • 7 Daniel Tunkelang // Jun 20, 2009 at 1:07 am

    James, thanks for the link–that’s an interesting approach. Another way to get something along those lines is to use a Greasemonkey script to merge Twitter results into those from Google: http://userscripts.org/scripts/show/43451

    Or, for those who prefer Bing: http://userscripts.org/scripts/show/50665

  • 8 James Ostheimer // Jun 20, 2009 at 2:00 pm

    Those scripts are great, but they basically just slap Twitter results at the top of the search page, which is pretty similar to what Google does for news, and works okay.

    I’ve been trying to actually evaluate all the results together and rank them: real-time and old-style search, together. My thought is, wouldn’t it be nice to do a search and get the best real-time results mixed in with the best deep web search results, with logical rankings (the most recent and most widely shared will bubble up, but older popular links maintain value)?

    James

  • 9 Daniel Tunkelang // Jun 21, 2009 at 6:55 pm

    Agreed, your goal is more ambitious. But your interface is a bit overwhelming. The script approach may err too far on the side of simplicity, as you point out, but I think you need to aim for something a bit less complex if you want significant user adoption.

  • 10 Can Real-Time Search Help Hedge Funds? | The Noisy Channel // Jun 25, 2009 at 9:33 am

    […] haven’t exactly been generous in my opinions about the widespread obsession with “real-time” search. But in today’s Telegraph […]

  • 11 Paid Content : The Inside Word: Why Real-Time Search Is Overrated // Jun 26, 2009 at 11:41 pm

    […] “Yes, recency / freshness is certainly a concern in information seeking,” Tunkelang writes. “But it’s not the only one, and I doubt it’s the dominant one. Moreover, the dismissal […]

  • 12 The Inside Word: Why Real-Time Search Is Overrated — paidContent // Apr 4, 2012 at 4:13 pm

    […] “Yes, recency / freshness is certainly a concern in information seeking,” Tunkelang writes. “But it […]
