The Noisy Channel

 

A Scaling Challenge for Twitter Search

March 15th, 2009 · 8 Comments · General

The other day, I explained why, as far as I can tell, Twitter’s existing search functionality isn’t that hard to implement. In a subsequent post, I argued that Twitter is not a search engine, an opinion that seems to place me in a minority of the blogosphere, albeit a substantial one.

Today I see that Twitter is having some trouble with what strikes me as a core constituency: SXSW attendees.

Daniel Terdiman at CNET writes that “At SXSW, attendees confront Twitter saturation”:

At SXSW, the standard is for everyone to include the tag “#sxsw” in their tweets. For example, on Friday, I was looking for sources for a different story and tweeted, “If you are launching an iPhone app at #sxsw, or know someone who is, please let me know. Thanks!”

That’s a great convention because it allows anyone wanting to know what’s going on to search Twitter for posts using any search term important to them.

I did a search for the “#sxsw” tag on Saturday afternoon and found that there had been 392 tweets with the term in just the previous 10 minutes. That number mushroomed to more than 1,500 in the previous hour.

Large volumes of results wouldn’t be such a problem if users had a way to summarize, navigate, and explore them. But that will take a search engine that offers more than reverse-date ordering of Boolean query results.
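To make "summarize" a bit more concrete, here is a minimal sketch of one possible approach: instead of listing every matching tweet in reverse-date order, count the other hashtags that co-occur with the query tag and offer the most frequent ones as facets for narrowing the results. The sample tweets and the `summarize` function are made up for illustration; nothing here reflects how Twitter's search actually works.

```python
import re
from collections import Counter

# Simple hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")


def summarize(tweets, query_tag, top_n=3):
    """Return (number of matching tweets, most common co-occurring hashtags).

    Illustrative only: a real system would also need tokenization,
    spam filtering, and time windows.
    """
    query_tag = query_tag.lower()
    counts = Counter()
    matched = 0
    for text in tweets:
        # Use a set so each hashtag counts at most once per tweet.
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if query_tag not in tags:
            continue
        matched += 1
        tags.discard(query_tag)
        counts.update(tags)
    return matched, counts.most_common(top_n)


# Hypothetical sample data, not real tweets.
SAMPLE_TWEETS = [
    "Launching an iPhone app at #sxsw? Let me know. #iphone",
    "Great panel on search at #SXSW #search",
    "Line for the #iphone session at #sxsw is huge",
    "Unrelated tweet about lunch #food",
]

matched, facets = summarize(SAMPLE_TWEETS, "#sxsw")
print(matched, facets)
```

With 1,500 tweets an hour, a handful of facets like these would give a reader something to click on rather than a wall of text, though choosing good facets is exactly the hard part.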

I wonder if Abdur Chowdhury and his team are working on this problem. Perhaps if Twitter is willing to make some of the historical logs available for download (they’re already public, just not easily downloaded), some of us HCIR wonks could implement interfaces on top of them to explore the possibilities. Abdur, if you read this and are interested, please let us know!

8 responses so far ↓

  • 1 Mark // Mar 15, 2009 at 2:41 pm

    I build isitrocking.com to sift through the noise at SXSW. It’s pretty helpful.

  • 2 Mark // Mar 15, 2009 at 2:42 pm

    built. I built it.

    ack.

  • 3 Daniel Tunkelang // Mar 15, 2009 at 2:48 pm

    Cool–nice to see text mining apps working on top of the Twitter stream. Is this basically a focused version of Twitter’s trending topics, with some sentiment mining thrown in?

  • 4 Mark // Mar 15, 2009 at 3:31 pm

    Yeah, it looks for hashtags for event titles, and then tries to guess the sentiment of each Tweet.

  • 5 Abdur // Mar 15, 2009 at 11:49 pm

    We have been exploring the idea of research collections but have not really had the bandwidth to fully work out all the logistics of such a project.

    Summarization for some topics is one area that could help users, but I can envision many other ways to slice and dice the data being produced.

  • 6 Arturo Servin // Mar 16, 2009 at 7:22 am

    I agree with you about the Twitter search. You made me think about how to make it a real search engine. One problem is that there is no ranking at all. IMHO, without some kind of ranking, the functionality is very poor.
    I am working on a project about that; hopefully someday I will finish it.

  • 7 Daniel Tunkelang // Mar 16, 2009 at 8:47 am

    Abdur, I know you’ve had mixed experiences with releasing such research collections. But I hope that, with the data already public and searchable, there really aren’t any privacy issues to sidestep. But I can imagine that you guys are a bit short-staffed to handle the logistics of publishing a large collection that would attract enormous attention.

  • 8 An Able Grape at the Helm of Twitter Search « AltSearchEngines // Aug 13, 2009 at 4:08 pm

    […] link-baited Summize founder and Twitter Chief Scientist Abdur Chowdhury here once or twice, but I understand that he’s no longer running Twitter Search. They’ve got a new guy, Doug Cook, […]
