Month: June 2009

Those Who Give Twitter Get Twitter

Post author By Daniel Tunkelang
Post date June 6, 2009

Marshall Kirkpatrick at ReadWriteWeb wrote a post arguing that the people working at Twitter aren’t using the service the way its power users do, and that this bodes ill for Twitter. His main arguments:

Twitter’s employees don’t twitter very much: an average of 2 to 3 tweets per person per day.
Twitter employees don’t follow very many other people: only 2 out of 49 Twitter team members follow more than 500 people and no one was over 1k.
Twitter staff members aren’t following top Twitter developers in the community.

I can’t really address the third point, but the first two–and especially the second–are hardly helpful to Kirkpatrick’s case. To the contrary, they argue that the people who work at Twitter get it. And, to make sure Kirkpatrick got it, Twitter CEO Ev Williams even wrote him a letter, in which he said:

Many people fall into the trap that you should follow all or most people back out of a sense of politeness or so-called engagement with the community… At a certain point, you’re not actually reading any more tweets by following more people — you’re just dipping into the stream somewhat randomly and missing a whole lot of what people say. That’s fine, but I believe people will generally get more value out of Twitter by dropping the symmetrical relationship expectation and simply curating their following list based on the information and people they want to tune in to.

Amen! I’ve been hammering this point here in most of my posts about Twitter, but here is a handful of examples for newer readers:

And of course the whole point of TunkRank is to discourage the vicious circle of reciprocity and fake following. That’s baked into the measure which, like PageRank, divides the voting power by the number of out-links.
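For readers who prefer code to prose, here is a minimal sketch of that idea. It assumes a TunkRank-style formulation in which each follower contributes (1 + p × their own influence) divided by the number of people they follow; the function name `tunkrank_sketch`, the constant `p`, and the fixed iteration count are illustrative assumptions, not a reference implementation.

```python
def tunkrank_sketch(following, p=0.05, iterations=50):
    """Estimate influence scores from a follow graph.

    following: dict mapping each user to the set of users they follow.
    p: assumed constant probability that a follower passes a tweet along.
    """
    # Collect every user that appears as a follower or a followee.
    users = set(following)
    for followees in following.values():
        users.update(followees)

    influence = {user: 0.0 for user in users}
    for _ in range(iterations):
        updated = {user: 0.0 for user in users}
        for follower, followees in following.items():
            if not followees:
                continue
            # A follower's attention is split evenly across everyone they
            # follow, so following more people dilutes each "vote".
            share = (1.0 + p * influence[follower]) / len(followees)
            for followee in followees:
                updated[followee] += share
        influence = updated
    return influence


# Hypothetical follow graph: one selective follower, one indiscriminate one.
following = {
    "selective": {"carol"},
    "indiscriminate": {"carol", "dave", "erin", "frank"},
}
scores = tunkrank_sketch(following)
for user, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(user, round(score, 3))
```

In this sketch a follow from someone who follows only one account counts four times as much as a follow from someone who follows four, which is exactly why reciprocal mass-following buys little.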

The comments on Kirkpatrick’s post suggest that a lot of regular Twitter users also get it. I find that reassuring, especially given the hype around Twitter in the last several weeks. Twitter can be a useful tool, but it will help if people don’t impose cultural norms that devalue the social network. I’m glad the folks who have given us Twitter realize that.

General

Enabling Exploration Through Text Analytics

Post author By Daniel Tunkelang
Post date June 5, 2009

http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=enablingexplorationthroughtextanalytics-090605090248-phpapp02&stripped_title=enabling-exploration-through-text-analytics

As promised, here are my slides from the recently held Text Analytics Summit. Feel free to download them from SlideShare–some of the animation may not come through in this version (though I try to use such animation sparingly).

I enjoyed the conference, and was pleasantly surprised by the overall intellectual level of participants, who included a number of end-users of text analytics, as well as senior technologists from the leading text analytics vendors. Yes, there were sales and marketing people there too, and the occasional vendor fluff piece, but someone’s got to pay the bills. I believe that all of the presentations will be posted online in the next few weeks.

And, speaking of vendors, I hope to see some of you next week at Endeca Discover. I’ll be delivering an 80s-themed presentation entitled “money for nothing and your tags for free”.

Also, while I have your attention, I urge everyone to spread the word about SIGIR. If you are in industry and have little patience for academic conferences, I still encourage you to consider the one-day SIGIR Industry Track. For $300 (compare that to any other industry conference!), you get a chance to hear and meet a star-studded line-up, including:

Matt Cutts, Google: Web Spam and Adversarial IR: The Road Ahead
danah boyd, Microsoft Research: The Searchable Nature of Acts in Networked Publics
Vanja Josifovski, Yahoo! Research: Ad Retrieval – A new Frontier of Information Retrieval
Thomas (Tom) Tague, Thomson Reuters: Semantic Web and the Linked Data Economy
Tip House, OCLC: Alexandria 2.0: Search Innovations Keep Libraries Relevant in an Online World
Panel of enterprise search analysts: Whit Andrews, Gartner; Susan Feldman, IDC; Theresa Regli, CMS Watch
Panel of enterprise search vendors: Øystein Torbjørnsen, FAST; Peter Menell, Autonomy; Adam Ferrari, Endeca (moderated by Elizabeth Liddy!)

I hope to see many of you there, and I would also appreciate it if you could spread the word, since SIGIR doesn’t traditionally market to industry professionals.

General

Google Squared: A Great First Step

Post author By Daniel Tunkelang
Post date June 4, 2009

Regular readers know that I am not a Google fan boy, and that much of my commentary on Google focuses on their neglect of exploratory search. Nonetheless, when I saw the initial Youtubeware describing Google Squared a few weeks ago, my ears perked up. I decided to wait until it went live to assess it. Well, it’s live now.

The idea of Google Squared is simple: it “collects facts from the web and presents them in an organized collection, similar to a spreadsheet.” The best way to understand it is to try it. For example, search for hybrid car, and you’ll see a table of hybrids, with columns corresponding to image, description, type of transmission, year, and height. Add a price column if you’d like, and it will populate it for you. Very slick.

Of course, it is, as Google admits, “by no means perfect”. Most queries will show its warts, and some, like information scientists, are way off (it doesn’t even try to return results for library scientists). But it does pretty well when there is structured data out there, and it makes an admirable attempt to find it! I suspect the real trick here is that it does a decent job of determining instances of the query category (perhaps a souped-up version of work they started discussing back in 2004), and then mining structured content about those instances from repositories like Freebase.

I mean, look at these results:

To be clear, I picked these examples after a fair amount of trial and error–like Wolfram Alpha, it is hit and miss, with more miss than hit. But, as Seth Grimes said at the recent Text Analytics Summit, when Wolfram Alpha is good, it’s very very good, but when it’s bad, it’s horrid. Google Squared doesn’t fail quite so spectacularly, and it gives you a lot more of a chance to interact with it.

This is, by far, the best step I’ve seen Google take towards HCIR, and I’m impressed. It’s still a toy at this stage, but I think it has a future. My warmest congratulations to Daniel Dulitz and the rest of the magpie team that developed it; I’m looking forward to seeing it evolve.

General

Google Search Appliance Woos, But Does It Wow?

Post author By Daniel Tunkelang
Post date June 3, 2009

Yesterday, Google announced the latest version of its search appliance, GSA 6.0, to great fanfare. As usual, their emphasis was on scale: they’re pushing a distributed architecture that lets them “push it to a new realm: billions”. It’s a nice sound bite, and it played well to the press.

The few analysts who commented on it were somewhat more critical. Matthew Brown from Forrester said, “They’re coming to market so late, with requirements that were established years and years ago. They’ve reached parity with where the market was four or five years ago.” Adriaan Bloem from CMS Watch was even harsher, assessing many of Google’s claims as exaggerated and requiring a complexity at odds with their positioning as a plug-and-play appliance.

Given my role at Endeca, I’m in no position to be objective. But I’ll share my impressions, which you can take with the appropriate grain of salt. We don’t encounter Google much as a competitor; FAST and Autonomy are still more likely to show up with us on prospective customers’ short lists. And, while I have met happy GSA customers, I’ve met many more enterprise buyers who scoff when I suggest the GSA as a candidate solution for them (yes, that’s why I’m not in sales). Also, my recent experience of seeing how Google positions the GSA was less than persuasive. There is still a widespread impression that Google is not serious about this market segment.

Of course, the market will decide, and a data-driven company like Google will surely track the success of its efforts quantitatively. But for now, I don’t feel that Google’s announcement has changed the competitive landscape. As always, I’m curious to hear others’ opinions.

Uncategorized

Faceted Search Book: Now Available Online!

Post author By Daniel Tunkelang
Post date June 2, 2009

I’m delighted to report that my faceted search book is now available for online purchase at the Morgan & Claypool site! The printed version should be going out shortly (you can pre-order at Barnes & Noble or Amazon); the publisher assures me that there will be copies in time for SIGIR.

General

Banging on Bing: A Bummer

Post author By Daniel Tunkelang
Post date June 1, 2009

So, Bing is out early. Yes, an early release from Microsoft! And it’s snappy, attractive, and offers decent quality. If I needed to use Bing as my main search engine for the web (yes, readers, imagine a world without Google and Yahoo as search options), I’d survive.

But I can’t say I’d be thrilled. I’ve only had a short time to play with Bing, but I’m not overwhelmed. In fact, I’m quite disappointed. Given their big talk about delivering a “decision engine”, I expected at least a little bit of innovation in the user experience. No such luck. The focus is still on the ranked list, and their ranking is, at least to my taste, perceptibly inferior to Google’s. I could live with that small difference if the interface offered real opportunities for interaction. But there isn’t anything new there. You can refine by result type (Web, Images, Videos, Shopping, News, Maps, Local, Travel), but search engines have been doing that for years.

The only novelty is “xRank”, which lets you “see who and what everyone’s searching for most”. It’s intriguing, but it seems half-baked, and I suspect that others are further ahead on crowd-sourcing relevance through the social stream.

I take no pleasure in throwing cold water on the queue of challengers that attempt to provide competition for Google in web search. Perhaps Bing is truly in beta, and will prove itself a more formidable challenger in the future. But it’s surely not there now.