If you’re a search engine junkie like me, you’ve probably heard about Blekko, a search engine that has been percolating for over two years and recently launched a private beta. If not, I encourage you to watch the TechCrunch video I’ve embedded above. You can join the beta by following them on Twitter. I did that earlier this week, and my invitation arrived via a direct message the next day.
Blekko’s main differentiating feature is that it supports “slashtags”. These aren’t the same as the Twitter microsyntax proposed by Chris Messina and named by Chris Blow. Rather, they are a way for users to “spin” their search results using a variety of filters. For example, [climate /liberal] and [climate /conservative] return very different results, because they are restricted to different sets of sites.
In addition to providing a set of curated slashtags, Blekko allows users to define their own slashtags by specifying the sets of sites to be included. There’s a social aspect here too: you can use (and follow) other users’ slashtags. Blekko also has some special slashtags that don’t act as site filters, e.g., /date shows recent results and /seo offers indexing information about web sites.
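To make the site-filter idea concrete, here is a minimal sketch of how a slashtag could restrict results to a set of sites. This is not Blekko's implementation, which I haven't seen; the hostnames, the `SLASHTAGS` table, and the function names `parse_query` and `apply_slashtags` are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical slashtag definitions: each name maps to a set of hostnames.
SLASHTAGS = {
    "liberal": {"left.example.org", "progressive.example.com"},
    "conservative": {"right.example.org"},
}

def parse_query(query):
    """Split a query string into plain search terms and slashtag names."""
    terms, tags = [], []
    for token in query.split():
        if token.startswith("/") and len(token) > 1:
            tags.append(token[1:])
        else:
            terms.append(token)
    return terms, tags

def apply_slashtags(results, tags, slashtags=SLASHTAGS):
    """Keep only result URLs whose host belongs to every named slashtag.

    An unknown slashtag is treated as an empty site set (an assumption),
    so it filters out everything.
    """
    for tag in tags:
        allowed = slashtags.get(tag, set())
        results = [url for url in results if urlparse(url).hostname in allowed]
    return results
```

With this sketch, `parse_query("climate /liberal")` separates the term `climate` from the slashtag `liberal`, and `apply_slashtags` then drops any result that isn't hosted on one of that slashtag's sites.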
Blekko emphasizes two characteristics that I find very appealing: transparency and user control. While they do not disclose their relevance ranking algorithm, they do expose some of the information they use to compute it. More significantly, their emphasis on slashtags de-emphasizes default ranking and instead encourages users to take more responsibility in the information seeking process. Very HCIR!
I like the concept. But I’m not sure how I feel about the execution. I have three main concerns.
First, the set of slashtags is somewhat haphazard. That is to be expected in a beta, but I’m not sure how it will evolve. I’d love to see a vocabulary collectively (and transparently) curated like Wikipedia, but I fear it will look more like the social tagging site Delicious, which is a case study in the “vocabulary problem”. As any information scientist can tell you, managing vocabularies is hard!
Second, I’m not sure if site filters are the right model. What happens to sites with heterogeneous content? Or to sites that have one-hit wonders and therefore are unlikely to show up in any slashtags? I’d prefer to see the sites used as seeds to train classifiers that could then be applied to the entire index. Something a bit more like what Miles Efron implemented in this research, only on a much larger scale and applied at the page rather than the site level.
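As a rough illustration of what I mean by using sites as seeds, here is a toy sketch: pages from a slashtag's seed sites serve as labeled training data, and the resulting classifier labels individual pages from anywhere in the index. I've used a simple naive Bayes text classifier with add-one smoothing as a stand-in; it is not Miles Efron's method, and it is not anything Blekko does. The labels, sample texts, and function names are all made up for the example.

```python
import math
from collections import Counter, defaultdict

def train(labeled_pages):
    """Build a naive Bayes model from (page_text, label) pairs.

    Each label is assumed to come from the slashtag of the page's seed site.
    """
    word_counts = defaultdict(Counter)  # label -> word frequencies
    doc_counts = Counter()              # label -> number of training pages
    vocab = set()
    for text, label in labeled_pages:
        words = text.lower().split()
        word_counts[label].update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the most likely label for a page, whatever site it is on."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical seed pages, labeled by the slashtag of the site they came from.
seed_pages = [
    ("bake the bread with flour and yeast", "recipes"),
    ("simmer the sauce with garlic", "recipes"),
    ("the team won the match in overtime", "sports"),
    ("the striker scored a late goal", "sports"),
]
model = train(seed_pages)
```

The point of the design is that `classify` looks only at the page's text, so a one-hit wonder on an otherwise unlisted site can still land in a slashtag, and a site with heterogeneous content can have its pages split across several.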
Third, I think there’s another ingredient that is essential to complement transparency and user control: guidance. As a user, I need to know what slashtags would lead me to interesting results, and ideally I’d want some kind of preview to make exploration as low-cost as possible.
I know I’m asking for a lot, especially from an ambitious startup that has just launched its private beta. But I think the stakes are high in this space, and going easy on a newcomer is no favor. I offer the tough love of a critic who would really like to see this kind of vision succeed.