Categories
General

Fun with Google, Bing, and Yahoo

Web search is a fiercely competitive space–as Google points out, “competition is just one click away”. In practice, I take that claim with a grain of salt–but I do think the switching costs are much lower than in most competitive markets. With that in mind, it’s interesting to look at what happens if you search for the name of one of the major search engines on one of its competitors’ sites.

Google returns standard results for such searches:

[bing] on Google

 on Google

Bing is generous to a fault, saving you a click if you choose to use one of its leading competitors:

[google] on Bing

 on Bing

Finally Yahoo, whose CEO claims “we have never been a search company,” seems quite eager to keep searchers from going elsewhere:

[bing] on Yahoo

[google] on Yahoo

It’s easy to dismiss these queries as corner cases, but the logs show that they really happen. And, as browsers increasingly blur the line between an address bar and a search box, it’s reasonable to expect that switches between search engines will often begin with such queries.

Categories
General

Marti Hearst: Tech Talk on Search User Interfaces

Earlier this week, Marti Hearst gave a Tech Talk at Google about her recently published book, Search User Interfaces. Fortunately for those of us who missed it (myself included!), it is now available on YouTube. Enjoy! (via Jon Elsas)

Categories
General

Can We Learn From Anti-Social Users?

One of the interesting challenges we face as both developers and consumers of search technology is that social signals are a double-edged sword. On one hand, social signals have proven essential in distinguishing signal from noise–be they links, re-tweets, or any number of other ways that online consumers (or more correctly “prosumers”) actively and passively communicate value judgments about information. On the other hand, our reliance on these social signals makes us vulnerable to positive feedback and spammers.

Consider MusicLab, an “experimental study of self-fulfilling prophecies in an artificial cultural market”. In this study, sociologists Matt Salganik, Peter Dodds, and Duncan Watts manipulated the social information available to consumers (specifically teens) regarding their peers’ musical tastes. The experimenters’ goal was to empirically validate a quantitative model of social contagion.

But we can look at this study another way: by isolating the social factors that influence musical taste, the experimenters were also isolating the non-social signal–in theory, how popular a song would be in the absence of social signaling. Indeed, they found that, if they measured a song’s quality by isolating out the social factor, “the best songs never do very badly, and the worst songs never do extremely well, but almost any other result is possible”.

It’s interesting–interesting to me, at least!–to ask if search engines can do the same for search. One of the frequent objections to link-based authority measures like PageRank is that they make the rich get richer. “Real-time” variants like re-tweet frequency (and even TunkRank) suffer from the same weakness. Unchecked, these measures can cause the authority / influence market to resemble a winner-take-all market.
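To make the rich-get-richer worry concrete, here is a minimal sketch of positive feedback in a toy link market. It is not PageRank or TunkRank; it is a simple urn-style simulation in which each new link goes to a page with probability proportional to the links that page already has. The function name, page count, and link count are assumptions chosen purely for illustration.

```python
import random

def simulate_links(num_pages=20, num_links=5000, seed=0):
    """Toy positive-feedback model (an illustration, not PageRank).

    Every page starts with one link. Each new link picks a page with
    probability proportional to that page's current link count.
    """
    rng = random.Random(seed)
    counts = [1] * num_pages
    for _ in range(num_links):
        # Pages that already have more links are more likely to win the next one.
        winner = rng.choices(range(num_pages), weights=counts, k=1)[0]
        counts[winner] += 1
    return counts

counts = simulate_links()
# Fraction of all links held by the single most-linked page.
top_share = max(counts) / sum(counts)
print(sorted(counts, reverse=True)[:5], round(top_share, 3))
```

Even though every page starts out identical, early random luck tends to compound in a model like this, so the final shares usually end up far from equal. That is the feedback loop the objection is about, with no difference in underlying quality at all.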

It strikes me as interesting to learn from cases where searchers swim upstream against the social signals to find information. Of course, you may already see the contradiction–this is just another kind of social signaling! Still, it seems like it might be a way to hedge our bets against the weaknesses of positive feedback and spammers. In a similar vein, we might look at how users find information that suffers from poor accessibility or retrievability.

I don’t have answers about how to pursue such an approach, or whether it would even be feasible to do so. But I hope you agree with me that it’s an interesting question.

Categories
General

Exploring Exploratory Search

Google’s recently released Image Swirl is slick. But I’ve been struggling to figure out whether it’s useful or simply a showcase for cool technology.

And that’s prompted me to think about the overloaded term “exploratory search”. A while back, I tried to define exploratory search based on what it is not. This time, let me aim to positively characterize what I see as its two primary use cases:

  1. I know what I want, but I don’t know how to describe it.
  2. I don’t know what I want, but I hope to figure it out once I see what’s out there.

The first use case cries out for tools that support query refinement or elaboration. Existing tools span a range from suggesting spelling corrections (aka “did you mean”) to offering semantically or statistically related searches that hopefully provide the user with at least a step in the right direction. One of my favorite approaches, faceted search, is primarily used to support query refinement through progressive narrowing of an initial search query.
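As a rough illustration of what “progressive narrowing” means in faceted search, here is a minimal sketch. The catalog, facet names, and helper functions are all hypothetical and made up for this example; a real engine would compute facet counts from an index rather than by scanning a list.

```python
from collections import Counter

# Hypothetical toy catalog; records and facet names are invented for illustration.
DOCS = [
    {"title": "Trail running shoe", "brand": "Acme", "color": "red"},
    {"title": "Road running shoe", "brand": "Acme", "color": "blue"},
    {"title": "Hiking boot", "brand": "Summit", "color": "red"},
    {"title": "Running sock", "brand": "Summit", "color": "blue"},
]

def search(docs, query):
    """Keep documents whose title contains the query (case-insensitive)."""
    q = query.lower()
    return [d for d in docs if q in d["title"].lower()]

def facet_counts(docs, facet):
    """Count how many of the remaining documents carry each value of a facet."""
    return Counter(d[facet] for d in docs if facet in d)

def refine(docs, facet, value):
    """Narrow a result set to the documents with the given facet value."""
    return [d for d in docs if d.get(facet) == value]

results = search(DOCS, "running")             # initial query
brands = facet_counts(results, "brand")       # refinements to offer the user
narrowed = refine(results, "brand", "Acme")   # the user picks one
colors = facet_counts(narrowed, "color")      # counts update for the next step
```

The point of the design is that every refinement offered is guaranteed to lead somewhere: the counts are computed over the current result set, so the user never narrows into an empty page.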

The second “I don’t know what I want” use case is fuzzier. In the language of machine learning, this use case is unsupervised, while the previous one is supervised. In general, it’s a lot harder to define or evaluate outcomes for unsupervised scenarios. Indeed, Hal Daume has argued that we should only do unsupervised learning if we do not have a trustworthy automatic evaluation metric. That’s a strong position, and you can see some of the counterarguments in his comment thread. But, going back to our scenario, it’s really hard to judge the effectiveness of tools like similarity browsing when they support exploration in the absence of any concrete goal.

With that in mind, I’ll reserve judgment on the utility of tools like Image Swirl. To the extent that it aims at the first use case, clustering images for a particular search, I’m ambivalent. I’d prefer a more transparent interface, in which I have more of a sense of control over the navigational experience. I suspect it is more aimed at the second use case, offering a compact visualization of what is out there.

Besides, as some folks have brought up at the HCIR workshops, it’s important that we make information seeking fun. And Swirl certainly scores on that front.

Categories
General

An Ad-Supported Model With Teeth?

A computer-implemented method for operating a device, the method comprising:
disabling a function of an operating system in a device;
presenting an advertisement in the device while the function is disabled;
and enabling the function in response to the advertisement ending.

So reads the first claim from a patent application that Apple recently filed (with Steve Jobs as first inventor, no less!) for technology to deliver a rather compelling ad-supported business model. Or perhaps the better word is compulsory. You can read an analysis by Randall Stross in the New York Times.

I agree with Stross that it’s hard to imagine Apple ever implementing the technology described by the patent application–indeed, Apple has been one of the few success stories for paid digital content models. That said, the approach does feel like at least one endpoint for the ad-supported model–it guarantees the advertisers the attention that they are paying for by subsidizing content or services.

The advertising business is a bit more top of mind for me, now that it pays my salary. Google’s approach, however, follows the aphorism that honey catches more flies than vinegar: it tries to target ads well enough that users want to click on them, rather than to simply endure them as a cost of subsidizing free services. Google’s revenue (and the popularity of PPC models in general) is a testament to the success of this approach, my occasional rant notwithstanding.

In general, the industry seems to have found a compromise in how aggressively to push ads at users. Users can safely ignore (or even block) sponsored links, but few people do. Pre-roll ads on video sites (i.e., advertising before a video starts) are more invasive, but a number of sites let users skip them. You can read why the YouTube folks are testing this approach. Advertisers–or at least ad-supported services–seem to recognize that they can’t cross the line between pursuing users’ attention and annoying users to the point of alienation.

Still, technology like that described in Apple’s patent application shows that it is possible for the ad-supported model to take a more aggressive approach. Part of me wonders if more aggressive ad-supported models would revitalize paid content models, as users would stop perceiving the former as free. But I suspect that the gentler ad-supported model is here to stay, and that it will continue to strive toward the point of optimal effectiveness.

Categories
Uncategorized

Call for Speakers: Enterprise Search Summit 2010

I’m no longer in the enterprise search business, but I know that many readers here are. If you are one of those readers, then I strongly encourage you to consider participating in the Enterprise Search Summit, which will take place next May in New York. I presented there last year and enjoyed the opportunity to meet fellow presenters and attendees. You can read my recap of the event.

The deadline for proposal submission is November 30th–you only have to submit a 250-word abstract.

Here is the call for proposals:

We seek dynamic speakers who can talk knowledgeably about detailed aspects of how to implement and maximize search within an organization. Search can no longer be viewed as a stand-alone application. It is increasingly part of everything we do and has become the de facto gateway to information in the enterprise. This year’s Summit will examine the ways to leverage search tools, information architecture, classification, and other strategies and technologies to deliver meaningful results—not just in terms of information, but to the bottom line.

Ours is a well-informed, tech-savvy audience, so proposals should be specific and detailed. Consider topics such as:

  • Integrating search into enterprise systems and workflow
  • Customizing your search solution/ Task-specific search
  • Compliance, records management, and eDiscovery with effective search
  • Migrating your search engine
  • Social search and social tagging strategies & solutions
  • Search-enabled decision making
  • Business intelligence, data mining
  • Search as the gateway to enterprise information
  • Optimizing the interface and user experience
  • Navigational tools—context, facets, entity extraction, clustering, and visualization
  • Emerging trends, the future of search
  • Overcoming information overload
  • Categorization techniques
  • Semantic Search
  • Query Federation & Federated Search
  • Enhancing an existing solution

If you represent a company that has an enterprise search software product, your best bet to be on our program is to collaborate with a customer to submit a case study to be presented by them, following the guidelines above.

If you need more information–or more time–I encourage you to reach out directly to Michelle Manafy, the conference chair.

Categories
Uncategorized

Week 1 at Google: Information Overload!

As you might imagine, it’s quite a switch to go from criticizing Google from the outside to being on the inside. Jeff Jarvis, who was gracious enough not to make fun of me in public, nonetheless admitted to me privately that the news had made him chuckle.

As I finish my first week, I can sum up the experience in a word: overwhelming. The tools for accessing internal information are better than I expected, but both the volume of baseline knowledge–technical and cultural–and the relentlessness of the update stream are daunting.

Indeed, the internal ecosystem is so rich that it’s easy to forget there is a world outside it–ironic given Google’s enormous role in the world outside it! Then again, this is just my first week–it will take me some time to pop up the stack from the build system to the surface.

Categories
General

The Noisy Noogler: A Quick FAQ

I’m barely 24 hours into my new life as a Googler, and I’ve already gotten lots of questions! Here are the answers to a few of them:

Will I continue blogging at The Noisy Channel?

Absolutely! I’m committed to posting at least weekly, and I’ll try to do better than that once I’m settled into my new environment.

Will I participate in scholarly conferences and workshops?

Of course! I’m co-organizing SSM 2010, which will be held in conjunction with WSDM 2010 in February, and of course HCIR 2010, which will be held in conjunction with IIiX 2010 in August. You probably won’t see me at vendor fests, but I do hope to continue bringing industry practitioners and academic researchers together.

Will I blog about Google?

I certainly won’t disclose any confidential information–people get fired for that–or worse. And, given how much access I will have to such information, I will err on the side of caution, only discussing information that I’m sure Google has released to the general public. Beyond that, I’ll exercise common sense. I don’t want either to come across as a shill for my employer or to spar with my new colleagues in public. Subject to those constraints, however, I can and will blog about Google.

Can I get you a job at Google?

I can advise you and connect you to a recruiter, but that’s the limit of my power. The hiring process here is specifically designed to prevent any individual from manipulating it–even me!

Will I talk about what I’m working on?

See above regarding confidential information. I’ll be delighted to talk about anything I’m working on that Google has decided to disclose publicly.

Does Google know about my karaoke habit?

Too late, they’ve already signed the offer letter. 🙂

Categories
Uncategorized

Apologies for Slow Response Times

I am without my own laptop for a few days as I manage a transition between jobs. So I apologize in advance if I am slow to respond to email, comments, or tweets over the weekend. I’ll be back at full steam early next week.

Categories
Uncategorized

Going (to) Google

McGoogle

This is my last week at Endeca. The decision to leave has been a heart-wrenching one: not only have the past ten years been the best of my life, but my experiences at Endeca have defined me professionally. Moreover, Endeca is riding a wave of success with recent advances in our products, new relationships with key partners, and fascinating new deployments. (You can read Endeca’s latest announcements in our newsroom.)

Ironically, it is this very success that compels me to move on. In the past several years, I have developed an increasing passion for search on the open web–an interest only furthered by the explosion of social media.

That is why I’ve decided to accept an opportunity at Google’s New York office. Readers here know that I’ve been a very public critic of Google’s simplistic approach to user interaction on the open web. I’m being offered an opportunity to help fix that approach–and it is an offer I can’t refuse. My mission is to apply my passion for human-computer information retrieval (HCIR), an approach that Endeca has pioneered in the enterprise, to the world’s largest information problems–and where better to do that than at the company that aspires to organize the world’s information?

This moment is bittersweet: I am excited about the new experiences that await me, but I have a heavy heart as I turn in my badge and part with a world-class team that has succeeded against incredible odds.

Given my role and tenure at Endeca, I want to say explicitly that this move is about my personal ambition. My passion for web search and social media, which has grown exponentially over the past couple of years, simply doesn’t align with Endeca’s focus in the enterprise.

Also, I want to make clear: Google hired me because of my values, and not in spite of them. I know that some folks will find it difficult to reconcile my criticisms of Google with my decision to join. That’s why there’s an opt-out village! Seriously, though, I take my values with me. Google is offering me the opportunity to channel my passion for HCIR into action, on the world’s largest stage. I’m well aware of the magnitude of the challenge, but hey, I’m feeling lucky.