Categories
Uncategorized

Call for Speakers: Enterprise Search Summit 2010

I’m no longer in the enterprise search business, but I know that many readers here are. If you are one of those readers, then I strongly encourage you to consider participating in the Enterprise Search Summit, which will take place next May in New York. I presented there last year and enjoyed the opportunity to meet fellow presenters and attendees. You can read my recap of the event.

The deadline for proposal submission is November 30th–you only have to submit a 250-word abstract.

Here is the call for proposals:

We seek dynamic speakers who can talk knowledgeably about detailed aspects of how to implement and maximize search within an organization. Search can no longer be viewed as a stand-alone application. It is increasingly part of everything we do and has become the de facto gateway to information in the enterprise. This year’s Summit will examine the ways to leverage search tools, information architecture, classification, and other strategies and technologies to deliver meaningful results—not just in terms of information, but to the bottom line.

Ours is a well-informed, tech-savvy audience, so proposals should be specific and detailed. Consider topics such as:

  • Integrating search into enterprise systems and workflow
  • Customizing your search solution / Task-specific search
  • Compliance, records management, and eDiscovery with effective search
  • Migrating your search engine
  • Social search and social tagging strategies & solutions
  • Search-enabled decision making
  • Business intelligence, data mining
  • Search as the gateway to enterprise information
  • Optimizing the interface and user experience
  • Navigational tools—context, facets, entity extraction, clustering, and visualization
  • Emerging trends, the future of search
  • Overcoming information overload
  • Categorization techniques
  • Semantic Search
  • Query Federation & Federated Search
  • Enhancing an existing solution

If you represent a company that has an enterprise search software product, your best bet to be on our program is to collaborate with a customer to submit a case study to be presented by them, following the guidelines above.

If you need more information–or more time–I encourage you to reach out directly to Michelle Manafy, the conference chair.

Categories
Uncategorized

Week 1 at Google: Information Overload!

As you might imagine, it’s quite a switch to go from criticizing Google from the outside to being on the inside. Jeff Jarvis, who was gracious enough not to make fun of me in public, nonetheless admitted to me privately that the news had made him chuckle.

As I finish my first week, I can sum up the experience in a word: overwhelming. The tools for accessing internal information are better than I expected, but both the volume of baseline knowledge–technical and cultural–and the relentlessness of the update stream are daunting.

Indeed, the internal ecosystem is so rich that it’s easy to forget there is a world outside it–ironic given Google’s enormous role in the world outside it! Then again, this is just my first week–it will take me some time to pop up the stack from the build system to the surface.

Categories
General

The Noisy Noogler: A Quick FAQ

I’m barely 24 hours into my new life as a Googler, and I’ve already gotten lots of questions! Here are the answers to a few of them:

Will I continue blogging at The Noisy Channel?

Absolutely! I’m committed to posting at least weekly, and I’ll try to do better than that once I’m settled into my new environment.

Will I participate in scholarly conferences and workshops?

Of course! I’m co-organizing SSM 2010, which will be held in conjunction with WSDM 2010 in February, and of course HCIR 2010, which will be held in conjunction with IIiX 2010 in August. You probably won’t see me at vendor fests, but I do hope to continue bringing industry practitioners and academic researchers together.

Will I blog about Google?

I certainly won’t disclose any confidential information–people get fired for that–or worse. And, given how much access I will have to such information, I will err on the side of caution, only discussing information that I’m sure Google has released to the general public. Beyond that, I’ll exercise common sense. I want neither to come across as a shill for my employer nor to spar with my new colleagues in public. Subject to those constraints, however, I can and will blog about Google.

Can I get you a job at Google?

I can advise you and connect you to a recruiter, but that’s the limit of my power. The hiring process here is specifically designed to prevent any individual from manipulating it–even me!

Will I talk about what I’m working on?

See above regarding confidential information. I’ll be delighted to talk about anything I’m working on that Google has decided to disclose publicly.

Does Google know about my karaoke habit?

Too late, they’ve already signed the offer letter. 🙂

Categories
Uncategorized

Apologies for Slow Response Times

I am without my own laptop for a few days as I manage a transition between jobs. So I apologize in advance if I am slow to respond to email, comments, or tweets over the weekend. I’ll be back at full steam early next week.

Categories
Uncategorized

Going (to) Google

This is my last week at Endeca. The decision to leave has been a heart-wrenching one: not only have the past ten years been the best of my life, but my experiences at Endeca have defined me professionally. Moreover, Endeca is riding a wave of success with recent advances in our products, new relationships with key partners, and fascinating new deployments. (You can read Endeca’s latest announcements in our newsroom.)

Ironically, it is this very success that compels me to move on. In the past several years, I have developed an increasing passion for search on the open web–an interest only furthered by the explosion of social media.

That is why I’ve decided to accept an opportunity at Google’s New York office. Readers here know that I’ve been a very public critic of Google’s simplistic approach to user interaction on the open web. I’m being offered an opportunity to help fix that approach–and it is an offer I can’t refuse. My mission is to apply my passion for human-computer information retrieval (HCIR), an approach that Endeca has pioneered in the enterprise, to the world’s largest information problems–and where better to do that than at the company that aspires to organize the world’s information?

This moment is bittersweet: I am excited about the new experiences that await me, but I have a heavy heart as I turn in my badge and part with a world-class team that has succeeded against incredible odds.

Given my role and tenure at Endeca, I want to say explicitly that this move is about my personal ambition. My passion for web search and social media, which has grown exponentially over the past couple of years, simply doesn’t align with Endeca’s focus in the enterprise.

Also, I want to make clear: Google hired me because of my values, and not in spite of them. I know that some folks will find it difficult to reconcile my criticisms of Google with my decision to join. That’s why there’s an opt-out village! Seriously, though, I take my values with me. Google is offering me the opportunity to channel my passion for HCIR into action, on the world’s largest stage. I’m well aware of the magnitude of the challenge, but hey, I’m feeling lucky.

Categories
General

Twitter Lists as an Influence Measure?

In “Using Twitter Lists To Judge Influence”, Todd Zeigler of the Bivings Report writes:

I think Twitter Lists will end up helping separate the men from the boys when it comes to influence.  In addition to seeing a Twitter users follower count, we can now see the number of other Twitter users who have added them to lists (example to the right).  I would argue that getting added to a list is a bigger deal than simply getting someone to follow you.

I’m certainly intrigued by Twitter Lists, but I’m skeptical that counting how many lists someone is on will prove that much more useful than follower count. For example, I currently have 1159 followers, am on 33 lists, and have a TunkRank of 24.1. For grins, here’s a handful of people who have similar stats:

While I can’t generalize from a few arbitrarily selected data points (though Gladwell seems to have no trouble doing so in Outliers), my suspicion is that list count will be highly correlated with follower count–and may actually be a noisier signal because the numbers are so much smaller.

Of course, there’s no reason we should use raw list counts–any more than we should use raw follower counts. Just as TunkRank aspires to model attention scarcity and recognizes that not all followers are created equal, an effective measure of how lists contribute to influence must recognize that not all list memberships are created equal either.
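To make the idea concrete, here is a minimal Python sketch of a TunkRank-style score. It assumes the recurrence usually given for TunkRank, in which each follower Y contributes (1 + p × Influence(Y)) / |Following(Y)| to the person being followed. The function name, the toy graph, and the value of p are illustrative assumptions, not a reference implementation.

```python
def tunkrank_style_scores(following, p=0.05, iterations=50):
    """Iteratively estimate a TunkRank-style influence score.

    following: dict mapping each user to the set of users they follow.
    p: assumed probability that a follower passes along what they read.
    """
    # Collect every user mentioned, whether as a follower or as someone followed.
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    # Invert the graph: followers[x] is the set of users who follow x.
    followers = {u: set() for u in users}
    for u, followed in following.items():
        for x in followed:
            followers[x].add(u)

    scores = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each follower's attention is split evenly across everyone they
        # follow, so a follower who follows few people counts for more.
        scores = {
            x: sum((1.0 + p * scores[y]) / len(following[y]) for y in followers[x])
            for x in users
        }
    return scores


if __name__ == "__main__":
    # Hypothetical toy graph, purely for illustration.
    toy_graph = {"a": {"c"}, "b": {"c", "d"}, "e": {"a"}}
    for user, score in sorted(tunkrank_style_scores(toy_graph).items()):
        print(user, round(score, 3))
```

Dividing by the number of accounts a follower follows is what models attention scarcity here. A list-based measure could plausibly do something similar, for example by weighting each list membership by the curator's own score and the size of the list, though that extension is speculation on top of this sketch.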

I’ve been chatting with Chris Langreiter, who is working on enhancements to TunkRank to address some of the oversimplifications of its model, as well as with Jonathan Glick and Ken Reisman at TLists. I’d like to see online influence–on Twitter and in general–measured more effectively. It will be great if lists can help, but we can’t make the same naive mistakes as those who were quick to embrace follower count as a measure of authority.

Categories
General

Tuning in to Google Music Search

With all of the activity around e-books last week, you might think that the online world wasn’t paying attention to the media category most transformed by the Internet: music. But a week is a lifetime in the ADD-addled technology press, and today’s top story is that Google is “making search more musical”. From the official blog post:

Now, when you enter a music-related query — like the name of a song, artist or album — your search results will include links to an audio preview of those songs provided by our music search partners MySpace (which just acquired iLike) or Lala. When you click the result you’ll be able to listen to an audio preview of the song directly from one of those partners.

As with most Google features, this one is being rolled out gradually. If you’re impatient (like me), you can try it directly from this page. Or you can watch the video above.

My first impression: this is a great feature to improve known-item search, and it’s nice that they’ve partnered with folks that often let you hear whole songs, rather than 30-second snippets. The selection seems limited, but it could be that my tastes are a bit obscure. I’m curious if others share my sense that the catalog is much smaller than the ones on iTunes or Amazon.

But, as music IR specialist and fellow HCIR advocate Jeremy Pickens points out, Google is “doing to music what they did to the web”. I’m not as concerned as Jeremy is about the prospect of musical tastes being homogenized through the “rich get richer” effect of ranking–perhaps because we’re already there. Not only is pop music self-perpetuating (see this great study by my friend, Princeton sociologist Matt Salganik, and his former advisor Duncan Watts), but even recommendation engines quash diversity. Google really can’t make things that much worse.

Besides, much as Google’s default search leads many searchers to Wikipedia, a great starting point for exploratory search, the new music search leads users to Pandora, which offers users a more exploratory user experience (though it would be great if they also linked to last.fm) (thanks Jeremy!). OK, maybe “leads” is a strong word for a “listen on” link below the search result, but it’s there for people in the know.

I’d love to see Google embrace HCIR. But I appreciate the improvements to known-item search too, especially if they can delegate the HCIR functionality to others that focus on it.

Categories
General

Ben Shneiderman’s HCIR 2009 Keynote: The Future of Information Discovery

The slides for Ben Shneiderman’s HCIR 2009 keynote on “The Future of Information Discovery” are now available on the workshop web site. I’ve also taken the liberty of uploading them to SlideShare and embedding them here. The slides don’t do justice to Ben’s presentation style, but hopefully they at least communicate a taste of the material he covered and his vision of where HCIR needs to go as a field and community.

Categories
General

Google Experimenting with Social Search

Google may be an also-ran in the social networking market with its Brazil-centric Orkut service, but that hasn’t stopped the search giant from adding social features to its products. A post at the (unofficial) Google Operating System blog recounts the history of Google Reader’s social evolution, up to but not including its latest update last week. SearchWiki, though not a social search feature per se, allows users to share personal annotations of their search results, as does the more recently introduced Sidewiki. And, like Bing, Google has established a partnership with Twitter in order to surface “social” results.

But the feature announced today, which Google is actually calling “Social Search”, is a much bigger step, even if it is tucked away as an experiment on Google Labs. From the official blog post:

With Social Search, Google finds relevant public content from your friends and contacts and highlights it for you at the bottom of your search results. When I do a simple query for [new york], Google Social Search includes my friend’s blog on the results page under the heading “Results from people in your social circle for New York.” I can also filter my results to see only content from my social circle by clicking “Show options” on the results page and clicking “Social.”

I gave it a whirl, searching for “noisy channel” and then restricting the search to content from what Google considers my social circle. The results are as promised, and I could narrow them by author name, selecting from a familiar list of Neal Richter, Jason Adams, Daniel Lemire, Ken Ellis, and Joshua Young (though for some reason Josh’s link didn’t work). Cool! Except that there are a lot of names missing (check out the bloggers in The Noisy Community) and, more importantly, I can’t further refine or even sort the search results. Indeed, the ordering of search results seems quite arbitrary–a phenomenon I’ve noticed more generally for search engine ranking of social media content.

In short, Google Social Search is a welcome initiative, but there’s a lot more work to do before I would find a productive use for it. Given the mismatch between social search and black-box relevance ranking, a little bit of HCIR would go a long way towards making this feature practically useful.

Categories
General

HCIR 2009: Human-Human Interaction

On Friday, I had the privilege of seeing just how much the annual Workshop on Human-Computer Information Retrieval has grown up since I conceived it in the summer of 2007. Back then, my co-conspirators and I worried about attracting a critical mass of participants–indeed, Endeca employees easily accounted for a quarter of the attendees (and submissions) at the first HCIR workshop. And even last year host and co-sponsor Microsoft Research supplied a disproportionate share of the attendees.

But this year was different. We were overloaded with strong submissions from all corners, and we had to turn people away for lack of capacity! While we didn’t relish saying no to prospective participants, these are great problems to have! And, thanks to Nick Belkin and Diane Kelly, we’ve arranged to greatly increase that capacity at HCIR 2010–more on that in a moment.

Max Wilson has already written up an excellent summary of the workshop, which I encourage you to read. You can also see the live tweet stream at #hcir09. Rather than duplicate these efforts, let me add my personal reflections as an organizer and participant.

Ben Shneiderman’s keynote address was sweeping and inspiring. I expected him to talk about information visualization, the area where he is best known for his contributions. He did present some examples of his group’s work on visualization-centric interfaces to support medical research, but his overall presentation took the much more ambitious approach of discussing the past, present, and possible future of HCIR. Specifically, he urged us to link our work to societal goals, such as the United Nations Millennium Development Goals. His challenge may seem impossibly idealistic, but I agree with his assertion that it is a practical one: we will do our best research by grounding ourselves firmly in the real and pressing problems of our age. Last year’s keynote speaker went on to win the Gerard Salton Award; I can only hope that Ben receives comparable accolades for his past accomplishments and future contributions to HCIR.

A new feature for this year’s workshop was having a “poster boaster” session, in which each of the presenters in the poster session had one minute to pitch his or her work. For those of you unfamiliar with this format, I highly recommend it. The compressed format forces presenters to distill the essence of their contributions–a useful exercise in general. And the audience doesn’t get bored: if you decide halfway into a presentation that you aren’t interested, then you only have to wait 30 seconds until the next one! Not that we had that problem: the posters were consistently interesting, as the submissions were unusually strong this year. You can download the full workshop proceedings here.

Even the full presentations weren’t that long. The five speakers were each allotted ten minutes, with a healthy amount of time reserved for a panel-style Q&A session. The papers in this session were, by design, some of the more controversial ones. In particular, Ellen Voorhees delivered a full-throated defense of Cranfield / TREC-style evaluation: “I Come Not to Bury Cranfield, but to Praise It” (similar to her presentation at the 2006 Workshop on Adaptive Information Retrieval that I discussed on this blog last year). Her reminder of HCIR’s challenges on the evaluation front surely ruffled some feathers, but all of us HCIR advocates need to address these challenges if we want researchers (and practitioners) outside our community to drink our kool-aid.

The above format was already quite interactive (as befits a workshop about interaction), but the second half of the day was explicitly designed to facilitate discussion. We had lunch on site, followed by a one-hour poster session.  We then had two one-hour guided discussion sessions to address the theoretical and practical concerns of HCIR. As organizers, we seeded both sessions with questions, but we also incorporated concerns that had come up during earlier discussions.

Finally, I am grateful to our sponsors. Catholic University was a gracious host and sponsor, providing the workshop with a great space and very helpful student volunteers. Between that and the financial contributions of Endeca and Microsoft Research, we were able to continue our tradition of not charging attendees for the workshop. I can’t promise that will continue indefinitely, but I am glad that our insistence on emphasizing substance over frivolous amenities has helped us deliver what I believe to be some of the best bang-for-buck in the scholarly community.

I’m already excited about HCIR 2010. Unlike the past three workshops, which have been held as independent events, next year’s workshop will be co-located with the Information Interaction in Context Symposium (IIiX’10) in New Brunswick, New Jersey. The workshop will take place on August 22nd, breaking our unintended tradition of holding the workshop on October 23rd. Nick Belkin assures us that there will be lots of space, so hopefully we’ll be able to accommodate everyone who is interested. We’ll also be soliciting sponsors for both the workshop and the broader symposium.

But there’s more to HCIR than enjoying each other’s company at workshops. We must spend the remaining 364 days of the year fleshing out our vision, and relating that vision not only to the disciplines HCIR explicitly integrates, but to pressing social concerns. It is up to us all to make our work relevant.