Categories
Uncategorized

Happy Birthday, Noisy Channel!

Actually, I’m a few days late: Monday, April 6th was the 1-year anniversary of The Noisy Channel. I was looking back at my first post and I’m happy to see that, while both the content and readership of this blog have broadened over the past year of growth, my focus has stayed true to my original inspiration to blog.

I don’t know what the next year will bring, but I’m glad to know that I start it with a thriving community of readers and contributors that I couldn’t imagine assembling when I started this venture a year ago. I promise to continue working to earn your attention and spend it wisely.

Guest Post: Exploring Visual Similarity with Modista

Guest posts are always a pleasure, but today I’m particularly delighted to have AJ Shankar of Modista share his thoughts about visual exploration.

My friend Arlo and I co-founded Modista, a web site that uses computer vision algorithms and a novel user interface to enable product exploration and discovery on the web. The premise is simple: click on an item (a shoe, watch, handbag, …) and Modista will automatically show you items that look similar to it. Repeat the process and soon you’ll find yourself exploring hundreds of items in an intuitive fashion — one of which you’ll want to buy (hopefully).

Rather than focusing on Modista’s two main components, its computer vision algorithms and its structured grid display, I’d like to discuss some of the less obvious, but still important, goals we have for the site.

Goal: Take exploration cues from real life

One of our goals with Modista is to bring a bit of the real-life mall shopping experience to the web. When you go to a mall to buy a pair of shoes, you’re more likely to come out with one than if you visited an online retailer, even though the online shop has a much, much larger selection, often at lower prices to boot. Why?

We think a major component of the answer is the ease of exploration. It’s trivial to walk around a shoe store and notice something that strikes your fancy.

(Some other components — like the ability to try on a shoe for size or feel its texture — are tricky, and we leave it up to our retailer partners to approximate solutions to them, such as fast, free shipping, and a liberal return policy. Others, such as the desire to justify the time investment of going to the mall with a purchase, are even further out there.)

Obviously our grid-based visual similarity interface is meant to encourage the exploration process. But we also take a critical look at smaller interface issues: What’s disruptive? What distracts from the experience? For instance, we really dislike scrolling and page load wait times: when was the last time you had to wait for a rack of shoes to “load” at your local mall? So when we wrote the server for Modista, we made sure it could operate with extremely low latency, and when we designed the website, we used AJAX for the page transitions, and made sure that we displayed just as many items as would fit on the browser screen without any scrolling. Try resizing your browser to see how it works.
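To make the "no scrolling" idea concrete, here is a minimal sketch of how one might compute the number of items that fit in a browser window. This is not Modista's actual code: the function name, the tile dimensions, and the gap between tiles are all assumptions made for illustration.

```python
def grid_capacity(viewport_w, viewport_h, tile_w, tile_h, gap=0):
    """Return (columns, rows, items) for tiles that fit fully in the viewport.

    All sizes are in pixels. `gap` is the spacing between adjacent tiles
    (hypothetical parameter; there is no gap before the first tile or
    after the last one).
    """
    if min(viewport_w, viewport_h, tile_w, tile_h) <= 0 or gap < 0:
        raise ValueError("sizes must be positive and gap non-negative")
    # n tiles need n * tile + (n - 1) * gap pixels, so
    # n <= (viewport + gap) / (tile + gap).
    cols = (viewport_w + gap) // (tile_w + gap)
    rows = (viewport_h + gap) // (tile_h + gap)
    return cols, rows, cols * rows


# Example: a 1280x800 window with 150x150 tiles and a 10-pixel gap.
cols, rows, items = grid_capacity(1280, 800, 150, 150, gap=10)
print(cols, rows, items)  # 8 5 40
```

Recomputing this whenever the window is resized and requesting exactly that many items is one plausible way to keep the whole grid on screen.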

At the same time, we don’t shy away from the strengths of the web. You can still see all 160,000+ women’s shoes we catalog, and our algorithms sift through every one of them to find you the best matches. The mall has nothing on the web when it comes to inventory.

Goal: Encourage positive actions

Here’s a typical scenario at an online retailer’s website: a customer types in some search terms, and is faced with 20 pages of results, with 50 or more items per page. The user scrolls through the first page, finds nothing she wants to buy, and clicks “Next”. This process repeats several more times before the user gives up.

What has the retailer learned from this interaction?

Each “Next” click by the user is a negative action: the customer is saying “I don’t like any of these 50 items”. So after ten clicks, the retailer knows that the customer doesn’t like 500 of the 50,000 items it carries. If the retailer’s eventual goal is to find something that the customer does like, it’s a frustrating place to be.

The Modista interface encourages positive actions. The default behavior in a Modista browsing session is a click on an item. The click says, “Even if this isn’t the perfect item for me, I like it more than everything else I’m seeing here.” After ten such clicks, we have a much better understanding of what the user is looking for than if she just clicked “Next” ten times. Better yet, the user feels like she is making progress: the click is taking her somewhere better. We like that.
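The contrast can be put in rough numbers. The sketch below is only an illustration: it assumes that each "Next" click rules out one page of items, and that each click on an item expresses a preference for that item over every other item on screen. The function names and the figure of 40 items per screen are hypothetical, not taken from Modista.

```python
def ruled_out(next_clicks, items_per_page, catalog_size):
    """Items (and fraction of the catalog) rejected by paging with "Next"."""
    rejected = min(next_clicks * items_per_page, catalog_size)
    return rejected, rejected / catalog_size


def pairwise_preferences(item_clicks, items_shown):
    """Number of "I like this more than that" signals from item clicks.

    Assumes each click ranks the chosen item above the other
    items_shown - 1 items on screen.
    """
    return item_clicks * (items_shown - 1)


# Ten "Next" clicks at 50 items per page, in a catalog of 50,000 items.
print(ruled_out(10, 50, 50_000))     # (500, 0.01)

# Ten item clicks with an assumed 40 items on screen.
print(pairwise_preferences(10, 40))  # 390
```

Under these assumptions, ten "Next" clicks eliminate only 1% of the catalog, while ten item clicks each say something about what the user actually prefers.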

Goal: Be memorable

Modista is a portal site that aggregates inventories from lots of retailers. In that sense, we’re in the same space as comparison shopping engines (CSEs).

There’s a well-known problem with CSEs: they’re not memorable. Here’s a sample consumer thought process: “I need to buy a DVD player. Go to Google, type in some keywords… click on a link to [your CSE of choice]. Page loads, click on an offer at Amazon, buy the player. Great!”

Sounds good so far. But then the next time the consumer wants to buy an item, she thinks, “Well, what did I do last time? I went to Google… and bought the item at Amazon.” The CSE experience is so short and bland that it is left entirely out of the equation.

This is partially by design. CSEs get money for every click lead they generate, so they’re often constructed to get users clicking out quickly and frequently. For instance, not displaying size information next to a shoe may increase revenue, since the user will then have to click out to see if the shoe is in her size.

We decided to try a different approach. We aim to present all the information a user might want on Modista itself, and try our best to normalize product data (size, widths, multiple views, etc.) from different retailers to streamline the user experience. We’re perfectly happy if a user spends all day at Modista, and only clicks out once — to actually buy an item.

It turns out that this approach has been working pretty well. Many people find using Modista to be a little exciting and addictive, and we think that the continuity of experience plays a major part in this. Users spend an average of 15 minutes on Modista, and returning users spend more than 20. We hope those 15-20 minutes will help them think of Modista the next time they want to buy some shoes or a handbag.

We’re always working on ways to improve Modista. Naturally, a main source of information is our users, whether through unprompted feedback, site metrics (we track pretty much everything a user does), or videotaped sessions, so I’d love to hear your thoughts on how we can further improve the site!

Announcing HCIR ’09!

I am proud to announce that HCIR 2009, the third Annual Workshop on Human-Computer Interaction and Information Retrieval, will take place at the Catholic University of America in Washington, DC on October 23, 2009!

You’ll recognize the familiar crew of organizers: Bill Kules, Ryen White, and yours truly. And we’ve lined up a great keynote speaker: Ben Shneiderman, professor at the University of Maryland and founding director of the Human-Computer Interaction Laboratory.

In contrast to previous years’ workshops, we will de-emphasize full-paper presentations and instead focus on what participants have told us they found most valuable: posters and directed group discussions.

Also, as I mentioned in an earlier post, we see this year’s workshop as an opportunity for the HCIR community to engage with the federal public sector. If you would like to get involved or you have any questions about the workshop, please reach out to me, either publicly here or privately by email.

Great Blogging Tips from SEOmoz

I know that a number of you here are bloggers and trying to earn greater visibility for your blogs. I suggest you check out the “21 Tips to Earn Links and Tweets to Your Blog Post” at the SEOmoz blog. It’s great advice that you can follow with a clear conscience.

What Exactly is the Associated Press Announcing?

The blogosphere is in a tizzy over a press release from the Associated Press that begins as follows:

The Associated Press Board of Directors today announced it would launch an industry initiative to protect news content from misappropriation online.

AP Chairman Dean Singleton said the news cooperative would work with portals and other partners who properly license content – and would pursue legal and legislative actions against those who don’t.

“We can no longer stand by and watch others walk off with our work under misguided legal theories,” Singleton said at the AP annual meeting, in San Diego.

As part of the initiative, AP will develop a system to track content distributed online to determine if it is being legally used.

The rest of the press release is about rate reductions and new “Limited” service–none of which are attracting much attention. Rather, everyone from the New York Times to Gawker is treating this press release like a declaration of war.

While the AP’s tone is angry, it seems premature to comment on the substance of their tactics until we know more. Saying they’ll use legal means to fight illegal activity is not only vague, but hardly objectionable in principle. Why don’t we wait to find out what they’re actually planning to do before going medieval on them?

Of course, if I get sued for copying over 100 words of their press release without licensing it, then I suppose I might change my tune.

Google Preferred Sites

I just read over at Micro Persuasion that Google’s experimental “preferred sites” feature is now available to all users via Google Labs. The feature allows users to specify web sites that receive preferential treatment in their personal search results.

I like the feature in theory–it moves some of the control from Google’s black-box relevance ranking algorithm to the user, which is where I think it belongs. I would like to know exactly what “preferred” means–I still am wary of opaque personalization algorithms.

In practice? Well, I can’t tell you, because I couldn’t get the “My preferred site” links to show up, either on Firefox or Internet Explorer. Would be curious to hear from others who have.

Here’s a screen shot from Google:

It debuted in January but at the time was not available to all users. Now anyone can sign up for the feature via Google Labs.

Data Mining Case Studies Workshop and Practice Prize

I was recently alerted to the Data Mining Case Studies Workshop and Practice Prize:

The Data Mining Case Studies Workshop and Practice Prize was established to showcase the very best in data mining case deployments. Data Mining Case Studies continues into its third year, to be held at KDD2009. Data Mining Case Studies will highlight data mining implementations that have been responsible for a significant and measurable improvement in business operations, or an equally important scientific discovery, or some other benefit to humanity.

The site states a final submission deadline of April 8th, but one of the organizers told me that they are willing to offer extensions. Further details and contact information are available at http://www.dataminingcasestudies.com/.

Gmail Search Autocomplete

Google just announced a new experimental feature through Gmail Labs that auto-completes searches in the Gmail application based on your contacts and its operator syntax.

It’s a nice feature, and it does a lot to mitigate the lack of navigational functionality in the Gmail interface. Unfortunately, it requires using the current Gmail interface, which is not compatible with the CustomizeGoogle Firefox extension.

One thing I’ve never understood is why the Gmail search engine doesn’t offer spelling correction. That seems like an odd omission from the company that popularized “did you mean?”. Does anyone know why this feature is unavailable in an application where it would be extremely useful?

Wired.com Gutted: It Wasn’t Me

I know I went on a bit of a rampage about Wired.com editor-in-chief Chris Anderson today after finding out how he’d misquoted Peter Norvig last year. I just read in Gawker that Wired.com is being “gutted”. They cite this Silicon Alley Insider post (which now adds this update: Earlier, we were told by a source that Wired.com was gutted in New York. Now we’re told by the company that only three people were let go at Wired.com. A rep adds: “None of Wired.com’s editorial content will change in any way.”).

I just want to say that I didn’t know about the cuts until just now, and I take no pleasure in reading about them. I feel for the staff–these are tough times in offline and online media.

Duck Duck Kumo?

I just read in Advertising Age that Microsoft is planning to spend as much as $100M on a marketing campaign for its new “Kumo” search engine. For perspective, that’s about as much as they spent to acquire Powerset, and almost as much as Endeca’s revenue in 2007. And they’re spending it on ads. I’m not an Apple fanboy by any means, but I can’t help thinking of this “I’m a Mac / I’m a PC” clip. Still, in a multi-billion dollar business, I suppose $100M is chump change.

But what jumped out at me from the article was this paragraph:

According to one person close the situation, the forthcoming campaign will be careful to not position “Kumo” as a competitor to Yahoo or Google and instead cast it as a reimagined search engine that ups the game by yielding fewer but more-focused results. The proposed strategy is probably a good — if not the only — way to go.

That sounds a lot like…Duck Duck Go. I know that Stefan Weitz, director of Live Search, and Gabriel Weinberg, who is Duck Duck Go, at least occasionally read this blog. I’m curious whether my observations are even close, and what the coincidence of vision bodes for either effort. I assume that Weinberg isn’t planning to spend $100M on advertising.