The Noisy Channel

 

Amazon: Customers Who Bought Related Items Also Bought

January 31st, 2009 · 14 Comments · General

[Screenshot: Amazon: Customers Who Bought Related Items Also Bought]

Perhaps Amazon has had this feature for a while, but today I noticed, for the first time, a section labeled “Customers Who Bought Related Items Also Bought,” as seen in the screenshot above. I was looking at an unreleased book, which might explain why they couldn’t show me information based on customers who actually bought the item.

Has anyone else noticed this? Am I just late to the party? I tried to find more information online, but nothing showed up. I assume they are using some item similarity measure to assemble a set of related items, and then are basing collaborative filtering on the purchase history associated with that set.
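To make that guess concrete, here is a minimal Python sketch of the two-step idea: find related items by content similarity, then pool the "also bought" counts of those related items. This is only my assumption about how such a feature could work, not Amazon's actual method. The baskets, item names, and attribute sets are made up, and Jaccard overlap on nominal attributes is just one plausible similarity measure.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories: one set of item IDs per customer.
baskets = [
    {"ir_book", "ml_book", "stats_book"},
    {"ir_book", "ml_book"},
    {"ir_book", "search_ui_book"},
    {"hci_book", "search_ui_book"},
    {"hci_book", "design_book"},
]

# Hypothetical nominal attributes (subjects, tags) per catalog item.
# "new_book" is unreleased, so it appears in no basket.
attributes = {
    "new_book": {"search", "information retrieval", "hci"},
    "ir_book": {"search", "information retrieval"},
    "hci_book": {"hci", "design"},
    "search_ui_book": {"search", "hci", "design"},
    "ml_book": {"machine learning", "statistics"},
    "stats_book": {"statistics"},
    "design_book": {"design"},
}


def co_purchase_counts(baskets):
    """Item-item matrix: counts[a][b] = customers who bought both a and b."""
    counts = defaultdict(Counter)
    for basket in baskets:
        for a, b in permutations(basket, 2):
            counts[a][b] += 1
    return counts


def jaccard(a, b):
    """Overlap of two attribute sets, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_items(item, attributes, candidates, k=2):
    """The k candidates most similar to `item`, as (similarity, id) pairs."""
    scored = [
        (jaccard(attributes[item], attributes[other]), other)
        for other in candidates
        if other != item
    ]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


def recommend_for_unpurchased(item, baskets, attributes, k=2, top_n=3):
    """Pool the 'also bought' counts of related items, weighted by similarity."""
    counts = co_purchase_counts(baskets)
    neighbours = related_items(item, attributes, list(counts), k)
    neighbour_ids = {other for _, other in neighbours}
    scores = Counter()
    for sim, other in neighbours:
        for candidate, n in counts[other].items():
            # Skip the item itself and the related items used as seeds.
            if candidate != item and candidate not in neighbour_ids:
                scores[candidate] += sim * n
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))[:top_n]


recs = recommend_for_unpurchased("new_book", baskets, attributes)
for candidate, score in recs:
    print(f"{candidate}: {score:.2f}")
```

With this toy data, the two related items chosen for new_book are ir_book and search_ui_book, and the top suggestion is ml_book, because it was bought twice alongside the closest related item.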

I’m very curious to hear more from anyone who is familiar with this functionality.

14 responses so far ↓

  • 1 Michael Bernstein // Jan 31, 2009 at 6:02 pm

    From what I’ve read of this, it’s using item-item collaborative filtering. Most collaborative filtering has a matrix of items-users, trying to figure out what items a particular user would want. This one is using items as both the rows and columns of the matrix, trying to figure out what other items are popular given an interest in a starting item.

  • 2 Daniel Tunkelang // Jan 31, 2009 at 6:42 pm

    But in this case, the item it’s using has no purchase history. So they must be taking a semi-supervised approach, using some rule-based or statistical similarity measure to identify related items that do have a purchase history, and then combining the collaborative filtering results obtained from those items. I’ve seen this idea described in research papers on data mining, but this is the first time I’ve seen it implemented.

  • 3 Bryan Zug // Jan 31, 2009 at 8:07 pm

    Couldn’t they also be looking at a correlation of pre-orders or wishlists for the product to seed the correlation? For products that are not yet released, it seems like an actionable substitute to drive results.

  • 4 Stefan Constantinescu // Jan 31, 2009 at 8:16 pm

    I’ve been using the feature you described for years to help me find music and films to download and books to read.

  • 5 Daniel Tunkelang // Feb 1, 2009 at 7:32 am

    Stefan, have you seen this feature enabled for pre-release items not yet available for sale? That’s the part that struck me as new.

  • 6 Daniel Tunkelang // Feb 1, 2009 at 7:35 am

    Bryan, that’s possible, but in this case I doubt it. I happen to like the book I used as an example (I reviewed it for the publisher), but I doubt enough people know about it to have pre-ordered it. Maybe more people know now, since this post made it onto Techmeme. :-)

  • 7 jakrose // Feb 1, 2009 at 8:52 am

    This has been around for years. They also look at your IP’s history, the Amazon browsing history of your account, geo stats, etc.

    Amazon attributes a huge chunk of its profits to recommendations. It works. Plenty of white papers on it out there to dig through.

  • 8 Daniel Tunkelang // Feb 1, 2009 at 9:03 am

    Jakrose, I know Amazon’s been using collaborative filtering for years, as well as other forms of personalization. Do your research on me–you’ll see I’m hardly a stranger to this space.

    But have you seen this specific feature before–customers who bought *related items* also bought? I haven’t, and I found no leads when I searched. That’s what I thought was newsworthy.

  • 9 Chris Betti // Feb 2, 2009 at 10:12 am

    I wonder if they’re using the same notion of related content available through the Amazon Associates Web Service. The public API defines a RelatedItems response group. The user can (is required to?) submit a RelationshipType value in order to get one of these groups back. Sample values include Episode, Season, Tracks, and Variation (you can guess how that might apply to the various lines of products Amazon sells). See this link for more information: http://docs.amazonwebservices.com/AWSECommerceService/2008-08-19/DG/index.html?CHAP_OrganizationofItemsforSaleonAmazon.html

    The other thought I had was the classification systems that libraries use to organize published books. “Related items” might mean items close to the item in question in the library stacks.

    I don’t think either of these suggestions would require very much oversight. Other than that… could the Google Image Labeler game be applied to item similarity somehow? I’d like to play that.

  • 10 Daniel Tunkelang // Feb 2, 2009 at 10:47 am

    Chris, I did see that page when I tried to research the feature. But it left me with two questions:

    1) What RelationshipType values or combination thereof do they use? This is content-driven similarity, not the collaborative filtering for which Amazon is famous. That’s not to say they can’t do both–they obviously do. But I’m curious if they’ve published anything about it.

    2) How do they then use the set of content-driven related items as inputs to their collaborative filtering engine? Do they assign weights based on some measure of similarity? How do they account for the diversity of results, which might confound a vector-based approach?

    OK, that’s more than two questions. But it gives you an idea of why I find this so interesting. And why I’m surprised not to find anything about it on the web. For all I know, this feature has been available for a while, but no one else seems to have taken the time to notice it.

  • 11 Lee Romero // Feb 3, 2009 at 9:43 am

    I am not sure how they might be pulling off the “related items” concept, but I’d be interested in any insights you gain on it, Daniel (so hopefully, you’ll share what you find in a future post). I’ve tried to work out a way to do this with information about employees in an enterprise – I have built a simple solution but not one I’m happy with. I’ve described the work on my blog but haven’t made any progress since that write-up.

  • 12 Daniel Tunkelang // Feb 3, 2009 at 12:12 pm

    Lee, I’ll share what I learn. I think the interesting question is what approach they take to content-based similarity, especially given that their products have nominal rather than numerical attributes. I looked at this problem several years ago; you can find my SIAM Data Mining 2002 paper here:

    http://www.cs.cmu.edu/~quixote/NearestNeighbor.pdf

  • 13 Max L. Wilson // Feb 10, 2009 at 6:20 am

    I’m looking at another unpublished item (http://www.amazon.co.uk/Exploratory-Search-Query-response-Synthesis-Information/dp/159829783X/ref=sr_1_2), which doesn’t have any of these “customers who bought related items” recommendations. However, it does have three indications of the notion of related items: subjects, categories, and tags.

    Surely this system is just picking the most similar books and giving you their standard collaborative filtering results?

    Ian Ruthven gave a great keynote on context at IIiX2008 in London a while back, and discussed the full range of Amazon’s similarity assessments, particularly focusing on books (http://irsg.bcs.org/iiix2008/presentations/Ruthven.ppt).

  • 14 Daniel Tunkelang // Feb 10, 2009 at 7:49 am

    “Surely this system is just picking the most similar books and giving you their standard collaborative filtering results?”

    I don’t doubt it, but that’s a bit underspecified. How many “most similar” books? Do they contribute equal weights, or are they weighted based on the degree of similarity? For that matter, is the weighting linear? Do they do anything to address diversity within the set, i.e., books that are similar to the unpublished item but very different from one another? And how “standard” is their collaborative filtering in the first place?

    Regardless, thanks for the pointer to Ruthven’s presentation. Interesting stuff, even if it doesn’t answer the above questions.

    And, on an unrelated note, I hope you like the subtitle of that book you were looking at. I’ll vouch for the quality of the contents; my own contributions as a reviewer were cosmetic.
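The weighting questions raised in the last comment (how many "most similar" books, equal or similarity-based weights, linear or not) can be made concrete with a small sketch. Everything here is hypothetical: the similarity values, the "also bought" counts, and the two knobs, `k` (number of related items used) and `gamma` (an exponent applied to similarity). It only shows that these choices change the ranking. It says nothing about what Amazon actually does, and it does not address diversity within the related set.

```python
from collections import Counter

# Hypothetical similarity of each related item to the unreleased item.
similarity = {"book_a": 0.9, "book_b": 0.5, "book_c": 0.4}

# Hypothetical "also bought" counts for each related item.
also_bought = {
    "book_a": Counter({"x": 3, "y": 1}),
    "book_b": Counter({"y": 4}),
    "book_c": Counter({"y": 2, "z": 5}),
}


def aggregate(similarity, also_bought, k, gamma):
    """Rank candidates pooled from the k most similar items.

    gamma = 0 gives equal weights, gamma = 1 gives weights linear in
    similarity, and gamma > 1 favours the closest items more strongly.
    """
    nearest = sorted(similarity, key=lambda item: (-similarity[item], item))[:k]
    scores = Counter()
    for item in nearest:
        weight = similarity[item] ** gamma
        for candidate, n in also_bought[item].items():
            scores[candidate] += weight * n
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [candidate for candidate, _ in ranked]


equal = aggregate(similarity, also_bought, k=3, gamma=0)
linear = aggregate(similarity, also_bought, k=3, gamma=1)
sharp = aggregate(similarity, also_bought, k=3, gamma=4)
single = aggregate(similarity, also_bought, k=1, gamma=1)

for name, ranking in [("equal", equal), ("linear", linear),
                      ("sharp", sharp), ("single", single)]:
    print(name, ranking)
```

On this toy data, equal weights put y first because all three related items contribute to it, while a steep weighting (`gamma=4`) puts x first because it comes from the single closest item.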
