Big news from Google, Yahoo, and Microsoft: the three web search leaders announced yesterday that they will jointly support a standard by which a web page can indicate the address of its “canonical” version. By using this standard, a site can avoid having duplicate copies of its pages indexed, and with it the SEO penalty of having those pages indexed poorly.
You can find coverage at:
- CNET News: Search giants join to tidy up Web addresses
- Search Engine Watch: Duplicated Confusion: The Canonical Edict From The Big Three
- SEOmoz: Canonical URL Tag – The Most Important Advancement in SEO Practices Since Sitemaps
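For readers who have not seen it yet, the standard is a `<link rel="canonical">` element placed in the page's `<head>`. Here is a minimal Python sketch of rendering one; the helper name `canonical_link_tag` and the example.com URL are my own illustrations, not part of the announcement.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Render the <link> element that points a page at its canonical URL.

    The helper name is hypothetical; only the rel="canonical" element
    itself comes from the announced standard.
    """
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


# Illustrative URL only; every duplicate of this page would carry the same tag.
print(canonical_link_tag("http://www.example.com/products?id=123&view=full"))
```

Each duplicate copy of a page would carry the same element, so the engines can fold them all into the one address it names.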
This is a great development for everyone, but especially for anyone building sites that use faceted search (which should be everyone!). One of the problems we identified early on at Endeca is that faceted search, if implemented naively, can lead to massive duplication of URLs. The whole point of a faceted information architecture is that there are many paths that lead to a given product or document page.
For example, consider a page that is associated with values from 10 facets. If a user can select those 10 values in any order, there may be 10! = 3,628,800 ways to reach it, and that’s assuming that none of the facets are hierarchical. In fairness, it also assumes that none of the paths contract from implicit selection. Regardless, the number of paths is large enough to be a problem for SEO if each path receives its own URL.
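To make the arithmetic concrete, here is a small Python sketch. It assumes a hypothetical URL scheme in which each facet selection becomes a `facet-value` path segment in the order the user clicked, and it canonicalizes simply by sorting the selections. The facet names and both helper functions are invented for illustration; this is not a description of any particular product's scheme.

```python
from itertools import permutations
from math import factorial


def path_for(ordered_selections):
    """Build a URL path from facet selections in the order they were made."""
    return "/" + "/".join(f"{facet}-{value}" for facet, value in ordered_selections)


def canonical_path(selections):
    """Pick one representative path by sorting the selections by facet name."""
    return path_for(sorted(selections))


# Three hypothetical facet selections that all describe the same result page.
selections = [("color", "red"), ("brand", "acme"), ("size", "m")]

# Every click order yields a distinct naive path...
naive_paths = {path_for(order) for order in permutations(selections)}
# ...but all of them collapse to the same canonical path.
canonical_paths = {canonical_path(order) for order in permutations(selections)}

print(len(naive_paths), len(canonical_paths))  # 6 1
print(factorial(10))  # 3628800
```

With three facets the six click orders collapse to a single path; with ten, the same sort would collapse all 3,628,800.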
Endeca recognized this problem a while ago, and addressed it through what we call “URL beautification”–our own means of canonicalizing URLs that, in addition to deduping the multiple paths, has the side benefit of creating URLs that are SEO-friendly.
Nonetheless, my colleagues and I are delighted to see the major web search engines recognizing this problem and making it easier for everyone to solve it. It’s a rare day to see Google, Yahoo, and Microsoft working together, but it’s nice when it happens. Good thing they got the news out before “Be Evil” day!