Canonical

What is Canonical?

If there are multiple versions of the same or very similar pages, the canonical link tag (rel="canonical") tells web crawlers which page is the definitive version. Each non-canonical page should point to the canonical version with this tag.

A canonical tag (aka “rel canonical”) is a way of telling search engines that a specific URL represents the master copy of a page. Using the canonical tag prevents problems caused by identical or “duplicate” content appearing on multiple URLs. Practically speaking, the canonical tag tells search engines which version of a URL you want to appear in search results.
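To make the tag concrete, here is a minimal sketch of how a crawler-like script might read it. The sample page, the class name CanonicalFinder, and the helper find_canonical are illustrative assumptions, not part of any search engine's actual implementation; only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page that declares its master copy in the <head>.
page = """
<html><head>
<title>Example</title>
<link rel="canonical" href="https://www.example.com/">
</head><body>Hello</body></html>
"""

print(find_canonical(page))  # prints https://www.example.com/
```

A page without the tag simply yields None, in which case a search engine is left to choose the preferred URL on its own.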

Does canonicalization matter?

Duplicate content is a complicated subject, but when search engines crawl many URLs with identical (or very similar) content, it can cause several SEO problems. First, if search crawlers must wade through too much duplicate content, they may miss some of your unique content. Second, large-scale duplication may dilute your ranking ability. Finally, even if your content does rank, search engines may pick the wrong URL as the “original.” Using canonicalization helps you control your duplicate content.

The problem with URLs

You might be thinking “Why would anyone duplicate a page?” and wrongly assume that canonicalization isn’t something you have to worry about. The problem is that we, as humans, tend to think of a page as a concept, such as your homepage. For search engines, though, every unique URL is a separate page.

For example, search crawlers might be able to reach your homepage in all of the following ways:

  • http://www.example.com
  • https://www.example.com
  • https://example.com
  • https://example.com/index.php
  • https://example.com/index.php?r…

To a human, all these URLs represent a single page. To a search crawler, though, every single one of these URLs is a unique “page.” Even in this limited example, there are five copies of the homepage in play, and this is just a small sample of the variations you might encounter.
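The way several URLs collapse into one page can be sketched in a few lines. This is only an illustration under stated assumptions: the preferred scheme and host, the set of default documents, and the function name normalize are all hypothetical choices, and the variant list is similar to (not identical with) the one above.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: the site prefers https and the www host,
# and /index.php serves the same content as the root path.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
KNOWN_HOSTS = {"example.com", "www.example.com"}
DEFAULT_DOCUMENTS = {"/index.php"}


def normalize(url):
    """Map a homepage variant onto the single preferred URL form."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the root
    if path in DEFAULT_DOCUMENTS:
        path = "/"
    host = parts.netloc.lower()
    if host in KNOWN_HOSTS:
        host = PREFERRED_HOST
    return urlunsplit((PREFERRED_SCHEME, host, path, parts.query, ""))


variants = [
    "http://www.example.com",
    "https://www.example.com",
    "https://example.com",
    "https://example.com/index.php",
]

# Four distinct strings, one page.
print({normalize(u) for u in variants})  # prints {'https://www.example.com/'}
```

A crawler that does not apply this kind of normalization treats each string as its own page, which is why the canonical tag is needed to state the preferred form explicitly.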

Modern content management systems (CMS) and dynamic, code-driven websites exacerbate the problem. Many sites automatically add tags, allow multiple paths (and URLs) to the same content, and add URL parameters for searches, sorts, currency options, etc. You may have thousands of duplicate URLs on your site and not even realize it.
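One rough way to spot parameter-driven duplicates is to strip the parameters that only change presentation and see which URLs collapse together. The sketch below assumes hypothetical parameter names (sort, currency) and made-up product URLs; which parameters are safe to ignore depends entirely on the site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters change ordering or display, not content.
IGNORED_PARAMS = {"sort", "currency"}


def strip_ignored(url):
    """Drop ignorable query parameters and sort the rest for stable comparison."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Return {stripped_url: [original urls]} for groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_ignored(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical crawl results.
crawled = [
    "https://example.com/shoes",
    "https://example.com/shoes?sort=price",
    "https://example.com/shoes?currency=EUR&sort=name",
    "https://example.com/hats?color=red",
]

for target, members in group_duplicates(crawled).items():
    print(target, "<-", len(members), "URLs")
```

Each group found this way is a candidate for a canonical tag pointing at the stripped URL.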

 
