Canonical Tag

A canonical tag is an HTML link element placed in the `<head>` of a webpage — `<link rel="canonical" href="https://example.com/page" />` — that tells search engines which URL is the authoritative version when duplicate or near-duplicate content exists across multiple URLs. It was introduced jointly by Google, Microsoft, and Yahoo in 2009 to give site owners a way to consolidate ranking signals.

Canonicals are most often needed when the same content is reachable via multiple URLs: with and without `www`, with `http://` and `https://`, with and without trailing slashes, with tracking parameters (`?utm_source=...`), or with case variations. A canonical tag declares the preferred URL, and search engines fold the ranking signals from the variants into that single URL.
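The variants listed above can be collapsed programmatically, for example when generating canonical hrefs or deduplicating a crawl list. The following is a minimal sketch, not a standard algorithm: `toPreferredUrl` and `TRACKING_PARAMS` are hypothetical names, and the policy it encodes (https, no `www`, no trailing slash, lowercase path) is an assumption that should match whatever your site actually serves.

```typescript
// Assumed list of tracking parameters; extend it for your own analytics setup.
const TRACKING_PARAMS = /^(utm_[a-z_]+|gclid|fbclid)$/i;

// Hypothetical helper: map any URL variant onto one preferred form.
// Lowercasing the path is only safe if the server treats paths case-insensitively.
function toPreferredUrl(raw: string): string {
  const url = new URL(raw);
  url.protocol = "https:";
  url.hostname = url.hostname.replace(/^www\./, "");
  url.hash = "";

  // Drop tracking parameters; other query parameters are kept in their original order.
  for (const key of Array.from(url.searchParams.keys())) {
    if (TRACKING_PARAMS.test(key)) url.searchParams.delete(key);
  }

  // Lowercase the path and strip a trailing slash, except on the root path.
  let path = url.pathname.toLowerCase();
  if (path.length > 1 && path.endsWith("/")) path = path.slice(0, -1);
  url.pathname = path;

  return url.toString();
}
```

Under this policy, `http://www.example.com/Products/?utm_source=email` and `https://example.com/products` both map to the same string, which is the URL the canonical tag would declare.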

A "self-referencing canonical" is a canonical tag that points to the same URL the page lives at. This is the recommended default for every indexable page — it explicitly states "this URL is the canonical version of itself" and prevents accidental misinterpretation when the page is fetched via a slightly different URL form.

In Next.js (App Router), canonicals are set via the `alternates.canonical` field in the `metadata` or `generateMetadata` export. Metadata is resolved per route segment, from the root layout down to the page, and a child segment that sets `alternates` replaces the parent's value. A common bug in Next.js sites is a root-layout canonical pointing to `/` that propagates to every child page that does not override it, so the entire site canonicalizes to the homepage.
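A per-page canonical might be produced along these lines. This is a sketch under assumptions: `SITE_ORIGIN`, `canonicalFor`, and the `/blog/[slug]` route are illustrative, and `CanonicalMetadata` is a local stand-in for the relevant slice of Next.js's `Metadata` type so the snippet compiles on its own. In a real route file you would import `Metadata` from `next` and export the function.

```typescript
// Local stand-in for the part of Next.js's `Metadata` type used here.
type CanonicalMetadata = { alternates: { canonical: string } };

// Assumed site origin; replace with your own.
const SITE_ORIGIN = "https://example.com";

// Hypothetical helper: build a self-referencing canonical for a route path.
// Leading and trailing slashes are trimmed so "/blog/x/" and "blog/x" agree.
function canonicalFor(path: string): CanonicalMetadata {
  const trimmed = path.replace(/^\/+|\/+$/g, "");
  return { alternates: { canonical: `${SITE_ORIGIN}/${trimmed}` } };
}

// Shape of a page-level export, e.g. in app/blog/[slug]/page.tsx (add `export` there).
// Depending on the Next.js version, `params` is a plain object or a Promise;
// awaiting it handles both cases.
async function generateMetadata(
  { params }: { params: Promise<{ slug: string }> | { slug: string } }
): Promise<CanonicalMetadata> {
  const { slug } = await params;
  return canonicalFor(`/blog/${slug}`);
}
```

Deriving the canonical from the route's own path, rather than hard-coding it in a layout, keeps each page self-referencing by construction.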

Validation tools include Screaming Frog's SEO Spider (detects canonical chains, conflicting tags, and non-self-referencing canonicals on key pages), Google Search Console's URL Inspection tool (shows which URL Google chose as canonical, which may differ from the declared canonical), and Ahrefs/SEMrush site audits.

Why it matters in GEO / AI search

In GEO and AI search, canonicals arguably matter even more than in traditional SEO, because AI engines that retrieve pages through a search index tend to treat the canonical URL as the identifier for the page. When an assistant such as ChatGPT cites a source, the URL it surfaces is then likely to be the canonical one, not whichever variant happened to be retrieved. A misconfigured canonical means the cited URL may not match the page users should land on, hurting both attribution and downstream click-through.

For sites built on Next.js, React, or other frameworks with a root layout, the single highest-leverage canonical check is: does every page have a self-referencing canonical, or do they all inherit a root canonical? Run `curl -sL https://example.com/some-deep-page | grep -i canonical` on five random URLs. If they all return the homepage URL, every page on the site is silently telling Google "I am the homepage", an error that can keep a brand-new site from being indexed beyond its homepage.
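The same spot check can be scripted. The sketch below is one possible approach, with hypothetical function names (`extractCanonical`, `classifyCanonical`, `auditUrls`). It uses a regular expression rather than a real HTML parser, so it will miss edge cases such as unquoted attributes or tags inside comments, and the network step assumes a runtime with a global `fetch` (for example Node 18 or later).

```typescript
// Pull the href out of the first <link rel="canonical"> tag, if any.
function extractCanonical(html: string): string | null {
  const tags = html.match(/<link\b[^>]*>/gi) ?? [];
  for (const tag of tags) {
    if (!/\brel\s*=\s*["']canonical["']/i.test(tag)) continue;
    const href = tag.match(/\bhref\s*=\s*["']([^"']*)["']/i);
    return href ? href[1] : null;
  }
  return null;
}

// "other" means the canonical points somewhere else, which may be intentional.
type CanonicalStatus = "self" | "homepage" | "other" | "missing";

function classifyCanonical(pageUrl: string, html: string): CanonicalStatus {
  const href = extractCanonical(html);
  if (href === null) return "missing";
  const page = new URL(pageUrl);
  const canonical = new URL(href, pageUrl); // resolves relative hrefs against the page URL
  if (canonical.href === page.href) return "self";
  if (canonical.origin === page.origin && canonical.pathname === "/") return "homepage";
  return "other";
}

// Fetch each URL (following redirects) and classify its declared canonical.
async function auditUrls(urls: string[]): Promise<Record<string, CanonicalStatus>> {
  const report: Record<string, CanonicalStatus> = {};
  for (const url of urls) {
    const res = await fetch(url, { redirect: "follow" });
    report[url] = classifyCanonical(res.url, await res.text());
  }
  return report;
}
```

If `auditUrls` reports `"homepage"` for a handful of deep pages, the root-layout inheritance problem is the first thing to rule out.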

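Canonicals do not prevent indexing; they consolidate signals. If you want a page out of the index entirely, use `noindex`, not a canonical. Combining the two sends conflicting signals: a `noindex` page whose canonical points at another URL asks search engines both to drop the page and to treat it as equivalent to the target, and the `noindex` can end up being applied to the canonical target as well. Likewise, a canonical that points to a `noindex` page names a preferred URL that cannot be indexed. Cross-check both for any URL you want to remove or consolidate.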
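One way to run that cross-check over crawl output is sketched below. The `PageSignals` shape and `findConflicts` function are hypothetical, and the input is assumed to come from a crawler export that already records each page's declared canonical and whether it carries `noindex`.

```typescript
interface PageSignals {
  url: string;       // absolute URL of the page
  canonical: string; // absolute URL declared in rel="canonical"
  noindex: boolean;  // true if a robots meta tag or X-Robots-Tag header says noindex
}

// Flag pages where canonical and noindex send mixed signals.
function findConflicts(pages: PageSignals[]): string[] {
  const byUrl = new Map<string, PageSignals>();
  for (const page of pages) byUrl.set(page.url, page);

  const issues: string[] = [];
  for (const page of pages) {
    // A self-referencing canonical cannot conflict with the page's own noindex.
    if (page.canonical === page.url) continue;

    if (page.noindex) {
      issues.push(`${page.url}: noindex page canonicals to ${page.canonical}`);
    }
    // Targets outside the crawl are unknown and therefore not flagged.
    const target = byUrl.get(page.canonical);
    if (target?.noindex) {
      issues.push(`${page.url}: canonical target ${page.canonical} is noindex`);
    }
  }
  return issues;
}
```

Every flagged URL needs a decision between the two goals: consolidate it (canonical, no `noindex`) or remove it (`noindex`, self-referencing or no canonical).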

Examples

Self-referencing canonical (the right default)

Every indexable page emits `<link rel="canonical" href="<own absolute URL>" />`. In Next.js: `alternates: { canonical: 'https://example.com/the-page' }` in the route's `metadata` export or `generateMetadata` function.

Parameter consolidation

`example.com/products?utm_source=email` and `example.com/products` both carry `<link rel="canonical" href="https://example.com/products" />`. Ranking signals earned by the tracked URL and by the clean URL are consolidated onto the canonical version.

Cross-domain canonical

A guest post syndicated to a partner blog can point its canonical back to the original on your domain. Use sparingly: Google sometimes ignores cross-domain canonicals and picks its own.

Anti-pattern: site-wide root canonical inheritance

The root layout sets `canonical: 'https://example.com/'` and no page overrides it. Every page inherits that value and canonicalizes to the homepage instead of to itself. Result: the site appears to be a one-page site to Google and AI engines.
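The inheritance behind this anti-pattern can be illustrated with a simplified model. The code below is not the framework's actual resolver; it only imitates a shallow, top-level merge in which later segments win, and the names `Meta` and `resolveMetadata` are made up for the illustration.

```typescript
type Meta = { title?: string; alternates?: { canonical?: string } };

// Simplified model: merge segments from root to page, later segments win per top-level key.
function resolveMetadata(...segments: Meta[]): Meta {
  return segments.reduce<Meta>((acc, seg) => ({ ...acc, ...seg }), {});
}

const rootLayout: Meta = {
  title: "Example",
  alternates: { canonical: "https://example.com/" },
};

// This page sets a title but no canonical, so the root value survives the merge.
const pageWithoutOverride: Meta = { title: "Pricing" };

// This page declares its own canonical, which replaces the inherited one.
const pageWithOverride: Meta = {
  title: "Pricing",
  alternates: { canonical: "https://example.com/pricing" },
};

const leaked = resolveMetadata(rootLayout, pageWithoutOverride);
const fixed = resolveMetadata(rootLayout, pageWithOverride);

console.log(leaked.alternates?.canonical); // the homepage URL, inherited from the root layout
console.log(fixed.alternates?.canonical);  // the page's own URL
```

Under this model the safest fix is to remove the canonical from the root layout entirely and set it per page, so a missing override shows up as a missing tag and not as a wrong one.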
