Kubnal Bridge

Technical SEO

Schema (Structured Data Markup)


Schema markup is structured data added to a webpage using the Schema.org vocabulary, a shared standard founded by Google, Microsoft, Yahoo, and Yandex and maintained through an open community process. Rather than asking crawlers to infer meaning from prose, schema labels content directly: this is the Organization, this is its founder, this is an Article published on a specific date, this is a FAQPage with these Question / Answer pairs.

The recommended serialization format is JSON-LD, embedded in a `<script type="application/ld+json">` tag in the page head or body. Microdata and RDFa remain supported alternatives, but they are less common in modern implementations because JSON-LD decouples markup from rendered HTML and simplifies maintenance. That decoupling also makes it easier for markup to drift from the visible content, so the two must be kept in sync.
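As an illustration of the embedding described above, here is a minimal Python sketch that serializes a dictionary into a JSON-LD script tag. The helper name `render_json_ld` and the "Example Co" values are assumptions made for this example, not part of any library.

```python
import json


def render_json_ld(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> tag for embedding in HTML."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so parsers still read the same value.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical organization used only for illustration.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#organization",
    "name": "Example Co",
    "url": "https://example.com/",
}

print(render_json_ld(organization))
```

The returned string can be placed in the page head or body by whatever server-side template the site already uses.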

For schema to be discovered reliably, it should be present in the initial server-rendered HTML, not injected on the client after hydration via a framework component. Googlebot can render JavaScript, but many other crawlers, reportedly including Common Crawl's CCBot, GPTBot, and PerplexityBot, do not, so a schema script that exists only after client-side rendering is invisible to them. The same `<script>` tag should appear when the page is fetched with `curl` and when it loads in a browser.
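One way to approximate the `curl` check is to parse the raw HTML without executing any JavaScript and see which JSON-LD blocks are present. The sketch below uses only Python's standard library; the class and function names and both sample documents are hypothetical, and fetching the HTML is left out.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect JSON-LD blocks from raw HTML without running any scripts."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            # Raises json.JSONDecodeError if the block is not valid JSON.
            self.blocks.append(json.loads("".join(self._buffer)))


def extract_json_ld(raw_html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    return parser.blocks


# Server-rendered page: the schema is in the initial HTML.
server_html = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "WebSite", "name": "Example"}</script>
</head><body><div id="root"></div></body></html>"""

# Client-only page: the schema would be injected later by /app.js.
client_only_html = (
    '<html><head><script src="/app.js"></script></head>'
    '<body><div id="root"></div></body></html>'
)

print(len(extract_json_ld(server_html)))       # 1
print(len(extract_json_ld(client_only_html)))  # 0
```

A page that returns an empty list here gives a non-rendering crawler nothing to read, whatever the browser shows after hydration.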

Schema implementation is validated using Google's Rich Results Test and the Schema Markup Validator from Schema.org. Performance of schema-driven rich results is monitored in Google Search Console, which provides separate reports for supported rich result types such as Products, FAQs, Events, and Breadcrumbs. Site-audit crawlers such as Screaming Frog can check structured data coverage across an entire site.

Common entity types in B2B and editorial sites include Organization, Person, WebSite, BreadcrumbList, Article, FAQPage, Service, DefinedTerm, and Product. The most robust implementations cross-reference entities using `@id` URIs (e.g., `https://example.com/#organization`), allowing Google's Knowledge Graph and AI engines to resolve relationships unambiguously.
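Cross-references only help if every `@id` that is pointed at is also defined somewhere. A small sketch of a consistency check, assuming the page's entities are collected in a list of nodes (the function name and sample graph are hypothetical):

```python
def find_dangling_ids(graph: list) -> set:
    """Return @id references that no top-level node in the graph defines."""
    defined = {node["@id"] for node in graph if "@id" in node}
    referenced = set()

    def walk(value):
        if isinstance(value, dict):
            # A nested dict whose only key is "@id" is a reference.
            if set(value) == {"@id"}:
                referenced.add(value["@id"])
            else:
                for child in value.values():
                    walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    for node in graph:
        for child in node.values():
            walk(child)
    return referenced - defined


# Hypothetical graph: the Article points at a founder that is never defined.
graph = [
    {"@type": "Organization", "@id": "https://example.com/#organization"},
    {
        "@type": "WebSite",
        "@id": "https://example.com/#website",
        "publisher": {"@id": "https://example.com/#organization"},
    },
    {
        "@type": "Article",
        "@id": "https://example.com/post/#article",
        "author": {"@id": "https://example.com/#founder"},
    },
]

print(find_dangling_ids(graph))  # {'https://example.com/#founder'}
```

This only checks references within one page's graph; an `@id` defined on another page of the same site would need to be added to the list before checking.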

Why it matters in GEO / AI search

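In traditional SEO, schema unlocks rich results such as star ratings, FAQ accordions, and breadcrumb trails, which expand a result's SERP real estate and can lift click-through rate. Google has restricted or retired some of these features (FAQ rich results are now limited to a small set of sites, and the sitelinks search box has been retired), so eligibility should be checked against current documentation. In GEO / AI search, the role is more fundamental: schema serves as a primary signal AI engines can use to disambiguate entities when deciding whether to cite a source.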

When ChatGPT, Perplexity, Claude, or Gemini retrieve a page, the model has to answer a hard question: "Is this page about the same thing as the user's query?" Plain HTML forces the model to infer from prose, which is noisy. JSON-LD answers the question explicitly: `@type: Organization`, `@id: https://example.com/#organization`, `sameAs: [linkedin, x, crunchbase]` are three facts that resolve the entity in a single parse.

For new domains without an established Common Crawl footprint or Wikidata entry, structured data is often the only reliable entity-disambiguation signal available. Sites that ship Organization + Person + WebSite + per-page Article/Service schemas are better positioned to be cited than otherwise identical sites that don't.

Examples

Organization + Person cross-link

Emit Organization with `founder: { @id: #founder }` and a separate Person with `worksFor: { @id: #organization }`. The bidirectional `@id` resolution is what Knowledge Graph systems look for.
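A minimal sketch of that cross-link, built as a single `@graph` in Python. The base URL, names, and constant names are placeholders chosen for this example; absolute `@id` URIs are used, following the `https://example.com/#organization` convention mentioned earlier.

```python
import json

# Hypothetical identifiers for illustration only.
BASE = "https://example.com/"
ORG_ID = BASE + "#organization"
FOUNDER_ID = BASE + "#founder"

cross_linked = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": ORG_ID,
            "name": "Example Co",
            "url": BASE,
            # Organization -> Person
            "founder": {"@id": FOUNDER_ID},
        },
        {
            "@type": "Person",
            "@id": FOUNDER_ID,
            "name": "Jane Doe",
            # Person -> Organization
            "worksFor": {"@id": ORG_ID},
        },
    ],
}

print(json.dumps(cross_linked, indent=2))
```

Keeping the two identifiers in constants means the forward and backward references cannot drift apart through a typo.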

FAQPage tied to visible Q&A

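Describe your visible FAQ section with FAQPage schema. Each Question.name should match the visible H3 text verbatim, and Answer.text should match the visible paragraph: Google's structured data guidelines require marked-up content to be visible to users, and pages that violate them can lose rich result eligibility.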
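One way to keep schema and visible text aligned is to generate both from the same list of question/answer pairs, then verify the result against the page text. The sketch below assumes the visible text has already been extracted as a plain string; all function names and sample content are hypothetical.

```python
def build_faq_page(qa_pairs):
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def normalize(text):
    """Collapse whitespace so line wrapping does not cause false mismatches."""
    return " ".join(text.split())


def find_hidden_entries(faq_page, visible_text):
    """Return schema strings that do not appear in the visible page text."""
    page = normalize(visible_text)
    missing = []
    for question in faq_page["mainEntity"]:
        for text in (question["name"], question["acceptedAnswer"]["text"]):
            if normalize(text) not in page:
                missing.append(text)
    return missing


qa_pairs = [
    ("What is schema markup?", "Structured data added to a page using Schema.org."),
    ("Which format is recommended?", "JSON-LD."),
]
faq = build_faq_page(qa_pairs)

# Hypothetical visible text: the second answer was never rendered on the page.
visible = """What is schema markup?
Structured data added to a page using Schema.org.
Which format is recommended?"""

print(find_hidden_entries(faq, visible))  # ['JSON-LD.']
```

The substring check is deliberately strict; a looser comparison would hide exactly the kind of mismatch this is meant to catch.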

Article with publisher reference

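Article schema with `author: { @id: #organization }` and `publisher: { @id: #organization }` is a valid schema.org pattern for branded editorial content, the kind published on company blogs such as Stripe's or Notion's. When a post has a named human author, reference a Person entity in `author` instead.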
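A short sketch of the pattern, assuming the Organization entity is defined elsewhere on the site under the given `@id`. The function name, URLs, headline, and date are all placeholders.

```python
import json
from datetime import date


def build_article(url, headline, published, org_id):
    """Build Article JSON-LD whose author and publisher reference one Organization."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": url + "#article",
        "headline": headline,
        "datePublished": published.isoformat(),  # ISO 8601, e.g. 2024-01-15
        "mainEntityOfPage": url,
        # Both properties point at the same entity by reference, not by copy.
        "author": {"@id": org_id},
        "publisher": {"@id": org_id},
    }


article = build_article(
    url="https://example.com/blog/hello-world/",
    headline="Hello, world",
    published=date(2024, 1, 15),
    org_id="https://example.com/#organization",
)

print(json.dumps(article, indent=2))
```

Referencing the Organization by `@id` avoids repeating its name, logo, and `sameAs` links on every article page.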

BreadcrumbList per leaf

On a glossary or category leaf page, BreadcrumbList declares Home › Section › Term. Google can use this to render breadcrumb trails in search results, and AI engines can use it for hierarchy inference.
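A minimal sketch of a Home › Section › Term trail for a glossary leaf, expressed as an ordered list of `ListItem` entries. The function name, page names, and URLs are placeholders.

```python
import json


def build_breadcrumb_list(trail):
    """Build BreadcrumbList JSON-LD from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            # Positions are 1-based and follow the order of the trail.
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical trail for a glossary term page.
trail = [
    ("Home", "https://example.com/"),
    ("Glossary", "https://example.com/glossary/"),
    ("Schema", "https://example.com/glossary/schema/"),
]

breadcrumbs = build_breadcrumb_list(trail)
print(json.dumps(breadcrumbs, indent=2))
```

Deriving the trail from the site's routing or content hierarchy, instead of hand-writing it per page, keeps positions and URLs consistent across leaves.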
