Keyword Research

Last reviewed May 2026

Keyword research is the process of discovering, analyzing, and prioritizing the words and phrases potential customers use to find information, products, and services through search engines. It is the input that shapes content strategy, site architecture, and on-page optimization: which topics to write about, which terms to target per page, and what depth a piece needs to compete.

Modern keyword research evaluates four signals per term: search volume (monthly searches, which can be seasonal), keyword difficulty (a competitive proxy typically derived from the referring-domain strength of the current top-10 pages), search intent (informational, commercial, transactional, or navigational), and SERP features (whether the results page shows AI Overviews, featured snippets, video carousels, "People also ask," or a local pack, each of which changes the click-through math).

The toolset is mature: Ahrefs Keywords Explorer (strongest backlink-derived difficulty data), SEMrush Keyword Magic Tool (largest keyword database), Moz Keyword Explorer, Google Keyword Planner (volume only, but it's Google's own data), Answer The Public (question-shaped queries), and Google Search Console (queries you already rank for but underperform on). For AI search, also relevant: Perplexity's related-questions panel and ChatGPT's conversational follow-ups, both of which expose latent query patterns traditional tools miss.

The strategic distinction in 2026 is between "rank-for-clicks" keywords (where the goal is to drive traffic to your site) and "cite-for-authority" keywords (where the goal is to be the source AI engines quote when answering related questions). The same keyword tools work for both, but the prioritization changes: cite-for-authority terms favor depth and fact density even at low volume, because each AI citation creates multiple indirect surfaces.

The output of a strong keyword research process is not a list of terms but a topic-cluster architecture: a primary "pillar" keyword per content cluster, supporting "cluster" terms for each spoke page, and an internal linking matrix that connects them. Without that structure, individual keyword wins don't compound into topical authority.

Why it matters in GEO / AI search

For traditional SEO, keyword research determines what gets written and what ranks. For GEO, it determines what AI engines cite — which means the input data changes. Search volume from Google Keyword Planner shows what users type into Google; it doesn't reflect what users ask ChatGPT or Perplexity, which tend to skew toward longer, more conversational, more comparison-heavy queries. A GEO-aligned research process incorporates AI-assistant query logs (where available), Reddit/Quora question patterns, and Perplexity's suggested follow-ups.

The "cite-for-authority" keyword set is often invisible to traditional volume metrics. Queries like "how does GEO differ from AEO" or "what is the right llms.txt configuration for a B2B SaaS" might have under 50 monthly searches in Google, but each AI citation on those terms creates compound surface area across ChatGPT, Perplexity, Claude, and Gemini, all of which can surface the same underlying source in answers to thousands of related queries. Low-volume, high-citability terms can outperform high-volume head terms by as much as 10x in pipeline contribution.

A common mistake is keyword research that optimizes for individual pages instead of topic clusters. AI engines (and increasingly Google) reward sites that demonstrate breadth across a topic: a single page about "schema markup" cites worse than a site that has pages on Schema, Organization schema, FAQPage schema, Article schema, BreadcrumbList schema, and rich results — all interlinked. Keyword research should produce cluster architectures, not flat lists.

Examples

Pillar keyword + supporting cluster

Pillar: "generative engine optimization." Cluster: llms.txt, AI crawler access, citability scoring, schema for AI engines, AI Overviews, ChatGPT optimization, Perplexity citations. Pillar page links to every cluster page; cluster pages link to pillar and to two sibling clusters.
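The linking rule in this example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `build_link_map` is hypothetical, and choosing the "two sibling clusters" as the next two pages in list order (wrapping around) is an assumption, since the text does not say how siblings are picked.

```python
def build_link_map(pillar: str, clusters: list[str], siblings: int = 2) -> dict[str, list[str]]:
    """Return a mapping of page -> outbound internal links for one topic cluster."""
    # The pillar page links to every cluster page.
    links = {pillar: list(clusters)}
    n = len(clusters)
    for i, page in enumerate(clusters):
        # Assumption: siblings are the next pages in list order, wrapping around.
        count = min(siblings, n - 1)
        neighbours = [clusters[(i + k) % n] for k in range(1, count + 1)]
        # Each cluster page links back to the pillar and to its siblings.
        links[page] = [pillar] + neighbours
    return links


clusters = [
    "llms.txt",
    "AI crawler access",
    "citability scoring",
    "schema for AI engines",
    "AI Overviews",
    "ChatGPT optimization",
    "Perplexity citations",
]
link_map = build_link_map("generative engine optimization", clusters)

for page, targets in link_map.items():
    print(f"{page} -> {', '.join(targets)}")
```

With the wrap-around rule, every cluster page receives the same number of sibling links, which keeps the internal linking matrix even; a real site might instead pick siblings by topical closeness.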

Intent-stratified keyword set

For one product, separate intent buckets: informational ("what is X"), commercial ("X vs Y," "best X for Z"), transactional ("buy X," "X pricing"). Each bucket maps to a different page type — guide, comparison, product page — not a single keyword-stuffed landing page.
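A rough sketch of how such bucketing might be automated follows. The regex patterns, the `classify_intent` and `bucket_keywords` names, and the first-match-wins ordering are all illustrative assumptions; real intent classification usually also inspects the SERP itself. Navigational intent is left out because it needs a list of brand names.

```python
import re

# Hypothetical patterns. Order matters: the first matching rule wins.
INTENT_RULES = [
    ("transactional", re.compile(r"\b(buy|pricing|price|discount|order)\b")),
    ("commercial", re.compile(r"\b(vs|versus|best|reviews?|alternatives?)\b")),
    ("informational", re.compile(r"^(what|how|why|when|who)\b|\bguide\b")),
]

# Page type per intent bucket, as described in the text above.
PAGE_TYPE = {
    "informational": "guide",
    "commercial": "comparison",
    "transactional": "product page",
}


def classify_intent(keyword: str) -> str:
    """Return the first intent whose pattern matches, else 'unclassified'."""
    text = keyword.lower()
    for intent, pattern in INTENT_RULES:
        if pattern.search(text):
            return intent
    return "unclassified"


def bucket_keywords(keywords: list[str]) -> dict[str, list[str]]:
    """Group keywords by classified intent, preserving input order."""
    buckets: dict[str, list[str]] = {}
    for kw in keywords:
        buckets.setdefault(classify_intent(kw), []).append(kw)
    return buckets


buckets = bucket_keywords(["what is X", "X vs Y", "best X for Z", "buy X", "X pricing"])
for intent, kws in buckets.items():
    print(intent, "->", PAGE_TYPE.get(intent, "review manually"), kws)
```

Anything that lands in "unclassified" is a candidate for manual review rather than a page type of its own.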

Low-volume citation goldmine

A term with 30 monthly Google searches but high relevance to GEO buyer questions ("how to verify ChatGPT cites my site") is worth more than a 5,000-volume head term, because every AI citation on it creates dozens of indirect surfaces.

GSC underperforming queries

Search Console shows pages that rank in positions 6-12 for a query with high impressions but low CTR. Those are the cheapest wins: the page already exists, the query is already getting impressions, and a structural rewrite (answer-first lead, schema, internal links) can push the page to positions 1-3 within weeks.
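The filter described here could be sketched as below, assuming the Search Console performance data has already been exported into rows. The `QueryRow` type, the sample rows, and the thresholds (1,000 impressions, 2% CTR) are made-up illustrations; only the 6-12 position band comes from the text.

```python
from dataclasses import dataclass


@dataclass
class QueryRow:
    """One exported row of query/page performance data (hypothetical shape)."""
    query: str
    page: str
    impressions: int
    clicks: int
    position: float  # average ranking position


def underperformers(rows, min_pos=6.0, max_pos=12.0, min_impressions=1000, max_ctr=0.02):
    """Return rows ranking in the position band with high impressions but low CTR."""
    hits = []
    for r in rows:
        ctr = r.clicks / r.impressions if r.impressions else 0.0
        in_band = min_pos <= r.position <= max_pos
        if in_band and r.impressions >= min_impressions and ctr <= max_ctr:
            hits.append(r)
    # Largest impression counts first: the biggest potential gain from a rewrite.
    return sorted(hits, key=lambda r: r.impressions, reverse=True)


# Made-up sample data for illustration only.
rows = [
    QueryRow("llms.txt example", "/glossary/llms-txt", 4200, 50, 8.3),
    QueryRow("what is geo", "/glossary/geo", 9000, 900, 2.1),
    QueryRow("faqpage schema", "/glossary/faqpage-schema", 1500, 12, 11.4),
    QueryRow("keyword density", "/glossary/keyword-density", 300, 2, 7.0),
]

for row in underperformers(rows):
    print(row.query, row.page, row.impressions, row.position)
```

The thresholds are worth tuning per site, since what counts as "high impressions" depends heavily on overall traffic.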

Authority Links

Ahrefs — Keyword Research

Comprehensive guide to modern keyword research using volume, difficulty, and intent.

Moz — Beginner's Guide to Keyword Research

Foundational framework for keyword research and topic modeling.

Backlinko — Keyword Research

Data-driven methodology including SERP analysis and intent classification.

Related Terms

Keyword

The words or phrases that most accurately represent the content on a page are called keywords.

Keyword Cannibalization

When two or more pages on the same site compete for the same target keyword, causing search engines to split ranking signals across the duplicates instead of consolidating them on one strong page.

Long Tail Keywords

Long-tail keywords are keywords that have lower search volumes than short-tail keywords and usually consist of three or more words.

Short Tail Keywords

Short-tail keywords consist of one or two words, describe a topic in the most general terms, and have the highest search volumes.
