On this page
- Topical Drift Analyzer
  - Basics — what is topical drift?
  - Data & access
  - Analysis & methodology
  - Visualization & UMAP
  - Outputs & fixes
  - Technical capabilities
  - Audits
- Topical Map Editor
  - Map basics
  - Map Generator
  - Visual Editor
  - Nodes, edges & data
  - Export & workflow
- Plans & pricing
  - Pricing & limits
Basics
Topical drift 101
Topical drift is when a page slowly stops representing the topic and
intent it used to win for. It often happens after months of edits,
new sections, internal linking changes, or "helpful" expansions that
pull the page off-center. We measure drift using
semantic distance — lower means tighter topical
alignment; higher means more drift.
Content decay is typically about freshness, SERP changes, stronger
competitors, or better alternatives showing up. Drift is about
meaning mismatch: the page content, the internal link
context pointing at it, and the queries it actually attracts stop
agreeing. Many pages "decay" because they drifted first.
Our report shows you where meaning is off and what to fix.
A link meaning mismatch is when an internal link's meaning in context
doesn't match the destination page's meaning. Most tools stop at
anchor text. We analyze the
anchor + surrounding text + the container/heading context
to estimate what the link is "claiming," then compare that to the
destination page embedding. High mismatch can send confusing topical
signals.
Actual distances (default) are cosine distances
computed from embeddings. We use these for trend tracking over time
because the scale is stable across crawls.
Normalized distances rescale a site's distances into a 0–1 range for easier relative comparison within a site and for composite scoring. Normalized mode is helpful when you want percentile-style zones or you're weighting multiple signals (like links + engagement).
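In code, the two modes look roughly like this. This is a minimal pure-Python sketch; the function names are illustrative, not the product's API:

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; lower = tighter alignment."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def normalize(distances):
    """Rescale one site's distances into a 0-1 range for relative comparison."""
    lo, hi = min(distances), max(distances)
    span = (hi - lo) or 1.0  # guard: all pages equally distant
    return [(d - lo) / span for d in distances]
```

Actual distances stay comparable across crawls; normalized values only rank pages relative to each other within a single crawl.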
The Drift Analyzer and the Map Editor are designed to work together across the content lifecycle:
- Topical Map Editor — plan your architecture before publishing. Build every page, URL, hub, section, FAQ, and internal link in a visual graph.
- Topical Drift Analyzer — monitor the architecture after publishing. Crawl your live site to find which pages have drifted away from the plan.
Plan → Publish → Monitor → Fix → Re-plan.
Data & access
What we use, what we store, and what we don't
We use your sitemap URL inventory, cleaned main page content
(headings + body text), internal link context (anchor + surrounding
text + container/heading context), and (optionally) Google Search
Console performance to ground the analysis in real query intent.
We generate embeddings from the cleaned text and use UMAP to
visualize semantic neighborhoods.
The crawl starts from your sitemap. In most cases that's enough
because it reflects your intended indexable inventory. If you have
important pages not in the sitemap, add them (recommended) or include
them via upload/override in the app. We fetch HTML, extract main
content, and generate embeddings for each page.
Read-only access for the property you want to analyze. We use
clicks/impressions/position signals to identify pages whose
query reality is drifting away from what their
content would suggest they should attract. GSC is optional, but it
makes drift detection and prioritization far more accurate.
We store analysis results and derived representations (embeddings,
scores, aggregates, and report outputs). We avoid storing raw HTML
whenever possible. Some features may store small extracted excerpts
needed to render the report — not full pages. You can delete
analysis data at any time. We do not sell or share your data.
Analysis & methodology
How the drift signal is computed
Embeddings are vectors that represent semantic meaning. We use them to:
- Measure page-to-page similarity (cosine similarity) and distance
- Group pages into semantic clusters (k-means on embeddings)
- Project to 2D for visualization (UMAP)
- Compare internal link context vs destination page meaning
- Calculate distance from your topical centre (centroid embedding)
UMAP is used for visualization only. Similarity, distance, and SDI
are computed in the original embedding space.
Raw HTML includes nav, footers, widgets, "related posts," and template
repetition. Embedding all of that makes vectors represent your
template, not your topic. Main-content extraction keeps what
matters (headings H1–H6, body text, lists, core containers) and
reduces boilerplate. That makes clustering cleaner and drift detection
more precise.
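A crude sketch of the idea, assuming simple regex-based stripping; a production extractor would use a real HTML parser and readability-style scoring rather than regexes:

```python
import re

# Containers that usually hold template/boilerplate rather than topic content.
BOILERPLATE_TAGS = ("nav", "footer", "aside", "script", "style")

def extract_main_text(html):
    """Drop boilerplate containers, keep headings and body text (sketch only)."""
    for tag in BOILERPLATE_TAGS:
        html = re.sub(rf"<{tag}\b.*?</{tag}>", " ", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", html)      # strip remaining tags
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace
```

Embedding the output of this step, rather than raw HTML, is what keeps the vectors about the topic instead of the template.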
SDI is a composite score designed for prioritization. The default
weighting is:
60% semantic distance (topical alignment)
30% link penalty (internal linking weakness)
10% engagement penalty (GSC performance signals)
You can customize these weights depending on your workflow (pure content audit, link-first fixes, etc.). SDI is most useful in normalized mode, where scores are comparable within a site.
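The default weighting can be sketched as follows. The three inputs are assumed to be pre-normalized to 0-1, and the function name is illustrative:

```python
def sdi(semantic_distance, link_penalty, engagement_penalty,
        weights=(0.60, 0.30, 0.10)):
    """Semantic Drift Index: weighted blend of three normalized (0-1) signals.
    Default weights mirror the report: 60% distance, 30% links, 10% engagement."""
    w_sem, w_link, w_eng = weights
    return (w_sem * semantic_distance
            + w_link * link_penalty
            + w_eng * engagement_penalty)
```

Passing a different `weights` tuple is the "customize for your workflow" step, e.g. `(0.30, 0.60, 0.10)` for link-first fixes.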
Drift changes after meaningful updates — publishing new pages,
editing sections, restructuring internal links — not hour-to-hour.
Daily crawls add noise without changing decisions. GSC metrics are
delayed and fluctuate day-to-day, so daily drift dashboards often
create false alarms. Weekly or monthly crawls create a stable rhythm
that's easier to act on. Run a quick before/after crawl during active
optimization sprints; monthly is the sweet spot otherwise.
In Actual Distance mode (default):
• Core: ≤ 0.300 — excellent topical alignment
• Focus: 0.300–0.500 — strong alignment, on topic
• Extended: 0.500–0.700 — moderate drift, review recommended
• Peripheral: ≥ 0.700 — significant drift, needs attention
These are default heuristics and will be tuned as we calibrate across different site types.
In Normalized mode: Zones are percentile-based (0–0.3 = best 30%, etc.) for relative comparison within your site.
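The default zone mapping, as a sketch; how the boundaries at exactly 0.300, 0.500, and 0.700 are assigned is an assumption here:

```python
def zone(distance):
    """Map an actual cosine distance to its default zone (heuristic thresholds)."""
    if distance <= 0.300:
        return "Core"        # excellent topical alignment
    if distance <= 0.500:
        return "Focus"       # strong alignment, on topic
    if distance < 0.700:
        return "Extended"    # moderate drift, review recommended
    return "Peripheral"      # significant drift, needs attention
```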
Visualization & UMAP
Interactive radial map and semantic projection
UMAP (Uniform Manifold Approximation and Projection) reduces
high-dimensional embeddings to 2D while preserving semantic
neighborhoods. Pages positioned near each other represent similar
meaning — which often indicates linking opportunities and shared
topical intent. We use those 2D coordinates to visualize relationships
and to derive a consistent "around-the-circle" semantic ordering.
Radius (distance from center): Shows drift.
Pages farther from your topical center are more off-topic.
Toggle between actual distances (stable across crawls) or normalized
(0–1 for relative ranking).
Angle (position around the circle): Uses semantic projection by default (derived from UMAP's 2D coordinates), so pages that are similar in meaning sit near each other. You can also switch to group by cluster, distribute evenly, or group by problem type.
Linking opportunities are pairs of pages that are semantically close
but don't currently link. The tool surfaces high-confidence candidates
using proximity thresholds plus intent and context signals. Adding
contextual internal links between close neighbors helps search engines
understand your structure and can strengthen topical authority.
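A simplified sketch of the idea, using a proximity threshold only; the product also weighs intent and context signals, which are omitted here:

```python
import math
from itertools import combinations

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def linking_opportunities(embeddings, existing_links, max_distance=0.30):
    """Pairs of pages that are semantically close but not yet linked either way.
    `existing_links` is a set of (source, target) URL tuples."""
    out = []
    for a, b in combinations(sorted(embeddings), 2):
        if (a, b) in existing_links or (b, a) in existing_links:
            continue  # already linked in at least one direction
        d = cosine_distance(embeddings[a], embeddings[b])
        if d <= max_distance:
            out.append((a, b, round(d, 3)))
    return sorted(out, key=lambda t: t[2])  # closest pairs first
```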
Yes. You can:
- Toggle between actual/normalized distances
- Choose angle mode (semantic, by cluster, uniform, by problem type)
- Color by cluster, zone, SDI score, or drift severity
- Adjust node size (mapped to traffic)
- Show/hide labels (none / all / drifting only)
- Filter by cluster
- Zoom, pan, and click nodes to open pages
- Export as PNG for sharing
No. UMAP only affects the layout of the visualization
(where points appear on the 2D map). Drift scoring uses
cosine distance computed in the original embedding
space. The visual positions are for human readability only.
Outputs & fixes
What you get — and what to do with it
You get:
- Drift report with severity rankings
- Semantic clusters and hub structure
- Interactive radial map with UMAP + distance zones
- Internal link meaning mismatch list
- Linking opportunities (similar pages that should link)
- Zone distribution dashboard
- CSV exports for all core datasets
- Prioritized checklist and action plan
The fastest wins are usually:
- Adjusting link context (anchor + nearby text) to match destination pages
- Moving links into better semantic sections
- Re-centering headings/content to match intent
- Adding internal links between semantically close pages
- Consolidating overlapping/cannibalizing pages
- Strengthening hub pages with missing entities or sections
- Reviewing peripheral pages for consolidation, re-scope, or redirect
With actual distance mode, you can track whether fixes
reduce drift:
- Run a baseline crawl → note distances for problematic pages
- Implement fixes (content updates, link changes)
- Run a follow-up crawl (monthly is typical)
- Compare distances and zone movement over time
Lower distances = tighter topical focus. Moving from 0.650 to 0.480
indicates measurable improvement in topical alignment.
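Tracking a fix can be sketched as a per-URL diff between two crawls. The helper below is illustrative, not the product's API:

```python
def drift_delta(baseline, follow_up):
    """Per-URL change in actual distance between two crawls.
    Negative delta = tighter topical focus after the fix."""
    return {url: round(follow_up[url] - baseline[url], 3)
            for url in baseline if url in follow_up}
```

A page that moves from 0.650 at baseline to 0.480 after the fix shows up as a delta of -0.17.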
Yes — I can review your report and turn it into a concrete
implementation plan or handle execution directly. That includes
prioritization, rewriting link context, identifying consolidation
opportunities, and mapping linking strategies based on semantic
neighborhoods.
Contact me.
Technical capabilities
Under the hood
We use the same embedding model and configuration within a crawl so
comparisons are valid. If we change the embedding model or version in
the future we'll label that so historical comparisons stay honest.
We cluster page embeddings into topic groups using k-means.
The goal is to reveal hub structure: which pages form cohesive
neighborhoods, which pages are off-cluster, and which pages behave like
bridges. The report uses clusters to summarize your content architecture
and prioritize fixes by group.
UMAP is fast and does a strong job preserving semantic neighborhoods at
scale (hundreds of pages). t-SNE can be slow and can distort global
structure; PCA is fast but often collapses meaningful local similarity.
UMAP gives a practical balance for "what should link to what?"
visualization.
Yes. You can export CSV for:
- Page drift data (URL, distance, zone, cluster, metrics)
- Cluster membership and statistics
- Internal link meaning mismatches with similarity scores
- UMAP coordinates and semantic positioning
- Linking opportunities
Available on paid plans only.
Mostly yes — and we can make it reproducible.
Embeddings, similarity, and distance calculations are deterministic for the same input and model. Clustering and UMAP can vary slightly unless we fix random seeds. With fixed seeds, repeated crawls on unchanged content produce the same cluster structure and a stable map layout.
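As a configuration sketch, assuming scikit-learn for k-means and the umap-learn package for projection (parameter values here are illustrative, not the product's settings), reproducibility comes down to pinning the seeds:

```python
from sklearn.cluster import KMeans
import umap

# Fixed random_state makes clustering and layout repeatable
# across crawls when the underlying content is unchanged.
kmeans = KMeans(n_clusters=8, random_state=42, n_init=10)
reducer = umap.UMAP(n_components=2, random_state=42)
```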
Audits
Done-for-you remediation blueprint
A Topical Drift Audit is a done-for-you remediation blueprint. I review your crawl outputs, validate the highest-impact issues, and deliver a prioritized plan: what to fix first, consolidation decisions, internal linking strategy, and a 30/60/90-day execution plan.
Choose an Audit if you want a clear execution sequence
with fewer "what do we do next?" decisions. If you prefer to execute
in-house and just need visibility + tracking, use the tool and upgrade
to unlock affected URLs and per-page diagnostics.
Typically includes:
- Executive summary + highest-impact wins
- Prioritized roadmap (what to fix first + why)
- Consolidation guidance (merge / trim / redirect decisions)
- Internal linking strategy (hubs, anchors, contextual placement)
- 30/60/90-day implementation plan
It helps, but it's not required. If you already have a crawl, include
it (or exports) and your goals. If not, I'll tell you exactly what to
run (sitemap + optional GSC) so the Audit is grounded in real intent
signals.
Want a done-for-you plan?
Request a Topical Drift Audit and get a prioritized remediation
blueprint + 30/60/90-day execution plan.
Topical Map Editor
Map basics — what is a topical map?
A topical map is the full blueprint of content your site needs to own
a subject. It defines which pages to build, how they relate to each
other, which ones are hubs, which are sections, and how they should
link — before you write a single word. The editor turns that blueprint
into a machine-readable graph with URLs, hierarchy, entity data, and a
full internal linking plan.
The Macro is the broad category your site or section
operates in — for example Dog Breeds. The Seed
is the specific topic you want to build topical authority around —
for example Golden Retriever. The generator uses both to
anchor the map and ensure every node is contextually correct for the
seed within the macro category.
Standalone page — has its own canonical URL and lives
as an independent page in the sitemap.
Section — has no URL. It is a heading and content block inside a parent page. Shown with a dashed border and muted colour in the editor.
Hub — a standalone page that also acts as a navigation centre for its child pages. It contains a summary section for each child and links out to all of them. Shown with a gold ring in the editor.
A node can be both a Section (summarised on its parent) and a Standalone (also has its own page).
Six node types — each with its own colour and icon:
- Macro — the broadest category (amber)
- Seed — the primary topic the map is built around (purple)
- Topic — main pillar pages under the seed (cyan)
- Subtopic — supporting pages under a topic (green)
- FAQ — question-based sections or pages (orange)
- Comparison — versus or comparison content (pink)
Every node carries a 0–1 confidence score from the AI pipeline that
generated it. Scores below 0.70 suggest the topic may be too
tangential or too thin for its own content. Edit or remove
low-confidence nodes before briefing content.
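Filtering low-confidence nodes before briefing can be as simple as the sketch below; the field names are assumed from the node schema described later on this page:

```python
def low_confidence(nodes, threshold=0.70):
    """Return labels of generated nodes to review before briefing content."""
    return [n["label"] for n in nodes if n["confidence"] < threshold]
```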
Topical Map Generator
AI-powered map generation from macro + seed
Generating a topical map runs a multi-stage AI pipeline including SERP
analysis, entity extraction, and topic classification. An account lets
us save your maps so you can access them later, load them into the
editor, and track your usage against the fair-use limit of 3 maps per
month during the beta. The visual editor itself requires no account.
Seven stages:
- Normalize the seed — creates IDs, slugs, and establishes macro/seed context
- Discover candidate topics — AI topic expansion + SERP validation, PAA extraction, entity extraction
- Canonicalize topics — merges duplicates, separates concepts from wording variants
- Classify each topic — Section, FAQ, Standalone, or Hub (allows both Section + Standalone)
- Assign hierarchy + generate URLs — parent-child structure, tier levels, canonical URLs
- Generate internal linking — parent-child edges, related edges, FAQ attachment edges, anchor text + placement hints
- Assemble the final graph — nodes + edges JSON, loads into the editor automatically
Most maps complete in under 60 seconds. Larger seeds with many
subtopics or seeds that require deeper SERP analysis may take up to
2–3 minutes. A progress indicator is shown during generation. The
finished map opens automatically in the editor when the pipeline
completes.
Yes. The editor accepts any JSON in the node/edge format — whether
generated by our pipeline, ChatGPT, Claude, or your own system. Format
the output as the node/edge JSON, load it into the editor, and use the
visual interface to merge duplicates, classify pages, assign URLs, and
mark hubs. Both PascalCase and camelCase JSON are supported on import.
During the beta, up to 3 topical maps per month on
the free account (fair use). Higher limits will be available on paid
plans. The visual editor — loading JSON, editing, and exporting —
has no generation limits and requires no account.
Visual Map Editor
Editing, layouts, and navigation
The visual editor is completely free to use with no account required.
Load a JSON file, the built-in starter map, or start from scratch —
all in the browser. Editing and exporting (PNG, SVG, CSV, JSON) require
no server round-trips. Only the AI-powered Generator requires
a free account.
Force Graph — physics simulation. Best for exploring
the overall structure and spotting clusters. Drag nodes freely.
Horizontal Tree — left-to-right hierarchy. Tier level maps to column position. Best for reviewing URL depth and parent-child relationships.
Radial Tree — concentric rings. Seed is at the centre, Topics on the first ring, Subtopics on the second. Best for client presentations and exports.
Click any node to select it. The Node Properties panel
opens on the right. Edit the label, slug, node type, URL role, hub role,
tier, parent, canonical URL, confidence score, entities, and aliases.
Click Apply Changes to save. The graph updates
immediately.
Yes. Use View → Hide Section-Only Nodes to remove all nodes
with URL Role = Section Only from the graph. The data is preserved —
toggle off to restore them. The stats bar shows visible/total
when filtering is active. This is useful for reviewing only the pages
that will have their own URLs.
- / — focus the search input
- F — fit view to all nodes
- Escape — close the edit panel or clear search
- Delete / Backspace — delete the selected node (when not typing)
The editor runs entirely in the browser using D3.js and handles maps
up to several hundred nodes comfortably. For very large maps (500+
nodes) switch to the Horizontal Tree or Radial Tree layout for better
performance. Use Hide Section-Only Nodes to reduce visual
clutter when reviewing large maps.
Nodes, edges & data
What lives inside a topical map
Every node contains:
- Unique ID, label, and URL slug
- Node type (Macro / Seed / Topic / Subtopic / FAQ / Comparison)
- Tier level (0 = Macro, 1 = Seed, 2 = Topic, 3+ = Subtopic)
- URL Role (Standalone = has its own page; Section = no URL)
- Hub Role (Hub Page or not)
- Canonical URL (auto-generated from parent URL + slug)
- Parent URL and parent node ID
- Confidence score (0–1)
- Entities (named organisations, concepts, tools, people)
- Aliases (keyword variants and search intent phrases)
Every edge contains:
- Source node ID and target node ID
- Edge type (Parent-Child / FAQ Attachment / Related)
- Weight (0–1, how strong the relationship is)
- Recommended anchor text
- Alternate anchor phrases
- Placement hint (body / hub section / FAQ block)
- Required vs optional status
The editor accepts any JSON with a Nodes/nodes array and an Edges/edges array. Both PascalCase (pipeline output) and camelCase (editor export) are supported on import.
The format is graph-ready and compatible with Neo4j, D3.js, and any
graph database or visualisation tool.
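Import-side case handling can be sketched like this; it is a minimal illustration, not the editor's actual loader:

```python
import json

def normalize_graph(raw):
    """Accept PascalCase (pipeline output) or camelCase (editor export) keys.
    Returns a dict with lowercase 'nodes' and 'edges' lists."""
    data = json.loads(raw) if isinstance(raw, str) else raw
    return {
        "nodes": data.get("nodes", data.get("Nodes", [])),
        "edges": data.get("edges", data.get("Edges", [])),
    }
```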
When the Canonical URL field is empty, typing a label automatically
generates the URL from the parent's canonical URL and the slug:
Parent: /golden-retriever/health/
Slug: hip-dysplasia
Result: /golden-retriever/health/hip-dysplasia/
Changing URL Role to Section Only clears the URL field. Changing the parent regenerates the URL if the field is empty. Manually typing a URL disables auto-generation for that node.
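The auto-generation rules above can be sketched as follows; the slugification rules are illustrative, and the editor's exact slugifier may differ:

```python
import re

def slugify(label):
    """Lowercase, hyphenate, strip non URL-safe characters (illustrative rules)."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")

def canonical_url(parent_url, label, url_role="standalone", manual_url=None):
    """Mirror the editor's behaviour: Section-only nodes get no URL,
    a manually typed URL wins, otherwise parent URL + generated slug."""
    if url_role == "section":
        return None       # Section Only clears the URL field
    if manual_url:
        return manual_url  # manual entry disables auto-generation
    return parent_url.rstrip("/") + "/" + slugify(label) + "/"
```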
Export & workflow
Getting your map out of the editor
- PNG (Viewport) — exports what is currently visible at 4× screen resolution
- PNG (Full Map) — captures the entire graph at 4800px+ on the longer side
- SVG (Viewport / Full Map) — vector format with inlined styles, scalable to any size
- CSV — all node fields for content calendar spreadsheets (built in browser, no server)
- JSON (Copy / Save) — full graph for pipeline reload, CMS, Neo4j, or D3.js
- Generate the map (or load your JSON)
- Review and edit in the editor — fix labels, URLs, hub assignments
- Switch to Radial Tree layout and click Fit View
- Enable Rectangular Nodes + Full Labels
- Use Hide Section-Only Nodes to show only pages
- Export PNG — Full Map for the client presentation
- Export CSV for the content team brief
- Export JSON to save and reload later
Yes. Use File → Load JSON to load any JSON file from your
computer. The editor accepts the same format it exports, so you can
save and reload maps indefinitely. If you generated the map via the
Generator, it is also saved to your account and accessible from the
editor home page.
Use the editor to plan — before you publish. Use the
Drift Analyzer to monitor — after you publish. The full
lifecycle:
Plan (Map Editor) → Publish (your CMS) → Monitor (Drift Analyzer) → Fix (Drift report + action plan) → Re-plan (Map Editor)
Pricing & limits
Plans and limits
The free plan includes:
- Drift Analyzer — unlimited crawls (fair use), up to 500 pages per site
- Map Generator — up to 3 topical maps per month (beta, free account required)
- Visual Map Editor — completely free, no account required, no limits
Locked on the Free plan (Drift Analyzer)
- Affected URL lists for every issue
- Per-page diagnostics (exact URLs, CTR gaps, missed clicks)
- Full issue drilldowns and complete export tables
- Advanced recommendations and remediation blueprints
You can run crawls whenever you need them — no artificial monthly
limits. "Fair use" means reasonable usage for legitimate site analysis
(not abuse or automated mass-recrawling). Most teams run 1–2 crawls
per site per month, but you can crawl more often while iterating on
fixes. Drift detection should match your workflow, not an arbitrary
calendar.
Yes. You can add extra pages as add-on blocks or crawl large sites in
phases (e.g., blog section first, then product/service pages). For
agencies managing larger portfolios, higher page limits and multi-site
plans are available. See the pricing page for details.
Higher generation limits will be available on paid plans. During the
beta, the free account includes 3 maps per month. The visual editor
itself — loading, editing, and exporting JSON — has no generation
limits and requires no account, so you can continue working on
existing maps freely.
Yes. You can crawl multiple sites (up to 500 pages each, unlimited
crawls with fair use). Plans are based on number of sites — with an
agency tier designed for multi-client workflows.
Yes. A Topical Drift Audit is a done-for-you
remediation blueprint: what to fix first, consolidation decisions,
internal linking strategy, and a 30/60/90-day plan.
Request an Audit.
Drift detection should match your workflow — not arbitrary calendar
limits. You may want to run a baseline, implement changes, and validate
quickly. Unlimited crawls let you iterate without being penalized, while
fair use prevents abuse.
It depends on where you are in your content lifecycle:
- Starting a new site or topic cluster? Use the Map Generator to build your architecture before publishing. Then use the Visual Editor to review and refine.
- Have an existing site? Use the Drift Analyzer first to see where your current content has drifted. Then use the Map Editor to re-plan the structure.
- Both? Generate the map → publish → monitor with the Drift Analyzer → fix → re-plan. That is the full content authority lifecycle.
Ready to get started?
Run a free drift crawl, build a topical map, or request a
done-for-you Audit. No credit card required for free tools.