
Topical Drift Analyzer & Topical Map Editor — plans, crawls, maps, and audits

This FAQ covers both tools. The Topical Drift Analyzer finds pages that don't belong — where internal link meaning, on-page content, and Google Search Console queries don't line up. The Topical Map Editor lets you build, edit, and export a complete topical authority blueprint — visually, before you write a single page.

This page explains what you get on each plan, how crawls work, how the map generator and editor work, and when a Topical Drift Audit makes sense.

Quick links
  • Free drift crawl — up to 500 pages, unlimited crawls
  • Topical Map Generator — 3 maps/month (beta, free account)
  • Visual Map Editor — free, no account required
  • Interactive UMAP visualization + clusters + action plan
  • Advanced diagnostics unlocked on paid plans
  • Optional: Topical Drift Audit (done-for-you blueprint)

Topical drift 101

Topical drift is when a page slowly stops representing the topic and intent it used to win for. It often happens after months of edits, new sections, internal linking changes, or "helpful" expansions that pull the page off-center. We measure drift using semantic distance — lower means tighter topical alignment; higher means more drift.

Content decay is typically about freshness, SERP changes, stronger competitors, or better alternatives showing up. Drift is about meaning mismatch: the page content, the internal link context pointing at it, and the queries it actually attracts stop agreeing. Many pages "decay" because they first drift. Our report shows you where meaning is off and what to fix.

Link meaning mismatch is when an internal link's meaning in context doesn't match the destination page's meaning. Most tools stop at anchor text. We analyze the anchor + surrounding text + the container/heading context to estimate what the link is "claiming," then compare that to the destination page embedding. High mismatch can send confusing topical signals.

Actual distances (default) are cosine distances computed from embeddings. We use these for trend tracking over time because the scale is stable across crawls.

Normalized distances rescale a site's distances into a 0–1 range for easier relative comparison within a site and for composite scoring. Normalized mode is helpful when you want percentile-style zones or you're weighting multiple signals (like links + engagement).
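As an illustrative sketch of how the two modes relate (the function names and vectors are hypothetical, not the product's actual code), actual distance is cosine distance over embedding vectors, and normalized mode is a simple min-max rescale of a site's distances:

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; lower means tighter alignment."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def normalize(distances):
    """Min-max rescale one site's distances into the 0-1 range."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]

# Toy 3-dimensional embeddings (real embeddings have hundreds of dimensions)
page_a = [0.1, 0.8, 0.3]
page_b = [0.2, 0.7, 0.4]
d = cosine_distance(page_a, page_b)        # actual distance: stable across crawls
scaled = normalize([0.12, 0.45, 0.90, d])  # normalized: comparable within one site
```

The actual values survive re-crawls unchanged, which is why they suit trend tracking; the normalized values shift whenever the site's best or worst page changes, which is why they suit within-site ranking.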

The Topical Map Editor and the Topical Drift Analyzer are designed to work together across the content lifecycle:
  • Topical Map Editor — plan your architecture before publishing. Build every page, URL, hub, section, FAQ, and internal link in a visual graph.
  • Topical Drift Analyzer — monitor the architecture after publishing. Crawl your live site to find which pages have drifted away from the plan.
Plan → Publish → Monitor → Fix → Re-plan.

What we use, what we store, and what we don't

We use your sitemap URL inventory, cleaned main page content (headings + body text), internal link context (anchor + surrounding text + container/heading context), and (optionally) Google Search Console performance to ground the analysis in real query intent. We generate embeddings from the cleaned text and use UMAP to visualize semantic neighborhoods.

The crawl starts from your sitemap. In most cases that's enough because it reflects your intended indexable inventory. If you have important pages not in the sitemap, add them (recommended) or include them via upload/override in the app. We fetch HTML, extract main content, and generate embeddings for each page.

We need read-only access for the Google Search Console property you want to analyze. We use clicks/impressions/position signals to identify pages whose query reality is drifting away from what their content would suggest they should attract. GSC is optional, but it makes drift detection and prioritization far more accurate.

We store analysis results and derived representations (embeddings, scores, aggregates, and report outputs). We avoid storing raw HTML whenever possible. Some features may store small extracted excerpts needed to render the report — not full pages. You can delete analysis data at any time. We do not sell or share your data.

How the drift signal is computed

Embeddings are vectors that represent semantic meaning. We use them to:
  • Measure page-to-page similarity (cosine similarity) and distance
  • Group pages into semantic clusters (k-means on embeddings)
  • Project to 2D for visualization (UMAP)
  • Compare internal link context vs destination page meaning
  • Calculate distance from your topical centre (centroid embedding)
UMAP is used for visualization only. Similarity, distance, and SDI are computed in the original embedding space.
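A minimal sketch of the "distance from topical centre" idea described above (URLs and vectors are invented for illustration): average all page embeddings into a centroid, then measure each page's cosine distance from it.

```python
import math

def centroid(embeddings):
    """Element-wise mean of all page embeddings = the site's topical centre."""
    n = len(embeddings)
    return [sum(vec[i] for vec in embeddings) / n for i in range(len(embeddings[0]))]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Toy embeddings: two on-topic pages and one that wandered off
pages = {
    "/guides/retriever-training": [0.9, 0.1, 0.2],
    "/guides/retriever-diet":     [0.8, 0.2, 0.1],
    "/blog/crypto-tips":          [0.1, 0.9, 0.8],
}
centre = centroid(list(pages.values()))
drift = {url: cosine_distance(vec, centre) for url, vec in pages.items()}
most_drifted = max(drift, key=drift.get)   # the page farthest from the centre
```

The off-topic page ends up with the largest distance, which is exactly the signal the radial map plots as radius.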

Raw HTML includes nav, footers, widgets, "related posts," and template repetition. Embedding all of that makes vectors represent your template, not your topic. Main-content extraction keeps what matters (headings H1–H6, body text, lists, core containers) and reduces boilerplate. That makes clustering cleaner and drift detection more precise.

SDI is a composite score designed for prioritization. The default weighting is:
60% semantic distance (topical alignment)
30% link penalty (internal linking weakness)
10% engagement penalty (GSC performance signals)

You can customize these weights depending on your workflow (pure content audit, link-first fixes, etc.). SDI is most useful in normalized mode, where scores are comparable within a site.
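The weighting above can be sketched as a plain weighted sum (an illustrative sketch, not the product's implementation; input values are assumed to be in the normalized 0-1 range):

```python
def sdi(semantic_distance, link_penalty, engagement_penalty,
        weights=(0.60, 0.30, 0.10)):
    """Composite Semantic Drift Index: weighted sum of three normalized signals."""
    w_sem, w_link, w_eng = weights
    return w_sem * semantic_distance + w_link * link_penalty + w_eng * engagement_penalty

# Default weighting: 60% distance, 30% links, 10% engagement
score = sdi(0.70, 0.40, 0.20)
# Link-first workflow: shift weight toward internal linking weakness
link_first = sdi(0.70, 0.40, 0.20, weights=(0.40, 0.50, 0.10))
```

Because all three inputs share the 0-1 scale, changing the weights re-orders the priority list without changing what any individual signal means.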

Drift changes after meaningful updates — publishing new pages, editing sections, restructuring internal links — not hour-to-hour. Daily crawls add noise without changing decisions. GSC metrics are delayed and fluctuate day-to-day, so daily drift dashboards often create false alarms. Weekly or monthly crawls create a stable rhythm that's easier to act on. Run a quick before/after crawl during active optimization sprints; monthly is the sweet spot otherwise.

In Actual Distance mode (default):
Core: ≤ 0.300 — excellent topical alignment
Focus: 0.300–0.500 — strong alignment, on topic
Extended: 0.500–0.700 — moderate drift, review recommended
Peripheral: ≥ 0.700 — significant drift, needs attention

These are default heuristics and will be tuned as we calibrate across different site types.

In Normalized mode, zones are percentile-based (0–0.3 = best 30%, etc.) for relative comparison within your site.
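The zone thresholds above map directly to a small classifier (a sketch using the default heuristic cut-offs; the function name is illustrative):

```python
def zone(distance):
    """Map an actual cosine distance to its default zone (heuristic thresholds)."""
    if distance <= 0.300:
        return "Core"        # excellent topical alignment
    if distance <= 0.500:
        return "Focus"       # strong alignment, on topic
    if distance <= 0.700:
        return "Extended"    # moderate drift, review recommended
    return "Peripheral"      # significant drift, needs attention

labels = [zone(d) for d in (0.12, 0.48, 0.65, 0.82)]
```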

Interactive radial map and semantic projection

UMAP (Uniform Manifold Approximation and Projection) reduces high-dimensional embeddings to 2D while preserving semantic neighborhoods. Pages positioned near each other represent similar meaning — which often indicates linking opportunities and shared topical intent. We use those 2D coordinates to visualize relationships and to derive a consistent "around-the-circle" semantic ordering.

Radius (distance from center): Shows drift. Pages farther from your topical center are more off-topic. Toggle between actual distances (stable across crawls) or normalized (0–1 for relative ranking).

Angle (position around the circle): Uses semantic projection by default (derived from UMAP's 2D coordinates), so pages that are similar in meaning sit near each other. You can also switch to group by cluster, distribute evenly, or group by problem type.

Linking opportunities are pairs of pages that are semantically close but don't currently link. The tool surfaces high-confidence candidates using proximity thresholds plus intent and context signals. Adding contextual internal links between close neighbors helps search engines understand your structure and can strengthen topical authority.
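The core of that search can be sketched as a pairwise similarity scan that skips already-linked pairs (URLs, vectors, and the 0.85 threshold are illustrative assumptions; the real tool also weighs intent and context signals):

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def linking_opportunities(embeddings, existing_links, threshold=0.85):
    """Page pairs above the similarity threshold with no internal link either way."""
    linked = {frozenset(pair) for pair in existing_links}
    out = []
    for (url1, v1), (url2, v2) in combinations(embeddings.items(), 2):
        if frozenset((url1, url2)) in linked:
            continue                       # already linked: not an opportunity
        sim = cosine_similarity(v1, v2)
        if sim >= threshold:
            out.append((url1, url2, round(sim, 3)))
    return sorted(out, key=lambda t: -t[2])  # highest-confidence pairs first

pages = {
    "/hip-dysplasia": [0.90, 0.20, 0.10],
    "/joint-health":  [0.85, 0.25, 0.15],
    "/grooming":      [0.10, 0.90, 0.30],
}
links = [("/hip-dysplasia", "/grooming")]
candidates = linking_opportunities(pages, links)
```

Here the two near-identical health pages surface as the only candidate pair; the grooming page is too dissimilar, and the existing link is skipped.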

The visualization is fully customizable. You can:
  • Toggle between actual/normalized distances
  • Choose angle mode (semantic, by cluster, uniform, by problem type)
  • Color by cluster, zone, SDI score, or drift severity
  • Adjust node size (mapped to traffic)
  • Show/hide labels (none / all / drifting only)
  • Filter by cluster
  • Zoom, pan, and click nodes to open pages
  • Export as PNG for sharing

UMAP only affects the layout of the visualization (where points appear on the 2D map); it does not influence drift scores. Drift scoring uses cosine distance computed in the original embedding space. The visual positions are for human readability only.

What you get — and what to do with it

You get:
  • Drift report with severity rankings
  • Semantic clusters and hub structure
  • Interactive radial map with UMAP + distance zones
  • Internal link meaning mismatch list
  • Linking opportunities (similar pages that should link)
  • Zone distribution dashboard
  • CSV exports for all core datasets
  • Prioritized checklist and action plan

The fastest wins are usually:
  • Adjusting link context (anchor + nearby text) to match destination pages
  • Moving links into better semantic sections
  • Re-centering headings/content to match intent
  • Adding internal links between semantically close pages
  • Consolidating overlapping/cannibalizing pages
  • Strengthening hub pages with missing entities or sections
  • Reviewing peripheral pages for consolidation, re-scope, or redirect

With actual distance mode, you can track whether fixes reduce drift:
  • Run a baseline crawl → note distances for problematic pages
  • Implement fixes (content updates, link changes)
  • Run a follow-up crawl (monthly is typical)
  • Compare distances and zone movement over time
Lower distances = tighter topical focus. Moving from 0.650 to 0.480 indicates measurable improvement in topical alignment.
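The before/after comparison in those steps can be sketched as a simple per-URL diff (the function name and data are illustrative, not a product API):

```python
def drift_delta(baseline, followup):
    """Compare per-URL actual distances between two crawls."""
    report = {}
    for url in baseline.keys() & followup.keys():   # only URLs present in both crawls
        before, after = baseline[url], followup[url]
        report[url] = {"before": before, "after": after,
                       "delta": round(after - before, 3)}
    return report

baseline = {"/guides/training": 0.650, "/guides/diet": 0.310}
followup = {"/guides/training": 0.480, "/guides/diet": 0.305}
changes = drift_delta(baseline, followup)
# A negative delta means the page moved closer to the topical centre
```

Because actual distances are stable across crawls, a delta like the training page's 0.650 to 0.480 is directly interpretable as reduced drift.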

If you want help acting on the results, I can review your report and turn it into a concrete implementation plan or handle execution directly. That includes prioritization, rewriting link context, identifying consolidation opportunities, and mapping linking strategies based on semantic neighborhoods. Contact me.

Under the hood

We use the same embedding model and configuration within a crawl so comparisons are valid. If we change the embedding model or version in the future we'll label that so historical comparisons stay honest.

We cluster page embeddings into topic groups using k-means. The goal is to reveal hub structure: which pages form cohesive neighborhoods, which pages are off-cluster, and which pages behave like bridges. The report uses clusters to summarize your content architecture and prioritize fixes by group.

UMAP is fast and does a strong job preserving semantic neighborhoods at scale (hundreds of pages). t-SNE can be slow and can distort global structure; PCA is fast but often collapses meaningful local similarity. UMAP gives a practical balance for "what should link to what?" visualization.

All core datasets are exportable. You can export CSV for:
  • Page drift data (URL, distance, zone, cluster, metrics)
  • Cluster membership and statistics
  • Internal link meaning mismatches with similarity scores
  • UMAP coordinates and semantic positioning
  • Linking opportunities
Available on paid plans only.

The analysis is mostly deterministic, and we can make it fully reproducible.
Embeddings, similarity, and distance calculations are deterministic for the same input and model. Clustering and UMAP can vary slightly unless we fix random seeds. With fixed seeds, repeated crawls on unchanged content produce the same cluster structure and a stable map layout.
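The "fixed seeds" point can be illustrated with a tiny stdlib sketch (hypothetical helper, not the product's code; real clustering libraries such as scikit-learn's KMeans and umap-learn expose a `random_state` parameter for the same purpose):

```python
import random

def seeded_init(n_points, k, seed=42):
    """Pick k initial cluster centres deterministically; same seed -> same choice."""
    rng = random.Random(seed)          # local RNG, no shared global state
    return sorted(rng.sample(range(n_points), k))

run1 = seeded_init(200, 5)
run2 = seeded_init(200, 5)
# run1 == run2: identical initialisation, so clustering converges identically
```

With the seed fixed, re-running the pipeline on unchanged content starts from the same initial state every time, which is what keeps cluster structure and map layout stable between crawls.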

Done-for-you remediation blueprint

A Topical Drift Audit is a done-for-you remediation blueprint. I review your crawl outputs, validate the highest-impact issues, and deliver a prioritized plan: what to fix first, consolidation decisions, internal linking strategy, and a 30/60/90-day execution plan.

Choose an Audit if you want a clear execution sequence with fewer "what do we do next?" decisions. If you prefer to execute in-house and just need visibility + tracking, use the tool and upgrade to unlock affected URLs and per-page diagnostics.

An Audit typically includes:
  • Executive summary + highest-impact wins
  • Prioritized roadmap (what to fix first + why)
  • Consolidation guidance (merge / trim / redirect decisions)
  • Internal linking strategy (hubs, anchors, contextual placement)
  • 30/60/90-day implementation plan

A prior crawl helps, but it's not required. If you already have one, include it (or exports) and your goals. If not, I'll tell you exactly what to run (sitemap + optional GSC) so the Audit is grounded in real intent signals.
Want a done-for-you plan?
Request a Topical Drift Audit and get a prioritized remediation blueprint + 30/60/90-day execution plan.

Map basics — what is a topical map?

A topical map is the full blueprint of content your site needs to own a subject. It defines which pages to build, how they relate to each other, which ones are hubs, which are sections, and how they should link — before you write a single word. The editor turns that blueprint into a machine-readable graph with URLs, hierarchy, entity data, and a full internal linking plan.

The Macro is the broad category your site or section operates in — for example Dog Breeds. The Seed is the specific topic you want to build topical authority around — for example Golden Retriever. The generator uses both to anchor the map and ensure every node is contextually correct for the seed within the macro category.

Standalone page — has its own canonical URL and lives as an independent page in the sitemap.

Section — has no URL. It is a heading and content block inside a parent page. Shown with a dashed border and muted colour in the editor.

Hub — a standalone page that also acts as a navigation centre for its child pages. It contains a summary section for each child and links out to all of them. Shown with a gold ring in the editor.

A node can be both a Section (summarised on its parent) and a Standalone (also has its own page).

Six node types — each with its own colour and icon:
  • Macro — the broadest category (amber)
  • Seed — the primary topic the map is built around (purple)
  • Topic — main pillar pages under the seed (cyan)
  • Subtopic — supporting pages under a topic (green)
  • FAQ — question-based sections or pages (orange)
  • Comparison — versus or comparison content (pink)

Every node carries a 0–1 confidence score from the AI pipeline that generated it. Scores below 0.70 suggest the topic may be too tangential or too thin for its own content. Edit or remove low-confidence nodes before briefing content.

AI-powered map generation from macro + seed

Generating a topical map runs a multi-stage AI pipeline including SERP analysis, entity extraction, and topic classification. An account lets us save your maps so you can access them later, load them into the editor, and track your usage against the fair-use limit of 3 maps per month during the beta. The visual editor itself requires no account.

Seven stages:
  1. Normalize the seed — creates IDs, slugs, and establishes macro/seed context
  2. Discover candidate topics — AI topic expansion + SERP validation, PAA extraction, entity extraction
  3. Canonicalize topics — merges duplicates, separates concepts from wording variants
  4. Classify each topic — Section, FAQ, Standalone, or Hub (allows both Section + Standalone)
  5. Assign hierarchy + generate URLs — parent-child structure, tier levels, canonical URLs
  6. Generate internal linking — parent-child edges, related edges, FAQ attachment edges, anchor text + placement hints
  7. Assemble the final graph — nodes + edges JSON, loads into the editor automatically

Most maps complete in under 60 seconds. Larger seeds with many subtopics or seeds that require deeper SERP analysis may take up to 2–3 minutes. A progress indicator is shown during generation. The finished map opens automatically in the editor when the pipeline completes.

Maps generated elsewhere work too. The editor accepts any JSON in the node/edge format — whether generated by our pipeline, ChatGPT, Claude, or your own system. Format the output as the node/edge JSON, load it into the editor, and use the visual interface to merge duplicates, classify pages, assign URLs, and mark hubs. Both PascalCase and camelCase JSON are supported on import.

During the beta, up to 3 topical maps per month on the free account (fair use). Higher limits will be available on paid plans. The visual editor — loading JSON, editing, and exporting — has no generation limits and requires no account.

Editing, layouts, and navigation

The visual editor is completely free to use with no account required. Load a JSON file, the built-in starter map, or start from scratch — all in the browser. Editing and exporting (PNG, SVG, CSV, JSON) require no server round-trips. Only the AI-powered Generator requires a free account.

Force Graph — physics simulation. Best for exploring the overall structure and spotting clusters. Drag nodes freely.

Horizontal Tree — left-to-right hierarchy. Tier level maps to column position. Best for reviewing URL depth and parent-child relationships.

Radial Tree — concentric rings. Seed is at the centre, Topics on the first ring, Subtopics on the second. Best for client presentations and exports.

Click any node to select it. The Node Properties panel opens on the right. Edit the label, slug, node type, URL role, hub role, tier, parent, canonical URL, confidence score, entities, and aliases. Click Apply Changes to save. The graph updates immediately.

Use View → Hide Section-Only Nodes to remove all nodes with URL Role = Section Only from the graph. The data is preserved — toggle off to restore them. The stats bar shows visible/total when filtering is active. This is useful for reviewing only the pages that will have their own URLs.

  • / — focus the search input
  • F — fit view to all nodes
  • Escape — close the edit panel or clear search
  • Delete / Backspace — delete the selected node (when not typing)

The editor runs entirely in the browser using D3.js and handles maps up to several hundred nodes comfortably. For very large maps (500+ nodes) switch to the Horizontal Tree or Radial Tree layout for better performance. Use Hide Section-Only Nodes to reduce visual clutter when reviewing large maps.

What lives inside a topical map

Every node contains:
  • Unique ID, label, and URL slug
  • Node type (Macro / Seed / Topic / Subtopic / FAQ / Comparison)
  • Tier level (0 = Macro, 1 = Seed, 2 = Topic, 3+ = Subtopic)
  • URL Role (Standalone = has its own page; Section = no URL)
  • Hub Role (Hub Page or not)
  • Canonical URL (auto-generated from parent URL + slug)
  • Parent URL and parent node ID
  • Confidence score (0–1)
  • Entities (named organisations, concepts, tools, people)
  • Aliases (keyword variants and search intent phrases)

Every edge contains:
  • Source node ID and target node ID
  • Edge type (Parent-Child / FAQ Attachment / Related)
  • Weight (0–1, how strong the relationship is)
  • Recommended anchor text
  • Alternate anchor phrases
  • Placement hint (body / hub section / FAQ block)
  • Required vs optional status

The editor accepts any JSON with a Nodes/nodes array and an Edges/edges array. Both PascalCase (pipeline output) and camelCase (editor export) are supported on import. The format is graph-ready and compatible with Neo4j, D3.js, and any graph database or visualisation tool.
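A minimal sketch of the case-tolerant import described above (the loader function is illustrative; only the Nodes/nodes and Edges/edges key convention comes from the format itself):

```python
import json

def load_graph(text):
    """Accept {"Nodes": [...], "Edges": [...]} or {"nodes": [...], "edges": [...]}."""
    raw = json.loads(text)
    nodes = raw.get("nodes", raw.get("Nodes", []))
    edges = raw.get("edges", raw.get("Edges", []))
    if not isinstance(nodes, list) or not isinstance(edges, list):
        raise ValueError("expected node and edge arrays")
    return {"nodes": nodes, "edges": edges}   # normalize to one shape internally

pipeline_output = '{"Nodes": [{"id": "seed"}], "Edges": []}'
editor_export   = '{"nodes": [{"id": "seed"}], "edges": []}'
g1 = load_graph(pipeline_output)   # PascalCase, as emitted by the pipeline
g2 = load_graph(editor_export)     # camelCase, as emitted by the editor
```

Normalizing both casings to one internal shape is what lets pipeline output and editor exports round-trip freely.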

When the Canonical URL field is empty, typing a label automatically generates the URL from the parent's canonical URL and the slug:

Parent: /golden-retriever/health/
Slug: hip-dysplasia
Result: /golden-retriever/health/hip-dysplasia/

Changing URL Role to Section Only clears the URL field. Changing the parent regenerates the URL if the field is empty. Manually typing a URL disables auto-generation for that node.
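The auto-generation rule above can be sketched in a few lines (slugify rules are a simplifying assumption; manual entry winning over auto-generation mirrors the editor's behaviour):

```python
import re

def slugify(label):
    """Lower-case the label and reduce everything non-alphanumeric to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")

def canonical_url(parent_url, label, manual_url=None):
    """Manual URL wins; otherwise join the parent's canonical URL and the slug."""
    if manual_url:                     # manual entry disables auto-generation
        return manual_url
    return parent_url.rstrip("/") + "/" + slugify(label) + "/"

url = canonical_url("/golden-retriever/health/", "Hip Dysplasia")
# -> "/golden-retriever/health/hip-dysplasia/"
```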

Getting your map out of the editor

  • PNG (Viewport) — exports what is currently visible at 4× screen resolution
  • PNG (Full Map) — captures the entire graph at 4800px+ on the longer side
  • SVG (Viewport / Full Map) — vector format with inlined styles, scalable to any size
  • CSV — all node fields for content calendar spreadsheets (built in browser, no server)
  • JSON (Copy / Save) — full graph for pipeline reload, CMS, Neo4j, or D3.js

  1. Generate the map (or load your JSON)
  2. Review and edit in the editor — fix labels, URLs, hub assignments
  3. Switch to Radial Tree layout and click Fit View
  4. Enable Rectangular Nodes + Full Labels
  5. Use Hide Section-Only Nodes to show only pages
  6. Export PNG — Full Map for the client presentation
  7. Export CSV for the content team brief
  8. Export JSON to save and reload later

Use File → Load JSON to load any JSON file from your computer. The editor accepts the same format it exports, so you can save and reload maps indefinitely. If you generated the map via the Generator, it is also saved to your account and accessible from the editor home page.

Use the editor to plan — before you publish. Use the Drift Analyzer to monitor — after you publish. The full lifecycle:

Plan (Map Editor) → Publish (your CMS) → Monitor (Drift Analyzer) → Fix (Drift report + action plan) → Re-plan (Map Editor)

Plans and limits

The free plan includes:
  • Drift Analyzer — unlimited crawls (fair use), up to 500 pages per site
  • Map Generator — up to 3 topical maps per month (beta, free account required)
  • Visual Map Editor — completely free, no account required, no limits
Locked on the Free plan (Drift Analyzer):
  • Affected URL lists for every issue
  • Per-page diagnostics (exact URLs, CTR gaps, missed clicks)
  • Full issue drilldowns and complete export tables
  • Advanced recommendations and remediation blueprints

You can run crawls whenever you need them — no artificial monthly limits. "Fair use" means reasonable usage for legitimate site analysis (not abuse or automated mass-recrawling). Most teams run 1–2 crawls per site per month, but you can crawl more often while iterating on fixes. Drift detection should match your workflow, not an arbitrary calendar.

For sites over 500 pages, you can add extra pages as add-on blocks or crawl in phases (e.g., blog section first, then product/service pages). For agencies managing larger portfolios, higher page limits and multi-site plans are available. See the pricing page for details.

Higher generation limits will be available on paid plans. During the beta, the free account includes 3 maps per month. The visual editor itself — loading, editing, and exporting JSON — has no generation limits and requires no account, so you can continue working on existing maps freely.

You can analyze multiple sites (up to 500 pages each, unlimited crawls with fair use). Plans are based on the number of sites, with an agency tier designed for multi-client workflows.

If you'd rather have the work done for you, a Topical Drift Audit delivers a remediation blueprint: what to fix first, consolidation decisions, internal linking strategy, and a 30/60/90-day plan. Request an Audit.

Drift detection should match your workflow — not arbitrary calendar limits. You may want to run a baseline, implement changes, and validate quickly. Unlimited crawls let you iterate without being penalized, while fair use prevents abuse.

It depends on where you are in your content lifecycle:
  • Starting a new site or topic cluster? Use the Map Generator to build your architecture before publishing. Then use the Visual Editor to review and refine.
  • Have an existing site? Use the Drift Analyzer first to see where your current content has drifted. Then use the Map Editor to re-plan the structure.
  • Both? Generate the map → publish → monitor with the Drift Analyzer → fix → re-plan. That is the full content authority lifecycle.
Ready to get started?
Run a free drift crawl, build a topical map, or request a done-for-you Audit. No credit card required for free tools.