Diagnose

Every issue, ranked by the traffic it affects

CrawlX runs 65+ technical checks across 13 categories, then sorts every finding by estimated traffic impact and groups it by root cause — so the top of the list is always the highest-leverage work, not the longest list of warnings.

Start crawling free · All features

Find an issue… Sort: traffic impact

Critical (5)

Issue | Priority | Pages | Assigned
Self-referencing canonical → http | High | 4,118 | VK, JD
Redirect chains (3+ hops) | High | 1,204 | AM
Noindex on paginated archives | High | 906 | VK, RP
Slow LCP on mobile templates | Medium | 862 | RP
Near-duplicate content (SimHash) | Medium | 538 | JD

Warnings (17)

Fixed (23)

Traffic-impact scoring

The highest-leverage fix is always at the top

Most crawlers sort by raw severity, so a thousand cosmetic warnings bury the one change that matters. CrawlX scores every issue by the traffic it actually affects — weighted with real demand from Search Console and GA4 — so triage starts with the work that moves rankings.

Traffic-impact score · Search Console + GA4 weighting · Sorted by pages affected · Severity as a tiebreaker
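To make the idea concrete, here is a minimal sketch of demand-weighted ranking with severity as a tiebreaker. It is not CrawlX's actual formula, which this page does not publish: the `Issue` class, the 0.6 / 0.4 weights, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical severity ranks; used only to break ties between equal scores.
SEVERITY_RANK = {"high": 3, "medium": 2, "low": 1}


@dataclass
class Issue:
    name: str
    severity: str
    # One (search_console_clicks, ga4_sessions) pair per affected page.
    # All numbers below are invented for the example.
    page_demand: list


def impact_score(issue, w_clicks=0.6, w_sessions=0.4):
    """Assumed formula: weighted demand summed over every affected page."""
    return sum(w_clicks * c + w_sessions * s for c, s in issue.page_demand)


def rank_issues(issues):
    """Sort by impact first; severity only decides between equal scores."""
    return sorted(
        issues,
        key=lambda i: (impact_score(i), SEVERITY_RANK[i.severity]),
        reverse=True,
    )


issues = [
    # A thousand cosmetic warnings on pages almost nobody visits.
    Issue("Missing alt text", "low", [(0, 1)] * 1000),
    # Two issues hitting the same two high-demand pages.
    Issue("Canonical points to http", "high", [(1200, 3000), (800, 1500)]),
    Issue("Redirect chain", "medium", [(1200, 3000), (800, 1500)]),
]
ranked = rank_issues(issues)
for issue in ranked:
    print(f"{issue.name}: {impact_score(issue):.0f}")
```

In this sketch the two issues on high-demand pages outrank the thousand low-traffic warnings, and the equal scores between them are settled by severity.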

Issues by impact

7,839 pages affected

Self-referencing canonical: 4,118 pages

Redirect chains (3+ hops): 1,204 pages

Noindex on archives: 906 pages

Slow LCP (mobile): 862 pages

Near-duplicate content: 538 pages

Broken internal links: 211 pages

Root-cause grouping & full coverage

One template behind 4,000 errors? Fix it once

CrawlX reads the full crawl graph and collapses thousands of identical errors into the single template, directory, or rule that caused them — so you fix the root, not the symptoms. Underneath sits the full check suite: 65+ checks across 13 categories, every page, every crawl.

Group by template · Group by directory · 65+ checks · 13 categories
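As a rough illustration of root-cause grouping, the sketch below collapses per-URL findings into one row per (issue, cause) pair. The finding fields, the issue slugs, and `templates/blog.liquid` are hypothetical; only `templates/product.liquid` and the page counts come from the example on this page.

```python
from collections import Counter


def group_findings(findings, key):
    """Collapse per-URL findings into (issue, cause) groups, largest first."""
    counts = Counter((f["issue"], key(f)) for f in findings)
    return counts.most_common()


# Synthetic findings: one row per affected URL, as a crawl would report them.
findings = [
    {"url": f"/products/{n}", "issue": "canonical-http",
     "template": "templates/product.liquid"}
    for n in range(4118)
] + [
    {"url": f"/blog/page/{n}", "issue": "noindex-archive",
     "template": "templates/blog.liquid"}  # hypothetical template name
    for n in range(906)
]

# Group by the template that rendered the page...
by_template = group_findings(findings, key=lambda f: f["template"])
# ...or by the first path segment of the URL.
by_directory = group_findings(
    findings, key=lambda f: "/" + f["url"].strip("/").split("/")[0]
)

print(len(findings), "rows ->", len(by_template), "root causes")
```

Here 5,024 rows reduce to two root causes, which is the shape of the "fix it once" claim above.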

Check coverage

13 categories · 65+ checks

HTTP status & redirects: 6

Metadata: 7

Headings: 4

Content (incl. SimHash): 6

Images: 5

Links: 6

Indexability & canonicals: 7

Structured data: 5

hreflang: 4

Pagination: 3

Security: 5

Performance / Core Web Vitals: 5

Resources: 5

Core Web Vitals & crawl budget

Real performance data, scored where it matters

CrawlX measures Largest Contentful Paint, Interaction to Next Paint, and Cumulative Layout Shift per template against Google's "good" and "needs improvement" thresholds, using data pulled from PageSpeed Insights. Crawl-budget analysis then separates the low-value URLs wasting bot time from the high-impact pages that deserve it.

LCP · INP · CLS · PageSpeed Insights · Crawl-budget analysis
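For reference, rating a measurement against Google's thresholds can be sketched as below. The "good" boundaries match the ones shown on this page; the "poor" boundaries (4.0 s, 500 ms, 0.25) are Google's published values and do not appear here. The `rate` function is a hypothetical helper, and fetching the numbers from PageSpeed Insights is left out.

```python
# (good_upper_bound, needs_improvement_upper_bound) per metric.
# Units: LCP in seconds, INP in milliseconds, CLS unitless.
THRESHOLDS = {
    "LCP": (2.5, 4.0),
    "INP": (200, 500),
    "CLS": (0.10, 0.25),
}


def rate(metric, value):
    """Return 'good', 'needs improvement', or 'poor' for one measurement."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


# Sample values in the style of a per-template mobile report.
sample = {"LCP": 2.1, "INP": 240, "CLS": 0.06}
ratings = {metric: rate(metric, value) for metric, value in sample.items()}
print(ratings)
```

With these sample values, LCP and CLS rate as good while an INP of 240 ms falls into needs improvement.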

Core Web Vitals

mobile · field data

LCP: 2.1s (good: ≤ 2.5s)

INP: 240ms (good: ≤ 200ms)

CLS: 0.06 (good: ≤ 0.10)

Low-value URLs: 31% of crawl budget

High-impact URLs: 69% of crawl budget
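A crawl-budget split like the one above could be approximated from bot hits in a server log. The sketch below is an assumption-heavy illustration: the `FACET_PARAMS` set, the two-parameter cutoff, and the example URLs are invented, and CrawlX's real classification rules are not described on this page.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical query parameters that usually signal faceted or session URLs.
FACET_PARAMS = {"color", "size", "sort", "sessionid"}


def is_low_value(url, max_params=2):
    """Heuristic: many parameters or any known facet parameter."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return len(params) > max_params or any(p in FACET_PARAMS for p in params)


def budget_split(bot_hits):
    """Share of bot requests spent on low-value versus high-impact URLs."""
    total = len(bot_hits)
    if total == 0:
        return {"low_value": 0.0, "high_impact": 0.0}
    low = sum(1 for url in bot_hits if is_low_value(url))
    return {"low_value": low / total, "high_impact": (total - low) / total}


bot_hits = [
    "https://example.com/products/widget",
    "https://example.com/products?color=red&size=m",
    "https://example.com/blog/seo-guide",
    "https://example.com/pricing",
]
split = budget_split(bot_hits)
print(split)
```

A production version would likely weight each URL by demand rather than counting hits equally, but the split itself is the same idea.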

What triage gives you

From a wall of warnings to a ranked plan

Six capabilities turn a raw crawl into a prioritised list your team can actually work through — fewer rows, more leverage.

Traffic-impact score

Every issue is sorted by the traffic it actually affects — weighted with Search Console and GA4 demand — not by raw severity, so the top of the list is the work that moves rankings.

Root-cause grouping

One broken template behind thousands of errors collapses into a single fix. Group by template, directory, or rule and resolve the cause once instead of chasing symptoms.

Core Web Vitals

LCP, INP, and CLS are measured per template against Google's good / needs improvement thresholds, with data pulled from PageSpeed Insights so field and lab data line up.

Duplicate detection (SimHash)

Near-duplicate and thin pages are clustered with SimHash fingerprints — catching boilerplate and templated content that exact-match checks miss.

Crawl-budget analysis

See where bots burn budget. Low-value URLs — faceted dupes, infinite params, redirect chains — are separated from high-impact pages that deserve the crawl.

Indexability & canonicals

Robots directives, canonical targets, and noindex conflicts resolved per URL, so the pages you want in the index are indexable — and the rest aren't wasting equity.
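The per-URL indexability check described in the last item can be pictured as combining a handful of signals into one verdict plus a list of conflicts. This is a simplified sketch under assumed rules, not CrawlX's resolver: `PageSignals`, its fields, and the conflict messages are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageSignals:
    url: str
    status: int
    robots_txt_allowed: bool
    noindex: bool              # from meta robots or the X-Robots-Tag header
    canonical: Optional[str]   # None when the page has no canonical tag


def resolve(page):
    """Return (indexable, conflicts) for one URL under simplified rules."""
    conflicts = []
    if page.noindex and not page.robots_txt_allowed:
        # A crawler blocked by robots.txt never fetches the page,
        # so it cannot see the noindex directive.
        conflicts.append("noindex cannot be read: robots.txt blocks the crawl")
    if page.noindex and page.canonical and page.canonical != page.url:
        conflicts.append("noindex combined with a canonical to another URL")
    if (page.canonical and page.canonical.startswith("http://")
            and page.url.startswith("https://")):
        conflicts.append("canonical points to the http version")
    indexable = (
        page.status == 200
        and page.robots_txt_allowed
        and not page.noindex
        and page.canonical in (None, page.url)
    )
    return indexable, conflicts


page = PageSignals(
    url="https://example.com/products/widget",
    status=200,
    robots_txt_allowed=True,
    noindex=False,
    canonical="http://example.com/products/widget",
)
indexable, conflicts = resolve(page)
print(indexable, conflicts)
```

The example page mirrors the "canonical → http" issue from the dashboard: it returns 200 and is crawlable, yet its canonical hands the signal to the http URL.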

How triage works

Detect, score, group

Three passes turn 40,000 raw findings into a short, ranked list of root causes.

Detect

Every crawled URL is run through 65+ checks across 13 categories — status, meta, headings, content, images, links, indexability, schema, hreflang, pagination, security, performance, and resources.

65+ checks · 13 categories · Per-URL results

Score

Each finding is scored by estimated traffic impact, weighted with Search Console and GA4 demand and the number of pages affected — so severity becomes a tiebreaker, not the sort order.

Traffic-impact score · SC + GA4 weighting · Pages affected

Group

Identical issues collapse into the one template, directory, or rule that caused them. Near-duplicate pages are clustered with SimHash, so you fix the root cause once and clear thousands of rows.

Root-cause grouping · SimHash clustering · Fix once
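For readers unfamiliar with SimHash, the sketch below shows the general technique: hash overlapping word shingles, let each hash vote on 64 bit positions, and compare fingerprints by Hamming distance. The three-word shingles, the MD5-based feature hash, and the distance cutoff of 3 are illustrative assumptions, not CrawlX's settings.

```python
import hashlib
import re

BITS = 64  # fingerprint width


def _shingles(text, k=3):
    """Lowercased word k-grams; short texts fall back to a single feature."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def simhash(text):
    """64-bit SimHash: every shingle votes +1 or -1 on each bit position."""
    votes = [0] * BITS
    for shingle in _shingles(text):
        digest = hashlib.md5(shingle.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for bit in range(BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    return sum(1 << bit for bit in range(BITS) if votes[bit] > 0)


def hamming(a, b):
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a, text_b, max_distance=3):
    return hamming(simhash(text_a), simhash(text_b)) <= max_distance


page_a = "Free shipping on all orders over fifty dollars this week only"
page_b = "free  SHIPPING on all orders over fifty dollars this week only"
print(hamming(simhash(page_a), simhash(page_b)))
```

Because similar pages share most shingles, their fingerprints tend to differ in only a few bits, which is what lets templated content cluster even when the text is not an exact match.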

From triage to fix

A ranked list is the start — CrawlX opens the PR

Triage tells you exactly what to fix first. For high-impact issues, CrawlX can draft the change and open it as a GitHub pull request you review and merge — it never auto-merges and never pushes to your default branch. AI is bring-your-own-key (Anthropic Claude or OpenAI GPT-4o).

AI-drafted diffs · Opens GitHub PRs · Human approves · Never pushes to main
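The "open a PR, never push to the default branch" rule can be pictured as a guard in front of GitHub's create-pull-request endpoint. The sketch below only builds the request and does not send it. `build_pr_request`, the repository and branch names, and the guard itself are hypothetical; the endpoint path and the `title`, `head`, `base`, and `body` fields are part of GitHub's REST API.

```python
import json
import urllib.request


def build_pr_request(owner, repo, token, head, base, title, body):
    """Prepare (but do not send) a GitHub 'create pull request' call."""
    if head == base:
        # Assumed guard: the fix lives on its own branch, never on the base.
        raise ValueError("fix branch must differ from the default branch")
    payload = {"title": title, "head": head, "base": base, "body": body}
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )


request = build_pr_request(
    owner="acme",                      # placeholder organisation
    repo="storefront",                 # placeholder repository
    token="<personal-access-token>",   # placeholder, never hard-code a token
    head="crawlx/fix-canonical-http",  # hypothetical fix branch
    base="main",
    title="Fix canonical pointing to http in product template",
    body="Affects 4,118 pages. Root cause: templates/product.liquid",
)
print(request.get_method(), request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)`. Merging stays a human decision in the GitHub UI.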

Top fix this crawl

#1 by impact

Self-referencing canonical → http

Root cause: templates/product.liquid — one template behind 4,118 affected pages.

High impact · 4,118 pages · 1 root cause

Draft fix ready · Open PR

Explore more features

The rest of the CrawlX loop

Triage is one step. See how crawling, AI fixes, the toolkit, integrations, and reports fit together.

Stop reading reports.
Start with the fix that matters.

Run a crawl and get a ranked, root-grouped plan in the time it takes to read this.

Start crawling free · See pricing

Cloud crawl engine

AI: fixes, content & schema

Technical-SEO toolkit

Integrations & API

Reports & collaboration

All features
