
Crawl budget

Last updated June 29, 2026 · by Tal Gerafi

Crawl budget is the number of pages a search engine like Google will crawl on your site within a given window. It is set by how fast your server responds and how much the crawler wants your URLs.

Put another way, crawl budget is how much attention a search engine spends on your site: how many URLs it fetches before it moves on. Google sets it from two things together: how fast and reliably your server answers (crawl capacity), and how much it actually wants your pages (crawl demand). Small sites rarely hit a limit. Google's own guidance on managing crawl budget is aimed at much larger sites, roughly those with 10,000 or more pages that change daily, or around a million pages or more.

How does crawl budget work?

Google's crawler, Googlebot, asks your server for pages on a schedule it decides. Two forces shape that schedule. The first is crawl capacity: if your server responds quickly and returns clean 200 responses, Googlebot crawls more; if responses are slow or return server errors (5xx), it backs off to avoid overloading you. The second is crawl demand: popular, frequently updated, well-linked URLs get crawled often, while stale or low-value ones get visited rarely or not at all.
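To see which signals your server is actually sending, you can count Googlebot fetches by status class in your access logs. The sketch below is a minimal illustration, assuming logs in the common "combined" format; the function name `googlebot_status_summary` and the sample lines are made up for this example, and matching on the user-agent string alone can be spoofed, so treat the numbers as a rough signal.

```python
import re
from collections import Counter

# Pulls the request line, status code, and user agent out of a
# combined-format access log line (an assumption about your log format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_status_summary(lines):
    """Count fetches whose user agent mentions Googlebot, grouped by status class."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if "Googlebot" not in match.group("agent"):
            continue  # user-agent check only; it does not verify the source IP
        counts[match.group("status")[0] + "xx"] += 1
    return counts


# Hypothetical sample lines for illustration.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [29/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [29/Jun/2026:10:00:05 +0000] "GET /docs HTTP/1.1" 200 8100 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [29/Jun/2026:10:00:09 +0000] "GET /old-page HTTP/1.1" 301 0 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [29/Jun/2026:10:00:12 +0000] "GET /search?q=a HTTP/1.1" 503 0 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [29/Jun/2026:10:00:15 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

summary = googlebot_status_summary(sample_log)
```

A rising share of 5xx responses in a summary like this is the kind of signal that makes the crawler back off; a high share of 3xx suggests fetches being spent on redirects.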

The budget gets wasted when the crawler spends fetches on URLs that should not be crawled: endless filter combinations, session IDs, soft 404s, redirect chains, or duplicate pages that need a canonical URL to consolidate. Every wasted fetch is one your important page did not get. AI crawlers such as GPTBot or PerplexityBot also fetch pages one request at a time, so the same waste can cost you there too.
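One way to find this kind of waste is to scan a list of crawled URLs for query parameters that signal session IDs or stacked filters. The following is a rough sketch, not a complete audit: the parameter names in `SESSION_PARAMS` and `FILTER_PARAMS` and the `max_filters` threshold are hypothetical placeholders to swap for whatever your site actually emits.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names; replace with the ones your site really uses.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}
FILTER_PARAMS = {"color", "size", "sort", "tag"}


def crawl_waste_reason(url, max_filters=1):
    """Return a short reason if the URL looks like crawl waste, else None."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    names = {name.lower() for name in query}
    if names & SESSION_PARAMS:
        return "session id"
    # More than max_filters filter parameters on one URL counts as a combination.
    if len(names & FILTER_PARAMS) > max_filters:
        return "filter combination"
    return None


# Hypothetical URLs for illustration.
crawled_urls = [
    "https://example.com/pricing",
    "https://example.com/docs?tag=api",
    "https://example.com/docs?tag=api&sort=new&color=red",
    "https://example.com/pricing?sid=abc123",
]

flagged = {url: crawl_waste_reason(url) for url in crawled_urls}
```

Here the last two URLs would be flagged, one as a filter combination and one as a session ID, while the plain page and the single-filter page pass. Soft 404s and duplicates need content-level checks that a URL pattern alone cannot catch.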

Why does crawl budget matter for B2B sites?

Most B2B and SaaS sites are too small to ever strain their budget — a few hundred clean pages crawl fine. Google's own documentation states that crawl budget is not something most sites need to worry about, and Search Advocate John Mueller has echoed this for smaller sites.

It starts to matter when a site grows large or messy: faceted product/doc filters, auto-generated tag archives, or index bloat from thin pages. The classic trigger is a migration. When you move a WordPress site to Next.js, old plugin URLs, paginated archives, and duplicate paths can bloat the crawlable set overnight. Cleaning that up (direct one-hop redirects, accurate sitemaps, and keeping junk out of the crawl) points the crawler at the pages that earn rankings instead of the ones that do not. One caveat: a noindex tag keeps a page out of the index but does not save the fetch, because the crawler has to request the page to see the tag. To stop the fetch itself, block the URL pattern in robots.txt.
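Redirect chains are a common side effect of migrations, because each round of URL changes tends to stack on the last. A minimal sketch of flattening a redirect map so every old path points straight at its final destination is shown below; the helper `flatten_redirects` and the example paths are illustrative assumptions, not part of WordPress or Next.js.

```python
def flatten_redirects(redirects):
    """Point every old path straight at its final destination.

    Raises ValueError if the map contains a redirect loop.
    """
    flattened = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until the target is no longer itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flattened[source] = target
    return flattened


# Hypothetical legacy paths for illustration.
legacy_redirects = {
    "/?p=123": "/blog/old-slug/",
    "/blog/old-slug/": "/blog/new-slug",
    "/category/news/page/2/": "/blog",
}

flat = flatten_redirects(legacy_redirects)
```

After flattening, "/?p=123" maps directly to "/blog/new-slug" instead of hopping through "/blog/old-slug/", so a crawler spends one fetch on the redirect rather than two. The resulting map could then feed whatever redirect configuration your new stack uses.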

Go deeper

How Do You Migrate WordPress to Next.js Without Losing SEO? →