HTTP 410 vs 404 for SEO and Crawl Budget


Pull any large site's crawl logs and you will find a meaningful slice of bot activity spent on URLs that died months ago. Googlebot, Bingbot, and a growing crowd of AI crawlers keep returning to pages you deleted, rechecking whether they are still gone. The lever that controls how often they bother is the HTTP status code you return when they ask, and the choice between 404 and 410 matters far less for how fast a page leaves the index than for how much crawl it keeps costing you afterward. We pull this thread on almost every technical audit, because recovered crawl budget is some of the cheapest performance in SEO.

The reason it gets ignored is that the topic is usually told as a myth: 410 deindexes in days while 404 lingers for months, so you should rush to convert every dead page. The deindex-speed half of that is overstated. The crawl-frequency half is real, measurable, and the actual reason 410 earns a place on a large site.

What 410 and 404 actually tell a crawler

A 404 means the server could not find the resource, and it says nothing about whether that is permanent. The page could be a typo, a deploy glitch, or something that will be back tomorrow. A 410 means the resource is gone deliberately and permanently, with no forwarding address. That difference in certainty is what the crawler acts on, because the two codes describe different levels of commitment on your part.

The cleanest way to picture it comes from Kevin Indig, who framed it while reviewing a controlled experiment on the question. As he put it, 410s are treated like 301s and 404s like 302s, which makes sense because a search engine expects that a 404 might turn back into a 200 at any moment. A 404 looks provisional, so the crawler keeps the URL on a return schedule to check whether you fixed it. A 410 looks final, so the crawler stops expecting a comeback and winds its interest down faster.

The deindex gap is small, the recrawl gap is large

Start with what Google actually says, because it is more modest than the blog posts. John Mueller has stated that a 410 will sometimes fall out a little bit faster than a 404, but usually on the order of a couple of days, and that over the mid to long term the two are treated the same, with both URLs dropped from the index. If your only goal is to get a page deindexed, the choice barely moves the needle, because both get there.

The measurable difference shows up in crawl frequency, not deindex speed. The agency Reboot ran a controlled experiment that split a sample of indexed URLs with no external links and matched internal links, set half to 404 and half to 410, then watched Googlebot for more than three months across 350,000 rows of log and Search Console data. Their result was clean. 404 URLs were crawled 49.6% more often than 410 URLs, a difference confirmed as statistically significant to 95% confidence.

Read that as a crawl-budget figure, because that is what it is. If 404 URLs are crawled 49.6% more often than 410 URLs, then a page served as 410 is rechecked at about two-thirds the rate of the same page served as 404 (1 / 1.496 ≈ 0.67), or roughly a third less often. On a handful of dead pages the difference rounds to nothing. On a large site retiring thousands of URLs, cutting the recrawl rate by a third across all of them frees a real share of budget the crawler was spending to confirm pages you already knew were dead, and that budget flows back to your live, revenue-earning pages.
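The conversion from "crawled X% more often" to "share of recrawls saved" is easy to get wrong, so here is the arithmetic as a small sketch. The function name is ours, and it assumes the experiment's 49.6% figure carries over to your site, which it may not.

```javascript
// Convert "404s were crawled X% more often than 410s" into the relative
// drop in recrawls you would see by serving 410 instead of 404.
function recrawlReduction(percentMoreOn404) {
  const ratio404to410 = 1 + percentMoreOn404 / 100; // 1.496 for 49.6%
  return 1 - 1 / ratio404to410; // share of recrawls saved
}

const saved = recrawlReduction(49.6);
console.log((saved * 100).toFixed(1) + "% fewer recrawls"); // prints "33.2% fewer recrawls"
```

A 100% figure would be needed for a true halving, which is why the reported gap is better read as a cut of about a third.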

This is not only a Google problem

Most coverage of this topic treats it as a Google-only question, which misses where a lot of the value now sits. Bing removes URLs that return either a 404 or a 410, and its guidance treats 410 as the clearer, stronger signal for getting a deleted page out of the index rather than leaving it on a recheck loop. Bingbot has its own crawl budget to spend or waste on your dead URLs, exactly like Googlebot does, so the same recrawl logic applies on Microsoft's side of the web.

There is a faster path on Bing specifically. Bing detects a 404 or 410 change much sooner when you submit the affected URL through IndexNow, so the clean pattern for a retired page is to return the 410 and then push that URL through IndexNow in the same step. We cover the protocol in detail in our IndexNow implementation guide, and dead-URL cleanup is one of its most underused jobs.
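As a rough illustration of the "return the 410, then push the URL" step, the sketch below builds the JSON body for an IndexNow batch submission. The host, the key, and the helper name `buildIndexNowBody` are placeholders, and it assumes your key file is hosted at the site root.

```javascript
// Build the JSON body for an IndexNow batch submission.
// host and key are placeholders; the key file is assumed to live at the site root.
function buildIndexNowBody(host, key, paths) {
  return {
    host,
    key,
    keyLocation: `https://${host}/${key}.txt`,
    urlList: paths.map((p) => new URL(p, `https://${host}`).href),
  };
}

const body = buildIndexNowBody("www.example.com", "hypothetical-key-123", [
  "/old-sku-123",
  "/retired/guide",
]);

// Sending it is a single POST. It is left commented out so the sketch has no side effects.
// await fetch("https://api.indexnow.org/indexnow", {
//   method: "POST",
//   headers: { "Content-Type": "application/json; charset=utf-8" },
//   body: JSON.stringify(body),
// });
console.log(JSON.stringify(body, null, 2));
```

Submit only after the URLs already return 410, so the engine sees the final status when it rechecks.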

The AI layer inherits all of this. Copilot, DuckDuckGo, and the web search behind ChatGPT draw on the Bing index, so a retired page that lingers in Bing can keep surfacing inside AI answers long after you killed it. Returning a 410 and announcing it through IndexNow is how you get a stale or wrong page out of the indexes that feed those assistants, not just out of the blue links. For a brand that cares about how it shows up in AI answers, that is reason enough to handle dead URLs deliberately rather than letting them 404 by default.

When to use 301, 410, 404, or 503

The decision is never 410 versus 404 in isolation. It is a four-way choice, and matching the right code to the situation is the actual skill.

Status	Use when	Why
`301`	There is a genuine equivalent page	Redirect and keep the link equity rather than discarding it with a 404 or 410
`410`	Permanently gone, no equivalent, and you are certain	Retired content, discontinued SKUs, hacked spam URLs. Cuts the recrawl rate by roughly a third
`404`	Gone, but you are unsure or it may return	Engines are fine with 404 and keep a return schedule in case it comes back
`503`	Temporarily unavailable, not deleted	Maintenance or an outage, tells the crawler to come back rather than deindex

The edge cases are where teams get it wrong. A product that is out of stock but coming back should stay a live 200 with stock messaging, not a 410, because you want to keep its ranking for when inventory returns. A product line that is genuinely discontinued is a clean 410. Hacked URLs that an attacker injected should be 410'd and pushed through IndexNow so they leave every index as fast as possible. In a migration, an old URL with a real equivalent is a 301, while an old URL with nothing to map to is a 410. The one option that is always wrong is the soft 404, a dead page that returns a 200 with a "sorry, not found" message, because the engine cannot tell it is dead, keeps it eligible, and wastes crawl on it forever. If you fix nothing else from this article, find your soft 404s first.
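The table and edge cases above can be condensed into a small decision function. This is only a sketch: the field names (`temporarilyDown`, `outOfStockButReturning`, `equivalentUrl`, `permanentlyGone`) are illustrative, and the order of the checks is our reading of the rules rather than a standard.

```javascript
// Map what you know about a URL to a status code, following the table above.
// Field names are illustrative; adapt them to your own inventory data.
function chooseStatus(page) {
  if (page.temporarilyDown) return { status: 503 };        // outage or maintenance
  if (page.outOfStockButReturning) return { status: 200 }; // keep the page live
  if (page.equivalentUrl) return { status: 301, location: page.equivalentUrl };
  if (page.permanentlyGone) return { status: 410 };        // certain and final
  return { status: 404 };                                  // unsure, may return
}

console.log(chooseStatus({ permanentlyGone: true, equivalentUrl: "/new-guide" }));
console.log(chooseStatus({ permanentlyGone: true }));
console.log(chooseStatus({}));
```

Checking for an equivalent before checking for permanence means a retired page with a real successor is redirected rather than dropped.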

Deploying 410 at scale

The reason 410 gets skipped is that returning it in bulk takes more effort than a 404, which most platforms serve by default. On Apache, a pattern of dead URLs can be sent to 410 with a rewrite rule or a Redirect gone directive.

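Redirect gone /retired-product-page
# The [G] flag needs mod_rewrite switched on; in .htaccess context the pattern has no leading slash
RewriteEngine On
RewriteRule ^category/discontinued/ - [G]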

On nginx, you return it directly in a location block, which is the most direct of the three because no rewrite engine is involved.

location /retired/ { return 410; }

On a headless or edge setup, the cleanest approach for a list of dead URLs is a CDN worker that checks the request path against a gone-list and returns 410 before the request ever reaches your origin. It is also the easiest place to fire the IndexNow ping at the same time, so the retirement and the notification happen in one place.

// Cloudflare Worker
const GONE = new Set(["/old-sku-123", "/retired/guide"]);
export default {
  fetch(req) {
    const path = new URL(req.url).pathname;
    if (GONE.has(path)) return new Response("Gone", { status: 410 });
    return fetch(req);
  }
}
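A gone-list of exact paths gets unwieldy once whole sections are retired. One possible extension, sketched below, matches both exact paths and path prefixes. The prefix list, the helper name `isGone`, and the choice to ignore trailing slashes are assumptions, not something the Worker above requires.

```javascript
// Exact paths and whole sections that should answer 410.
// Both lists are illustrative; in practice they might be loaded from a KV store or a file.
const GONE_EXACT = new Set(["/old-sku-123", "/retired/guide"]);
const GONE_PREFIXES = ["/category/discontinued/"];

function isGone(pathname) {
  // Assumption: "/old-sku-123/" and "/old-sku-123" are the same page.
  const normalized = pathname.length > 1 ? pathname.replace(/\/+$/, "") : pathname;
  if (GONE_EXACT.has(normalized)) return true;
  return GONE_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}

console.log(isGone("/old-sku-123/"));                 // true
console.log(isGone("/category/discontinued/widget")); // true
console.log(isGone("/live-product"));                 // false
```

Inside the Worker, `GONE.has(path)` would simply become `isGone(path)`.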

Two platform caveats are worth knowing before you promise a client this. WordPress does not emit 410 on its own, but the Redirection plugin and most redirect managers let you set a 410 per URL or pattern, and you can do it in .htaccess as above. Shopify is the hard one: it does not let you return a native 410, it serves 404 for removed products and pages, and it has no clean setting to change that. On Shopify you are mostly left with 404, which is genuinely fine given how small the deindex-speed difference actually is. Spend the 410 effort on the platforms where it is cheap to do at scale.

Finding the candidates across Google and Bing

You do not guess which URLs to upgrade; you pull them from the webmaster tools. In Google Search Console, the Page indexing report's "Not found (404)" group is your candidate pool: every URL Google currently knows is missing. Work through it for patterns that are permanently dead, such as retired sections, discontinued product paths, or an old URL structure left over from a migration, and route those patterns to 410 in bulk rather than one URL at a time.
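Spotting patterns in an exported list of not-found URLs is easier with a quick grouping pass. The sketch below counts URLs by their leading path segments. The sample URLs and the helper name `groupByPrefix` are made up for illustration.

```javascript
// Count dead URLs by their leading path segments to surface bulk 410 candidates.
function groupByPrefix(urls, depth = 1) {
  const counts = new Map();
  for (const url of urls) {
    const segments = new URL(url).pathname.split("/").filter(Boolean);
    const prefix = "/" + segments.slice(0, depth).join("/");
    counts.set(prefix, (counts.get(prefix) ?? 0) + 1);
  }
  // Largest groups first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical export from a "Not found (404)" report.
const notFound = [
  "https://www.example.com/discontinued/widget-a",
  "https://www.example.com/discontinued/widget-b",
  "https://www.example.com/discontinued/widget-c",
  "https://www.example.com/blog/typo-url",
];
const groups = groupByPrefix(notFound);
console.log(groups);
```

A prefix with hundreds of entries is a candidate for one bulk rule, while a prefix with a single entry is more likely a typo or a stray link that can stay a 404.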

Do the same on the Bing side, because the two engines do not always know about the same dead URLs. Bing Webmaster Tools surfaces crawl errors and offers a URL removal tool for anything you need gone immediately, and pairing that with IndexNow submissions keeps Bing's picture current. While you are in either tool, open the soft 404 reports and fix those first, because a soft 404 is actively lying to the crawler and costing you more than any 404-versus-410 choice ever will.
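If you want a first pass at soft 404s before opening the reports, a crude text heuristic can flag likely candidates from your own crawl. The phrase list and the helper name `looksLikeSoft404` are assumptions. The check will produce false positives and misses, so treat its output as a list to review by hand.

```javascript
// Phrases that often appear on "not found" templates. Illustrative only.
const NOT_FOUND_PHRASES = ["page not found", "sorry, not found", "no longer available"];

// Flag responses that say "not found" in the body while returning 200.
function looksLikeSoft404(status, html) {
  if (status !== 200) return false; // a real 404 or 410 is honest, not soft
  const text = html.replace(/<[^>]*>/g, " ").toLowerCase();
  return NOT_FOUND_PHRASES.some((phrase) => text.includes(phrase));
}

console.log(looksLikeSoft404(200, "<h1>Sorry, not found</h1>")); // true
console.log(looksLikeSoft404(404, "<h1>Page not found</h1>"));   // false
console.log(looksLikeSoft404(200, "<h1>Blue widget</h1>"));      // false
```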

What this looks like on a real migration

The clearest payoff we see is during a platform migration, when a client moves to a new URL structure and leaves thousands of old URLs behind. Left alone, those URLs return 404, and for months the crawl stats show Googlebot and Bingbot spending a large share of their visits re-checking dead pages instead of discovering the new ones. The site effectively pays a crawl tax on its own history.
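To put a number on that crawl tax, you can measure what share of bot requests in your access logs land on dead URLs. The sketch below assumes combined log format and identifies bots by user-agent string alone. User agents can be spoofed, so a serious audit would also verify crawler IPs. The sample lines are invented.

```javascript
// Assumes combined log format: ... "METHOD /path PROTO" STATUS BYTES "REFERER" "USER-AGENT"
const BOT_PATTERN = /googlebot|bingbot/i;
const LINE_PATTERN = /"[A-Z]+ (\S+) [^"]*" (\d{3}) .*"([^"]*)"$/;

// Share of bot requests that hit a 404 or 410.
function deadCrawlShare(lines) {
  let botHits = 0;
  let deadHits = 0;
  for (const line of lines) {
    const m = LINE_PATTERN.exec(line);
    if (!m || !BOT_PATTERN.test(m[3])) continue; // skip unparsable lines and non-bots
    botHits++;
    if (m[2] === "404" || m[2] === "410") deadHits++;
  }
  return botHits === 0 ? 0 : deadHits / botHits;
}

// Invented sample lines.
const sampleLog = [
  '66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /old-sku-123 HTTP/1.1" 404 153 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
  '66.249.66.1 - - [10/Oct/2024:13:55:40 +0000] "GET /products/live HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
  '157.55.39.1 - - [10/Oct/2024:13:56:02 +0000] "GET /retired/guide HTTP/1.1" 410 0 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
  '203.0.113.9 - - [10/Oct/2024:13:57:11 +0000] "GET /missing HTTP/1.1" 404 153 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
];
const share = deadCrawlShare(sampleLog);
console.log((share * 100).toFixed(1) + "% of bot requests hit dead URLs");
```

Running the same calculation before and after a cleanup gives a simple way to see whether the share is actually falling.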

The fix we run is methodical. We map the old URLs into three buckets: the ones with a genuine new equivalent get a 301, the permanently dead patterns get a 410 served in bulk from the edge, and anything genuinely uncertain stays a 404. Then we push the 410 batch through IndexNow so Bing and the AI engines drop them quickly, and we watch the recrawl rate on those patterns fall in the crawl stats over the following weeks. The budget that was going to dead history shifts back to the live catalog, and the coverage reports stop being cluttered with URLs nobody will ever visit again. None of it is glamorous, but on a big migration it is some of the highest-leverage cleanup available.

The 404-versus-410 decision is tiny on any single page and compounds across thousands of them. Return the honest code, tell the engines once through IndexNow, and the crawlers stop spending your budget confirming what you already know.