readthedocs.org: Requests by Google(bot) will be answered with 403 Forbidden by Cloudflare

Details

We noticed a recent drop in traffic and checked the Search Console for pointers. Apparently, Google is no longer allowed to fetch the various sitemaps and content of our RTD repositories. The requests result in 403 errors. The overview says “Couldn’t fetch”, and the “details” show the following (which is not very conclusive, especially since opening the sitemap works on all our machines):

Pages: (screenshot)

Sitemaps: (screenshot)

Only when we add -A "googlebot" or one of its variants to the curl request do we also get a 403 error. This might be unrelated, given the way Googlebot and its corresponding IP addresses work, but I thought I would mention it.

Like so:

$ curl -A "googlebot" --head https://crate.io/docs/crate/reference/en/latest/sitemap.xml
HTTP/2 403
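
To compare how the same URL responds to several user agents in one go, something along these lines might help. It is only a sketch: the function names `status_for_ua` and `compare_user_agents` are made up for illustration, and the user-agent strings passed in are whatever you want to test.

```shell
#!/bin/sh
# Hypothetical helper: print only the HTTP status code that a HEAD request
# with the given user agent receives.
#   $1 = user-agent string, $2 = URL
status_for_ua() {
  curl --silent --head --output /dev/null \
       --user-agent "$1" --write-out '%{http_code}' "$2"
}

# Hypothetical helper: print one "status<TAB>user agent" line per user agent.
#   $1 = URL, remaining arguments = user-agent strings to try
compare_user_agents() {
  url=$1
  shift
  for ua in "$@"; do
    printf '%s\t%s\n' "$(status_for_ua "$ua" "$url")" "$ua"
  done
}
```

For example, `compare_user_agents https://crate.io/docs/crate/reference/en/latest/sitemap.xml "curl" "googlebot"` would print one status line per user agent, which makes it easier to see whether only the Googlebot-like string is rejected.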

Can anyone confirm a similar issue on their sitemaps?

Expected Result

Google fetching our sitemaps as they used to.

Actual Result

Fetching blocked by 403 errors.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (9 by maintainers)

Most upvoted comments

All right, thank you!

Apparently, the curl request only succeeded on a page that yields a 302 redirect. On a regular page, we still get:

$ curl --user-agent "googlebot" --head https://crate.io/docs/crate/reference/en/4.5/sitemap.xml
HTTP/2 403

It appears those are getting blocked by 100201 - Anomaly:Header:User-Agent - Fake Google Bot – which seems right 😃
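
For context on why a spoofed user agent gets flagged: Google documents that a genuine Googlebot request can be verified by a reverse DNS lookup on the client IP, a check that the resulting hostname is under googlebot.com or google.com, and a forward lookup that resolves back to the same IP. How Cloudflare implements its rule internally is not known here. The sketch below only illustrates the documented DNS check, assumes `dig` is installed, handles IPv4 only, and uses hypothetical function names.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the hostname is under googlebot.com or
# google.com. A trailing dot, as returned in PTR answers, is ignored.
is_google_hostname() {
  host_name=${1%.}
  case "$host_name" in
    *.googlebot.com|*.google.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: reverse lookup, suffix check, then forward confirmation.
#   $1 = IPv4 address taken from the access log
is_real_googlebot() {
  ip=$1
  # Reverse DNS: take the first PTR record, if there is one.
  ptr=$(dig +short -x "$ip" | head -n 1)
  [ -n "$ptr" ] || return 1
  is_google_hostname "$ptr" || return 1
  # Forward DNS: the hostname must resolve back to the original address.
  dig +short "${ptr%.}" A | grep -qxF "$ip"
}
```

Running `is_real_googlebot 192.0.2.10 && echo genuine || echo spoofed` against an address from the server logs (the one shown is a placeholder) would indicate whether a blocked request really came from Google, which is the case worth raising with CF support.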

Yeah, it definitely feels right now that we know the origin of the “problem”: it’s actually a feature, and it led us to wrong conclusions while trying to reproduce the issue. So, we will review our Nginx settings at crate.io together with @WalBeh and report back here afterwards. Thanks again for taking the time!

I think the other useful next step here would be asking CF support about the actual Googlebot requests getting blocked – that is a real issue. I’ll follow up with them.

Right. Thank you so much!