
Google scrapes for indexing and for AI, right? I wonder if they will eventually say: OK, you can have me or not; if you don't want to help train my AI, you won't show up in my search results either. That's a tough deal, but it is sort of self-consistent.


Very few people seem to be complaining that Google crashes their sites. Google also publishes its crawler IP ranges, but you really don't need to rate-limit Google; they know how to back off and not overload sites.
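For reference, Google publishes the Googlebot ranges as JSON, so you can verify whether a request really comes from their crawler. A minimal sketch in Python, assuming the "prefixes" / ipv4Prefix / ipv6Prefix shape the file currently uses (re-check the schema before relying on it):

    # Check whether an IP falls inside Google's published crawler ranges.
    # URL and JSON shape are assumptions based on what Google publishes today.
    import ipaddress
    import json
    import urllib.request

    GOOGLEBOT_RANGES = "https://developers.google.com/search/apis/ipranges/googlebot.json"

    def load_networks(url=GOOGLEBOT_RANGES):
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        nets = []
        for entry in data.get("prefixes", []):
            prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
            if prefix:
                nets.append(ipaddress.ip_network(prefix))
        return nets

    def is_googlebot(ip, networks):
        addr = ipaddress.ip_address(ip)
        # Mixed v4/v6 comparisons simply return False here.
        return any(addr in net for net in networks)

    networks = load_networks()
    print(is_googlebot("66.249.66.1", networks))  # likely True: a known Googlebot range

Google also suggests reverse-DNS verification as an alternative, but the published ranges are easier to use in a firewall or rate-limiter.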


In theory. In practice, I've had to limit Google on two large sites at work; I currently have them limited to 10 requests per second for non-cached requests.
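If you do have to cap a crawler yourself, a per-client token bucket is usually enough. This is a hypothetical sketch, not the commenter's actual setup; the 10 req/s figure is taken from the comment above and everything else is illustrative:

    # Hypothetical token-bucket limiter for non-cached crawler requests.
    import time

    class TokenBucket:
        def __init__(self, rate=10.0, burst=20):
            self.rate = rate           # tokens refilled per second
            self.capacity = burst      # maximum burst size
            self.tokens = float(burst)
            self.last = time.monotonic()

        def allow(self):
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False  # caller should respond with HTTP 429

    bucket = TokenBucket()
    if not bucket.allow():
        pass  # e.g. return 429 with a Retry-After header

Returning 429 (rather than dropping connections) matters here: well-behaved crawlers, Googlebot included, generally slow down when they see it.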


Curious whether the content on those sites has high value to Google, such as data that is new or unavailable elsewhere, or whether they're just standard sites and you've been unlucky?

I have had odd bot behavior from some major crawlers, but never from Google. I wonder if there is a correlation with the usefulness of the content, or if certain sites get stuck in a software bug (or some other strange behavior).


Google does value the sites; they have data unavailable elsewhere. At some point we got an automated message saying the site had too many pages and would no longer be indexed, then a human message saying that was a mistake and our site was an exception to that rule.

But as with any contact with these large companies, our contact eventually disappeared.


"Embrace, Extend, Extinguish" Google's mantra. And yes, I know about Microsoft's history with that phrase ;) But Google has done this with email, browsers (Google has web apps that run fine on Firefox but request you use Chrome), Linux (Android), and I'm sure there's others I am forgetting about.

So yeah, I too could see them doing this.



