
benjojo posted 15 May 2023 16:02 +0000

The volume of web crawling is getting worse and worse. I've previously suspected that Bing's huge uptick is actually just them gathering data for LLMs/OpenAI, but now "amazonbot" is getting in on the action too!

Unsure if they are secretly doing LLM training on the site, but they are ramping up in speed!

They are using the user agent:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)

Looking at the support page, these are "real" amazonbots crawling me, but once again the speed is getting kind of offensive for no obvious gain.

The cherry on top is:

AmazonBot does not support the crawl-delay directive in robots.txt and robots meta tags on HTML pages such as “nofollow” and "noindex".

Well cool! bgp.tools no longer supports AmazonBot then!
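
For anyone wanting to do the same: a minimal sketch of blocking by user agent, assuming you can hook into request handling and read the User-Agent header. This is not necessarily how bgp.tools does it, and the names `is_amazonbot` and `status_for` are made up for illustration.

```python
import re

# Case-insensitive match on the "Amazonbot" token from the user agent quoted above.
AMAZONBOT_RE = re.compile(r"amazonbot", re.IGNORECASE)


def is_amazonbot(user_agent):
    """Return True if the User-Agent header mentions Amazonbot."""
    return bool(user_agent) and AMAZONBOT_RE.search(user_agent) is not None


def status_for(user_agent):
    """Pick an HTTP status: 403 for Amazonbot, 200 for everyone else."""
    return 403 if is_amazonbot(user_agent) else 200
```

Matching on the token rather than the full string means a version bump (say, from `Amazonbot/0.1`) will not slip past the check. It only stops clients that identify themselves honestly, which the "real" ones appear to do.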