Daenney 0cce2c0838
[feature] Block a bunch of "AI" crawlers (#2239)
* [feature] Block Google Bard/AI crawlers

* [feature] Block the other OpenAI crawler

* [feature] Block Common Crawl crawler

This is used in research, but also gleefully advertises itself as the
training source used in all LLMs and GPT-3.

Fixes: #2240

* [feature] Block Omgilikebot

Used by some shady big web data engine company.

* [feature] Block Meta's language model crawler

* [feature] Block well-known.dev crawler
2023-09-30 20:44:57 +01:00
..