Commit Graph

7 Commits

Author SHA1 Message Date
Daenney 0cce2c0838
[feature] Block a bunch of "AI" crawlers (#2239)
* [feature] Block Google Bard/AI crawlers

* [feature] Block the other OpenAI crawler

* [feature] Block Common Crawl crawler

This is used in research, but also gleefully advertises itself as the
training source used in all LLMs and GPT-3.

Fixes: #2240

* [feature] Block Omgilikebot

Used by some shady big web data engine company.

* [feature] Block Meta's language model crawler

* [feature] Block well-known.dev crawler
2023-09-30 20:44:57 +01:00
tobi 4b05dcde43
[chore] Update robots.txt, give chatgpt the middle finger (#2085) 2023-08-08 13:16:34 +02:00
Daenney 5e2bf0bdca
[chore] Improve copyright header handling (#1608)
* [chore] Remove years from all license headers

Years or year ranges aren't required in license headers. Many projects
have removed them in recent years and it avoids a bit of yearly toil.

In many cases our copyright claim was also a bit dodgy since we added
the 2021-2023 header to files created after 2021 but you can't claim
copyright into the past that way.

* [chore] Add license header check

This ensures a license header is always added to any new file. This
avoids maintainers/reviewers needing to remember to check for and ask
for it in case a contribution doesn't include it.

* [chore] Add missing license headers

* [chore] Further updates to license header

* Use the more common // indentend comment format
* Remove the hack we had for the linter now that we use the // format
* Add SPDX license identifier
2023-03-12 16:00:57 +01:00
f0x52 17eecfb6d9
[feature] Public list of suspended domains (#1362)
* basic rendered domain blocklist (unauthenticated!)

* style basic domain block list

* better formatting for domain blocklist

* add opt-in config option for showing suspended domains

* format/linter

* re-use InstancePeersGet for web-accessible domain blocklist

* reword explanation, border styling

* always attach blocklist handler, update error message

* domain blocklist error message grammar
2023-01-25 18:06:41 +01:00
tobi 0dbe6c514f
[chore] Update/add license headers for 2023 (#1304) 2023-01-05 12:43:00 +01:00
tobi 941893a774
[chore] The Big Middleware and API Refactor (tm) (#1250)
* interim commit: start refactoring middlewares into package under router

* another interim commit, this is becoming a big job

* another fucking massive interim commit

* refactor bookmarks to new style

* ambassador, wiz zeze commits you are spoiling uz

* she compiles, we're getting there

* we're just normal men; we're just innocent men

* apiutil

* whoopsie

* i'm glad noone reads commit msgs haha :blob_sweat:

* use that weirdo go-bytesize library for maxMultipartMemory

* fix media module paths
2023-01-02 12:10:50 +00:00
tobi dd83ad053c
[feature] Add `meta robots` tag; allow robots to index profile card if user is Discoverable (#842)
* rework robots.txt response

* don't let robots snippet from statuses/threads

* allow robots to index if user is Discoverable

* add license text
2022-09-29 12:03:17 +02:00