[feature] proof of work scraper deterrence (#4043)

This adds a proof-of-work based scraper deterrence to GoToSocial's middleware stack on profile and status web pages. Heavily inspired by https://github.com/TecharoHQ/anubis, but massively stripped back for our own usecase.

Todo:
- ~~add configuration option so this is disabled by default~~
- ~~fix whatever weirdness is preventing this working with CSP (even in debug)~~
- ~~use our standard templating mechanism going through apiutil helper func~~
- ~~probably some absurdly small performance improvements to be made in pooling re-used hex encode / hash encode buffers~~ the web endpoints aren't as hot a path as API / ActivityPub, will leave as-is for now as it is already very minimal and well optimized
- ~~verify the cryptographic assumptions re: using a portion of token as challenge data~~ this isn't a serious application of cryptography, if it turns out to be a problem we'll fix it, but it definitely should not be easily possible to guess a SHA256 hash from the first 1/4 of it even if mathematically it might make it a bit easier
- ~~theme / make look nice??~~
- ~~add a spinner~~
- ~~add entry in example configuration~~
- ~~add documentation~~

Verification page originally based on https://github.com/LucienV1/powtect

Co-authored-by: tobi <tobi.smethurst@protonmail.com>
Reviewed-on: https://codeberg.org/superseriousbusiness/gotosocial/pulls/4043
Reviewed-by: tobi <tsmethurst@noreply.codeberg.org>
Co-authored-by: kim <grufwub@gmail.com>
Co-committed-by: kim <grufwub@gmail.com>
This commit is contained in:
kim
2025-04-28 20:12:27 +00:00
committed by kim
parent 2b82fa7481
commit d8c4d9fc5a
16 changed files with 759 additions and 19 deletions

View File

@@ -0,0 +1,43 @@
{{- /*
// GoToSocial
// Copyright (C) GoToSocial Authors admin@gotosocial.org
// SPDX-License-Identifier: AGPL-3.0-or-later
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU Affero General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Affero General Public License for more details.
//
// You should have received a copy of the GNU Affero General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.
*/ -}}
{{- with . }}
<main>
<section class="nollamas"
data-nollamas-challenge="{{ .challenge }}"
data-nollamas-difficulty="{{ .difficulty }}"
>
<h1>Checking you're not a creepy crawler...</h1>
<noscript>
<p>
The page you're visiting is guarded from "ai" scrapers
and other crawlers by a proof-of-work challenge.
</p>
<p>
Unfortunately, this means that Javascript is required.
To see the page, <b>please enable Javascript and try again</b>.
</p>
<aside>
Once your browser has completed the challenge, you can turn
Javascript off again if you like. Revalidation is done once per hour.
</aside>
</noscript>
</section>
</main>
{{- end }}