# Description This updates our proof-of-work middleware, NoLLaMas, to work on a more easily configurable algorithm (thank you f0x for bringing this to my attention!). Instead of requiring that a solution with pre-determined number of '0' chars be found, it now pre-computes a result with a pre-determined nonce value that it expects the client to iterate up-to. (though with some level of jitter applied, to prevent it being too-easily gamed). This allows the user to configure roughly how many hash-encode rounds they want their clients to have to complete. ## Checklist - [x] I/we have read the [GoToSocial contribution guidelines](https://codeberg.org/superseriousbusiness/gotosocial/src/branch/main/CONTRIBUTING.md). - [x] I/we have discussed the proposed changes already, either in an issue on the repository, or in the Matrix chat. - [x] I/we have not leveraged AI to create the proposed changes. - [x] I/we have performed a self-review of added code. - [x] I/we have written code that is legible and maintainable by others. - [x] I/we have commented the added code, particularly in hard-to-understand areas. - [x] I/we have made any necessary changes to documentation. - [ ] I/we have added tests that cover new code. - [x] I/we have run tests and they pass locally with the changes. - [x] I/we have run `go fmt ./...` and `golangci-lint run`. Reviewed-on: https://codeberg.org/superseriousbusiness/gotosocial/pulls/4186 Co-authored-by: kim <grufwub@gmail.com> Co-committed-by: kim <grufwub@gmail.com>
2.6 KiB
Scraper Deterrence
GoToSocial provides an optional proof-of-work based scraper and automated HTTP client deterrence that can be enabled on profile and status web views. The way it works is that it generates a unique but deterministic challenge for each incoming HTTP request based on client information and current time, that-is a hex encoded SHA256 hash. It then asks the client to find an integer addition to a portion of this that will generate an expected encoded hash result. This is served to the client as a minimal holding page with a single JavaScript worker that computes a solution to this.
The number of hash encode rounds the client is required to complete may be configured, where high values will take the client longer to find a solution and vice-versa. We also instill a certain amount of jitter to make it harder for scrapers to "game" the algorithm. If your challenges take too long to solve, you may deter users from accessing your web pages. And conversely, the longer it takes for a solution to be found, the more you'll be incurring costs for scrapers (and in some cases, causing their operation to time-out). That balance is up to you to configure, hence why this is an advanced feature.
Once a solution to this challenge has been provided, by refreshing the page with the solution in the query parameter, GoToSocial will verify this solution and on success will return the expected profile / status page with a cookie that provides challenge-less access to the instance for up-to the next hour.
The outcomes of this, (when enabled), is that it should make scraping of your instance's profile / status pages economically unviable for automated data gathering (e.g. by AI companies, search engines). The only negative, is that it places a requirement on JavaScript being enabled for people to access your profile / status web views.
This was heavily inspired by the great project that is anubis, but ultimately we determined we could implement it ourselves with only the features we require, minimal code, and more granularity with our existing authorization / authentication procedures.
The GoToSocial implementation of this scraper deterrence is still incredibly minimal, so if you're looking for more features or fine-grained control over your deterrence measures then by all means keep ours disabled and stand-up a service like anubis in front of your instance!
!!! warning This proof-of-work scraper deterrence does not protect user profile RSS feeds due to the extra complexity involved. If you rely on your RSS feed being exposed, this is one such case where anubis may be a better fit!