# Description
- tweaks the NoLLaMas proof-of-work algorithm to allow finer granularity over the time spent computing solutions
- standardizes GoToSocial cookie security directive setting in a CookiePolicy{} type
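
For illustration only, a minimal sketch of what centralising cookie security directives in one type could look like. The field set and the `NewCookie` method are assumptions made for this example and are not copied from the GoToSocial source; only the `net/http` cookie fields are real.

```go
package main

import (
	"fmt"
	"net/http"
)

// CookiePolicy is a hypothetical stand-in: its fields are assumed
// for this sketch, not taken from the actual GoToSocial type.
type CookiePolicy struct {
	Domain string // domain the cookie is scoped to
	Secure bool   // only send the cookie over HTTPS
}

// NewCookie applies the policy's security directives to a name/value
// pair, so every caller gets the same settings.
func (p CookiePolicy) NewCookie(name, value string) *http.Cookie {
	return &http.Cookie{
		Name:     name,
		Value:    value,
		Path:     "/",
		Domain:   p.Domain,
		Secure:   p.Secure,
		HttpOnly: true,                    // not readable from page scripts
		SameSite: http.SameSiteStrictMode, // never sent on cross-site requests
	}
}

func main() {
	policy := CookiePolicy{Domain: "example.org", Secure: true}
	fmt.Println(policy.NewCookie("session", "abc123").String())
}
```

The point of the single type is that directives are decided once, rather than at each place a cookie is written.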
## Checklist
- [x] I/we have read the [GoToSocial contribution guidelines](https://codeberg.org/superseriousbusiness/gotosocial/src/branch/main/CONTRIBUTING.md).
- [x] I/we have discussed the proposed changes already, either in an issue on the repository, or in the Matrix chat.
- [x] I/we have not leveraged AI to create the proposed changes.
- [x] I/we have performed a self-review of added code.
- [x] I/we have written code that is legible and maintainable by others.
- [x] I/we have commented the added code, particularly in hard-to-understand areas.
- [ ] I/we have made any necessary changes to documentation.
- [ ] I/we have added tests that cover new code.
- [ ] I/we have run tests and they pass locally with the changes.
- [x] I/we have run `go fmt ./...` and `golangci-lint run`.
Co-authored-by: tobi <tobi.smethurst@protonmail.com>
Reviewed-on: https://codeberg.org/superseriousbusiness/gotosocial/pulls/4090
Co-authored-by: kim <grufwub@gmail.com>
Co-committed-by: kim <grufwub@gmail.com>
This adds proof-of-work based scraper deterrence to GoToSocial's middleware stack on profile and status web pages. Heavily inspired by https://github.com/TecharoHQ/anubis, but massively stripped back for our own use case.
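
As a rough illustration of the general idea, here is a generic hashcash-style sketch. It is not the scheme used in this PR: the leading-zeros difficulty rule, the string concatenation of challenge and nonce, and the function names are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// verify reports whether the hex-encoded SHA-256 of challenge+nonce
// starts with `difficulty` zero characters. This is cheap for the server.
func verify(challenge string, nonce uint64, difficulty int) bool {
	sum := sha256.Sum256([]byte(challenge + strconv.FormatUint(nonce, 10)))
	return strings.HasPrefix(hex.EncodeToString(sum[:]), strings.Repeat("0", difficulty))
}

// solve brute-forces a nonce, which is what the client's browser would
// have to do before being let through. Each extra zero multiplies the
// expected work by 16.
func solve(challenge string, difficulty int) uint64 {
	for nonce := uint64(0); ; nonce++ {
		if verify(challenge, nonce, difficulty) {
			return nonce
		}
	}
}

func main() {
	nonce := solve("example-challenge", 3)
	fmt.Println("nonce:", nonce, "valid:", verify("example-challenge", nonce, 3))
}
```

The asymmetry is the deterrent: checking a solution costs one hash, while finding one costs many, which adds up quickly for a scraper hitting thousands of pages.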
Todo:
- ~~add configuration option so this is disabled by default~~
- ~~fix whatever weirdness is preventing this working with CSP (even in debug)~~
- ~~use our standard templating mechanism going through apiutil helper func~~
- ~~probably some absurdly small performance improvements to be made in pooling re-used hex encode / hash encode buffers~~ the web endpoints aren't as hot a path as API / ActivityPub, will leave as-is for now as it is already very minimal and well optimized
- ~~verify the cryptographic assumptions re: using a portion of token as challenge data~~ this isn't a serious application of cryptography; if it turns out to be a problem we'll fix it. It definitely should not be easily possible to guess a SHA256 hash from the first 1/4 of it, even if mathematically that might make it a bit easier
- ~~theme / make look nice??~~
- ~~add a spinner~~
- ~~add entry in example configuration~~
- ~~add documentation~~
Verification page originally based on https://github.com/LucienV1/powtect
Co-authored-by: tobi <tobi.smethurst@protonmail.com>
Reviewed-on: https://codeberg.org/superseriousbusiness/gotosocial/pulls/4043
Reviewed-by: tobi <tsmethurst@noreply.codeberg.org>
Co-authored-by: kim <grufwub@gmail.com>
Co-committed-by: kim <grufwub@gmail.com>
* [feature] Allow user to choose "gallery" style web layout
* find a bug and squish it up and all day long you'll have good luck
* just a sec
* [performance] reindex public timeline + tinker with query a bit
* fiddling
* should be good now
* last bit of finagling, i'm done now i prommy
* panic normally
* Add login button to index page which reiterates info about clients
* bit of CSS fiddling, move apps from front page to login info
* fix indentation
---------
Co-authored-by: tobi <tobi.smethurst@protonmail.com>
Allow instance admins to add custom CSS that will affect
every page of their instance.
This is done with a new CustomCSS instance setting that
works pretty much exactly like the user's CustomCSS property.
This custom CSS is then requested for every page load.
User styles/themes take precedence over this CSS.
Co-authored-by: tobi <tobi.smethurst@protonmail.com>
* [chore] Synchronise our robots.txt with upstream
* [feature] Add headers to opt out of AI crawlers
This adds 2 headers that a number of AI crawlers respect to signal that
content should not be included in their datasets.
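
A minimal sketch of how such opt-out headers could be attached in plain `net/http` middleware. The header name and the `noimageai` / `noai` values are an assumption for this example (the commit message does not name them), and `withOptOut` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"
)

// optOutHeaders returns the opt-out signals. The name and values are
// assumed for illustration; the commit text does not spell them out.
func optOutHeaders() http.Header {
	h := http.Header{}
	// Add (not Set) so the same header is emitted twice, once per value.
	h.Add("X-Robots-Tag", "noimageai")
	h.Add("X-Robots-Tag", "noai")
	return h
}

// withOptOut is a hypothetical middleware that copies the opt-out
// headers onto every response before calling the wrapped handler.
func withOptOut(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for key, values := range optOutHeaders() {
			for _, v := range values {
				w.Header().Add(key, v)
			}
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	fmt.Println(optOutHeaders()["X-Robots-Tag"])
}
```

Like robots.txt, these headers are only a request: they help with crawlers that choose to honour them.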
* [feature/frontend] Respect `prefers-reduced-motion` for avatars, headers, and emojis
* go fmt
* fix tests
* use static version of instance thumbnail when appropriate
* use prefers-reduced-motion
* simplify account conversion a bit
* fix c&p error
This syncs our copy with the current state of the ai.robots.txt
repository. Upstream has tightened their scope to be AI-only, whereas
before it included a bunch of SEO and "web intelligence" marketing
stuff. I've kept those but moved them into their own section.
* [feature] Email change
* frontend stuff for changing email
* docs
* tests etc
* differentiate more clearly between local user+account and account
* populate user
This updates the robots.txt based on the list of the ai.robots.txt
repository. We can look at automating that at some point.
It's worth pointing out that some robots, namely the ones by Bytedance,
are known to ignore robots.txt entirely.
* update settings panels, add pending overview + approve/deny functions
* add admin accounts get, approve, reject
* send approved/rejected emails
* use signup URL
* docs!
* email
* swagger
* web linting
* fix email tests
* wee lil fixerinos
* use new paging logic for GetAccounts() series of admin endpoints, small changes to query building
* shuffle useAccountIDIn check *before* adding to query
* fix parse from toot react error
* use `netip.Addr`
* put valid slices in globals
* optimistic updates for account state
---------
Co-authored-by: kim <grufwub@gmail.com>
* [feature] User sign-up form and admin notifs
* add chosen + filtered languages to migration
* remove stray comment
* chosen languages schmosen schmanguages
* proper error on local account missing
* [feature] User-selectable preset themes
* docs, more theme stuff
* lint, tests
* fix css name
* correct some little issues
* add another theme
* fix poll background
* okay last theme i swear
* make retrieval of apimodel themes more conventional
* preallocate stylesheet slices
* [chore] Refactor HTML templates and CSS
* eslint
* ignore "Local"
* rss tests
* fiddle with OG just a tiny bit
* dick around with polls a bit more so SR stops saying "clickable"
* remove break
* oh lord
* don't lazy load avatar
* fix ogmeta tests
* clean up some cruft
* catch remaining calls to c.HTML
* fix error rendering + stack overflow in tag
* allow templating attributes
* fix indent
* set aria-hidden on status complementary content, since it's already present in the label anyway
* tidy up templating calls a little
* try to make styling a bit more consistent + readable
* fix up some remaining CSS issues
* fix up reports
* update go text, include text/display
* [feature] Set instance langs, show post lang on frontend
* go fmt
* WebGet
* set language for whole article, don't use FA icon
* mention instance languages + other optional config vars
* little tweak
* put languages in config properly
* warn log language parse
* change some naming around
* tidy up validate a bit
* lint
* rename LanguageTmpl in template
* deinterface router, start messing about with deadlines
* weeeee
* thanks linter (thinter)
* write Connection: close when timing out requests
* update wording
* don't replace req
* don't bother with fancy Cause functions (I'll use them one day...)
* [feature] Block Google Bard/AI crawlers
* [feature] Block the other OpenAI crawler
* [feature] Block Common Crawl crawler
This is used in research, but also gleefully advertises itself as the
training source used in all LLMs and GPT-3.
Fixes: #2240
* [feature] Block Omgilikebot
Used by some shady big web data engine company.
* [feature] Block Meta's language model crawler
* [feature] Block well-known.dev crawler
* add automatic cache max size generation based on ratios of a singular fixed memory target
Signed-off-by: kim <grufwub@gmail.com>
* remove now-unused cache max-size config variables
Signed-off-by: kim <grufwub@gmail.com>
* slight ratio tweak
Signed-off-by: kim <grufwub@gmail.com>
* remove unused visibility config var
Signed-off-by: kim <grufwub@gmail.com>
* add secret little ratio config trick
Signed-off-by: kim <grufwub@gmail.com>
* fixed a word
Signed-off-by: kim <grufwub@gmail.com>
* update cache library to remove use of TTL in result caches + slice cache
Signed-off-by: kim <grufwub@gmail.com>
* update other cache usages to use correct interface
Signed-off-by: kim <grufwub@gmail.com>
* update example config to explain the cache memory target
Signed-off-by: kim <grufwub@gmail.com>
* update env parsing test with new config values
Signed-off-by: kim <grufwub@gmail.com>
* do some ratio twiddling
Signed-off-by: kim <grufwub@gmail.com>
* add missing header
* update envparsing with latest defaults
Signed-off-by: kim <grufwub@gmail.com>
* update size calculations to take into account result cache, simple cache and extra map overheads
Signed-off-by: kim <grufwub@gmail.com>
* tweak the ratios some more
Signed-off-by: kim <grufwub@gmail.com>
* more nan rampaging
Signed-off-by: kim <grufwub@gmail.com>
* fix envparsing script
Signed-off-by: kim <grufwub@gmail.com>
* update cache library, add sweep function to keep caches trim
Signed-off-by: kim <grufwub@gmail.com>
* sweep caches once a minute
Signed-off-by: kim <grufwub@gmail.com>
* add a regular job to sweep caches and keep under 80% utilisation
Signed-off-by: kim <grufwub@gmail.com>
* remove dead code
Signed-off-by: kim <grufwub@gmail.com>
* add newly used size library to libraries section of readme
Signed-off-by: kim <grufwub@gmail.com>
* add better explanations for the mem-ratio numbers
Signed-off-by: kim <grufwub@gmail.com>
* update go-cache
Signed-off-by: kim <grufwub@gmail.com>
* library version bump
Signed-off-by: kim <grufwub@gmail.com>
* update cache.result{} size model estimation
Signed-off-by: kim <grufwub@gmail.com>
---------
Signed-off-by: kim <grufwub@gmail.com>
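
To make the memory-target idea from the commits above concrete, a small sketch of deriving per-cache capacities from ratios of one fixed target, plus the 80% sweep check. The type, the function names, and the example ratios and entry sizes are all hypothetical; the real size model is more involved.

```go
package main

import "fmt"

// cacheSpec is a hypothetical description of one cache.
type cacheSpec struct {
	ratio     float64 // relative share of the memory target
	entrySize int     // estimated bytes per cached entry
}

// capacities splits targetBytes between caches in proportion to their
// ratios, then converts each share into a maximum entry count.
func capacities(targetBytes int, specs map[string]cacheSpec) map[string]int {
	var total float64
	for _, s := range specs {
		total += s.ratio
	}
	out := make(map[string]int, len(specs))
	for name, s := range specs {
		if total <= 0 || s.entrySize <= 0 {
			out[name] = 0 // nothing sensible to compute
			continue
		}
		share := float64(targetBytes) * s.ratio / total
		out[name] = int(share) / s.entrySize
	}
	return out
}

// needsSweep reports whether a cache is above 80% utilisation.
func needsSweep(length, capacity int) bool {
	return length*100 > capacity*80
}

func main() {
	caps := capacities(100<<20, map[string]cacheSpec{
		"account": {ratio: 1, entrySize: 1024},
		"status":  {ratio: 3, entrySize: 1024},
	})
	fmt.Println(caps["account"], caps["status"], needsSweep(81, 100))
}
```

With a single target, an admin only has to tune one number, and the ratios decide how that budget is divided.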
* update go-fed
* do the things
* remove unused columns from tags
* update to latest lingo from main
* further tag shenanigans
* serve stub page at tag endpoint
* we did it lads
* tests, oh tests, ohhh tests, oh tests (doo doo doo doo)
* swagger docs
* document hashtag usage + federation
* instanceGet
* don't bother parsing tag href
* rename whereStartsWith -> whereStartsLike
* remove GetOrCreateTag
* dont cache status tag timelineability
* catch SQLITE_BUSY errors, wrap bun.DB to use our own busy retrier, remove unnecessary db.Error type
Signed-off-by: kim <grufwub@gmail.com>
* remove dead code
Signed-off-by: kim <grufwub@gmail.com>
* remove more dead code, add missing error arguments
Signed-off-by: kim <grufwub@gmail.com>
* update sqlite to use maxOpenConns()
Signed-off-by: kim <grufwub@gmail.com>
* add uncommitted changes
Signed-off-by: kim <grufwub@gmail.com>
* use direct calls-through for the ConnIface to make sure we don't double query hook
Signed-off-by: kim <grufwub@gmail.com>
* expose underlying bun.DB better
Signed-off-by: kim <grufwub@gmail.com>
* retry on the correct busy error
Signed-off-by: kim <grufwub@gmail.com>
* use longer possible maxRetries for db retry-backoff
Signed-off-by: kim <grufwub@gmail.com>
* remove the note regarding max-open-conns only applying to postgres
Signed-off-by: kim <grufwub@gmail.com>
* improved code commenting
Signed-off-by: kim <grufwub@gmail.com>
* remove unnecessary infof call (just use info)
Signed-off-by: kim <grufwub@gmail.com>
* rename DBConn to WrappedDB to better follow sql package name conventions
Signed-off-by: kim <grufwub@gmail.com>
* update test error string checks
Signed-off-by: kim <grufwub@gmail.com>
* shush linter
Signed-off-by: kim <grufwub@gmail.com>
* update backoff logic to be more transparent
Signed-off-by: kim <grufwub@gmail.com>
---------
Signed-off-by: kim <grufwub@gmail.com>
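
A minimal sketch of the retry-with-backoff idea from the commits above. `errBusy` is a stand-in sentinel for this example (the real code has to recognise the driver's SQLITE_BUSY error), and the linear backoff and `retryOnBusy` name are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errBusy is a hypothetical sentinel standing in for SQLITE_BUSY.
var errBusy = errors.New("database is busy")

// retryOnBusy calls fn until it returns something other than errBusy,
// or until maxRetries retries have been used. The wait grows linearly
// with each attempt so contending writers get time to finish.
func retryOnBusy(maxRetries int, backoff time.Duration, fn func() error) error {
	for attempt := 0; ; attempt++ {
		err := fn()
		if !errors.Is(err, errBusy) || attempt >= maxRetries {
			return err
		}
		time.Sleep(backoff * time.Duration(attempt+1))
	}
}

func main() {
	calls := 0
	err := retryOnBusy(5, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errBusy // first two attempts hit a locked database
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Only the busy error is retried; any other error is returned immediately so real failures are not masked.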
* [bugfix] Set Vary header correctly on cache-control
* Prefer activitypub types on AP endpoints
* use immutable on file server, vary by range
* vary auth on Accept
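
A small sketch of pairing `Cache-Control` with `Vary`, assuming plain `net/http`. `setCacheHeaders` and the directive values are hypothetical and only illustrate the shape of the fix.

```go
package main

import (
	"fmt"
	"net/http"
)

// setCacheHeaders is a hypothetical helper: it sets Cache-Control and
// lists every request header the response depends on in Vary.
func setCacheHeaders(h http.Header, directives string, vary ...string) {
	h.Set("Cache-Control", directives)
	for _, v := range vary {
		h.Add("Vary", v)
	}
}

func main() {
	// An endpoint that serves HTML or ActivityPub JSON for the same URL
	// depends on Accept, so caches must key on it.
	page := http.Header{}
	setCacheHeaders(page, "private, max-age=60", "Accept")

	// A file server response that supports partial content depends on Range.
	file := http.Header{}
	setCacheHeaders(file, "max-age=604800, immutable", "Range")

	fmt.Println(page, file)
}
```

Without `Vary`, a cache that keys only on the URL could hand an HTML response to a client that asked for ActivityPub JSON.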
* [bugfix] Tidy up rss feed serving; don't error on empty feed
* fall back to account creation time as rss feed update time
* return feed early when account has no eligible statuses
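
The fallback described above can be sketched as follows. `feedUpdated` and `unix` are hypothetical names, and representing statuses by their timestamps alone is a simplification for the example.

```go
package main

import (
	"fmt"
	"time"
)

// unix is a small helper for building example timestamps.
func unix(sec int64) time.Time {
	return time.Unix(sec, 0).UTC()
}

// feedUpdated picks the feed's "last updated" time: the newest eligible
// status if there is one, otherwise the account creation time, so an
// empty feed still gets a valid timestamp instead of an error.
func feedUpdated(accountCreated time.Time, statusTimes ...time.Time) time.Time {
	if len(statusTimes) == 0 {
		return accountCreated
	}
	latest := statusTimes[0]
	for _, t := range statusTimes[1:] {
		if t.After(latest) {
			latest = t
		}
	}
	return latest
}

func main() {
	created := unix(1000)
	fmt.Println(feedUpdated(created))
	fmt.Println(feedUpdated(created, unix(2000), unix(3000)))
}
```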