Generate(), being async, now returns a promise-within-a-promise.
If called with `let p = await Generate(...)`, it'll wait for generation
to *start*. If you then `await p`, you'll wait for generation to
*finish*. This makes it much easier to tell exactly when generation's
done. generateGroupWrapper has been similarly modified.
I searched for all users of tokenizers.API, but missed that the menu
converts the numerical select values directly to enum values. I've used
the special tokenizer value 98 to represent "the tokenizer API for
whichever backend we're currently using".
Sorting with a random comparator doesn't actually shuffle an array.
Depending on the sorting algorithm used, there will be a bias to the
shuffle (see https://bost.ocks.org/mike/shuffle/compare.html).
If you open that link in Firefox, the bias will be especially bad.
Instead of implementing "random" character sort using a random sort
comparator, use the shuffle function instead.
This means we don't have to pass the "padding" parameter into every
function so they can add the padding themselves--we can do it in just
one place instead.
Store the URLs for each tokenizer's action in one place at the top of
the file, instead of in a bunch of switch-cases. The URLs for the
textgen and Kobold APIs don't change and hence don't need to be
function arguments.
The endpoint was one big if/else statement that did two entirely
different things depending on the value of main_api. It makes more sense
for those to be two separate endpoints.
They function differently and have different logic and API parameters,
so it makes sense to count them as two different APIs. Kobold's API
doesn't return tokens, so it can only be used to count them.
There's still a lot of duplicate code which I will clean up in the
following commits.
We'll be making a distinction between tokenizing *on* the server itself,
and tokenizing via the server having the AI service do it. It makes more
sense to use the term "remote" for the latter.
Apparently the ignoreBOM option actually means "include the BOM". I've
added a test for this in my own repository, and will also be submitting
a pull request to MDN to clarify this in their documentation.
This was a workaround for older versions of Slaude that implemented SSE
improperly. This was fixed in Slaude 7 months ago, so the workaround can
be removed.
Create one server-sent events stream class which implements the entire
spec (different line endings, chunking, etc) and use it in all the
streaming generators.