Right now all it does is handle returning if there's already a message
being generated, but I'll extend it with more logic that I want to move
out of Generate().
* Move StreamingProcessor constructor to the top
Typical code style is to declare the constructor at the top of the class
definition.
* Remove removePrefix
cleanupMessage does this already.
* Make message_already_generated local
We can pass it into StreamingProcessor so it doesn't have to be a global
variable.
* Consolidate setting isStopped and abort signal
Various places were doing some combination of setting isStopped, calling
abort on the streaming processor's abort controller, and calling
onStopStreaming. Let's consolidate all that functionality into
onStopStreaming/onErrorStreaming.
* More cleanly separate streaming/nonstreaming paths
* Replace promise with async function w/ handlers
By using onSuccess and onError as promise handlers, we can use normal
control flow and don't need to remember to use try/catch blocks or call
onSuccess every time.
* Remove runGenerate
Placing the rest of the code in a separate function doesn't really do
anything for its structure.
* Move StreamingProcessor() into streaming code path
* Fix return from circuit breaker
* Fix non-streaming chat completion request
* Fix Horde generation and quiet unblocking
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
We were using getDeviceInfo to check whether we were on a desktop or a
mobile device. This can be done more simply with isMobile, which means
we can stop exporting getDeviceInfo.
I'd like to migrate over to using "textgen" to mean text-generation APIs
in general, so I've renamed the /textgenerationwebui/* endpoints to
/backends/text-completions/*.
Instead of "version" and "koboldVersion", have "koboldUnitedVersion" and
"koboldCppVersion", the latter of which is null if we're not connected
to KoboldCpp.
We now track the loop counter as a parameter of Generate that we
decrement with every recursive call, rather than a global variable,
and it *should* now work with quiet prompt generation.
Generate(), being async, now returns a promise-within-a-promise.
If called with `let p = await Generate(...)`, it'll wait for generation
to *start*. If you then `await p`, you'll wait for generation to
*finish*. This makes it much easier to tell exactly when generation's
done. generateGroupWrapper has been similarly modified.
Since we already have a template cache, it makes sense to store the
templates in it *after* compiling them, to avoid the overhead of
re-compiling them every time we call renderTemplate.
I've also changed the cache from an object to a Map--it's more
semantically correct, and avoids weird edge cases like a template named
"hasOwnProperty" or some other function that exists as an object
property.
Have all the get(...)Status and event handler registrations in the same
areas, rather than having the NovelAI ones far away. I want to
eventually move all the API-specific stuff into separate modules, but
this will make things cleaner for the time being.
This adds a bit of duplicate code for the time being, but ultimately
makes the code less confusing because we only need to include the bits
that are relevant to the specific API in each function. We can also
remove API parameters that are useless depending on the endpoint.
Remove the StreamingProcessor.hook method and use a try-catch block to
await the generator promise and set the generator, handling errors with
onError if it fails.