Changes the line endings to the Unix format and sets KoboldAI to launch with Python3 if executed directly.
(cherry picked from commit 5b0977ceb6807c0f80ce6717891ef5e23c8eeb77)
The only changes are a small addition to the breakmodel section, where GPU0 is chosen automatically if the CLI options are used without specifying breakmodel, and a switch to Unix line endings for compatibility reasons.
It's made for Python3, so we assume python3 is installed in its usual location. If it isn't, you can always run it yourself with whatever command you used prior to this change.
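For reference, the line-ending change amounts to replacing CRLF with LF. The following is a minimal sketch of that conversion, not the tool that was actually used for the commit; the helper name to_unix_line_endings is hypothetical.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF).

    Hypothetical helper for illustration only.
    """
    # Only CRLF pairs are rewritten; lines already ending in LF are left alone.
    return data.replace(b"\r\n", b"\n")


windows_text = b"import os\r\nprint(os.name)\r\n"
unix_text = to_unix_line_endings(windows_text)
```

Working on bytes rather than text avoids any dependence on the file's encoding.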
This prevents the "thinking" animation from appearing on top of the
submit button under certain circumstances:
* When someone connects to the KoboldAI server while the model is
generating (occurs after generation finishes)
* Occasionally, the browser may suddenly disconnect from and reconnect to
Flask-SocketIO during generation, which causes the same problem
Apparently transformers maintains an internal reference to input_ids
(to use for repetition penalty) so we have to clamp the internal
version, too, because otherwise transformers will throw an out-of-bounds
error upon attempting to access token IDs that are not in the
vocabulary.
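The idea of clamping so that every holder of the reference sees valid IDs can be sketched with plain Python lists. This is an illustration under assumptions, not the code from this commit: the helper name, the example IDs and the vocabulary size are made up, and a list stands in for the tensor that transformers keeps internally.

```python
def clamp_token_ids_inplace(input_ids, vocab_size):
    """Clamp every token ID into [0, vocab_size - 1], mutating the list.

    Hypothetical helper; a plain list stands in for the real tensor.
    """
    for i, token_id in enumerate(input_ids):
        input_ids[i] = min(max(token_id, 0), vocab_size - 1)
    return input_ids


ids = [5, 50400, 17]   # 50400 is outside the assumed vocabulary
internal_ref = ids     # stands in for the reference kept for repetition penalty
clamp_token_ids_inplace(ids, vocab_size=50257)
```

Because the clamp is done in place, internal_ref sees the clamped values as well. Clamping a copy would leave the internal reference holding the out-of-range ID.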
Adds Single Line mode, optimized for things like chatbot testing and other cases where you want to have control over what happens after a paragraph.
This can also be used as a foundation for a chatbot optimized interface mode.
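One plausible reading of Single Line mode is that generated text is cut at the first line break. The sketch below assumes exactly that; the function name is hypothetical and the actual implementation may trim differently.

```python
def truncate_to_single_line(generated: str) -> str:
    """Keep only the text before the first newline.

    Assumption for illustration: Single Line mode discards everything
    after the first line break in the model output.
    """
    first_line, _, _ = generated.partition("\n")
    return first_line


reply = truncate_to_single_line("Bot: Hello there.\nYou: hi\nBot: ...")
```

For a chatbot, this keeps the model from writing the user's next turn on its own.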
The names breakmodel_layers and layers are confusing, so the new method is now called breakmodel_gpulayers. The old option should no longer be used, but since it works in reverse we leave it in so existing scripts don't break.
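To illustrate what "works in reverse" could mean, the sketch below assumes the old option counts layers kept in system RAM while the new one counts layers placed on the GPU. That interpretation, the helper name and the 28-layer example are assumptions, not taken from this commit.

```python
def ram_layers_to_gpu_layers(total_layers: int, ram_layers: int) -> int:
    """Translate an old-style layer count into a new-style one.

    Assumption: the old value is the number of layers left in system RAM,
    so the GPU gets whatever remains.
    """
    if not 0 <= ram_layers <= total_layers:
        raise ValueError("ram_layers must be between 0 and total_layers")
    return total_layers - ram_layers


# Hypothetical 28-layer model with 8 layers kept in RAM under the old option.
gpu_layers = ram_layers_to_gpu_layers(total_layers=28, ram_layers=8)
```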
Feedback from users is that it's better not to always submit the prompt, which is also consistent with the randomly generated stories. You can always toggle it on if you need it for coherency. This change does not override existing user settings.