Separate file so people can easily go back to the legacy implementation based on finetune (recommended until Huggingface's compatibility is improved). You can install and use both.
When Firefox 93.0 was released, it broke the ability to edit text
across multiple chunks or across multiple paragraphs. If you tried,
nothing would happen.
Also, we are no longer using Mutation Observers to detect when a chunk
is modified. We are now using the beforeinput event.
Multiple things have changed. For now, models default to half mode even on the official transformers to make sure it's as efficient on the GPU as finetune's. GPU selection is streamlined, and cache files are now stored inside the KoboldAI folder (for the most part). A new command line parameter to force the models to run at their full size still needs to be added for the few users who would want a quality bump at the cost of RAM.
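For illustration only (this is not the commit's actual code): with the official transformers library, half mode and an in-folder cache come down to two `from_pretrained` arguments. The model name and cache path below are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-2.7B"  # placeholder model

tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="cache")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # "half mode": fp16 weights, roughly half the VRAM
    cache_dir="cache",          # keeps downloads inside the KoboldAI folder (path assumed)
).to("cuda:0")
```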
Allows anyone to easily create a ROCm-compatible conda environment. Currently set to the newer transformers; you can edit the GitHub link if you want the finetune one.
Changes the line endings to the Unix format and sets KoboldAI to launch with Python 3 if executed directly.
(cherry picked from commit 5b0977ceb6807c0f80ce6717891ef5e23c8eeb77)
The only changes are a small addition to the breakmodel section, where GPU 0 is automatically chosen if the CLI options are used without specifying breakmodel. Line endings have been changed to Linux formatting for compatibility reasons.
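A minimal sketch of that fallback logic; the flag names here are illustrative assumptions, not KoboldAI's real CLI surface:

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical flags, used only to show the shape of the default.
parser.add_argument("--model", type=str)
parser.add_argument("--breakmodel_layers", type=int, default=None)
args = parser.parse_args()

if args.model and args.breakmodel_layers is None:
    # Model chosen via CLI without an explicit breakmodel split:
    # fall back to putting everything on the first GPU.
    gpu_device = 0
```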
It's made for Python 3, so we assume python3 is installed in its usual location. If it isn't, you can always run it yourself with whatever command you used prior to this change.
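Concretely, launching with Python 3 when executed directly comes down to a shebang line like the sketch below; the commit doesn't name the script here, so treat the details as assumptions:

```python
#!/usr/bin/python3
# Assumes python3 is in its usual /usr/bin location. If yours lives
# elsewhere, keep invoking the script manually (e.g. `python3 script.py`)
# as you did before this change.
```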
This prevents the "thinking" animation from appearing on top of the
submit button under certain circumstances:
* When someone connects to the KoboldAI server while the model is
  generating (the problem appears after generation finishes)
* When the browser suddenly disconnects from and reconnects to
  Flask-SocketIO during generation, which occasionally happens and
  causes the same problem