Recent optimizations caused the CPU version to load in an incompatible format. We now load it efficiently first, then convert it back to the correct format.
As requested by VE_FORBRYDERNE. (Possibly implemented in too many places; needs testing, but since the other one is already broken I am committing it first so I can test more easily.)
If the beginning of the comment is at the beginning of a line AND the
end of a comment is at the end of a line, an additional newline will now
be ignored so that the AI doesn't see a blank line where the comment
was.
For example, consider the following message:
```
Hello
<|This is
a comment|>
World
```
The AI will now see this:
```
Hello
World
```
instead of this:
```
Hello

World
```
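The rule above can be sketched with a regular expression. This is only an illustration and not necessarily how KoboldAI implements it: `COMMENT_RE` and `strip_comments` are hypothetical names, and the pattern assumes comments are delimited by `<|` and `|>` as in the example message.

```python
import re

# Hypothetical sketch; the names and the exact pattern are assumptions.
COMMENT_RE = re.compile(
    r"^<\|(?:(?!\|>).)*\|>\n"  # comment spanning whole lines: also drop its trailing newline
    r"|<\|(?:(?!\|>).)*\|>",   # any other comment: remove only the comment itself
    re.MULTILINE | re.DOTALL,
)


def strip_comments(text: str) -> str:
    """Remove <|...|> comments without leaving a blank line behind."""
    return COMMENT_RE.sub("", text)


message = "Hello\n<|This is\na comment|>\nWorld"
print(strip_comments(message))
```

The first alternative only matches when the comment starts at the beginning of a line and is directly followed by a newline, so inline comments keep the line break that follows them.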
Multiple things have changed. For now, models default to half mode even on the official transformers, to make sure it's as efficient on the GPU as finetune's. GPU selection is streamlined, and cache files are now stored inside the KoboldAI folder (for the most part). A new command line parameter to force the models to run at their full size still needs to be added for the few users who want a quality bump at the cost of RAM.
Changes the line endings to the Unix format and sets KoboldAI to launch with Python3 if executed directly.
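To put the RAM trade-off in rough numbers, here is a small sketch. It assumes "half mode" means 16-bit floats (2 bytes per parameter) and full size means 32-bit floats (4 bytes per parameter); the function name and the parameter count are illustrative only, and the estimate covers the weights alone, not activations or other overhead.

```python
# Assumption: half mode = fp16 (2 bytes/param), full size = fp32 (4 bytes/param).
BYTES_PER_PARAM = {"half": 2, "full": 4}


def weight_memory_gib(num_params: int, mode: str) -> float:
    """Estimate the memory needed for the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[mode] / 2**30


# Illustrative parameter count, not tied to any specific model.
num_params = 2_700_000_000
for mode in ("half", "full"):
    print(f"{mode}: {weight_memory_gib(num_params, mode):.1f} GiB")
```

Under these assumptions, running at full size needs twice the memory of half mode for the same model.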
(cherry picked from commit 5b0977ceb6807c0f80ce6717891ef5e23c8eeb77)
The only changes are a small addition to the breakmodel section, where GPU0 is automatically chosen if the CLI options are used without specifying breakmodel. Line endings have been changed to Linux formatting for compatibility reasons.
It's made for Python3, so we assume python3 is installed in its usual location. If it isn't, you can always run it yourself with whatever command you used prior to this change.
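The line ending change matters for direct execution: on Linux the interpreter path on the first line is read up to the newline, so a Windows-style `\r\n` ending leaves a stray `\r` attached to it. A minimal sketch of the conversion follows; `to_unix_line_endings` is a hypothetical helper and the script content, including the interpreter path, is only an assumed example.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


# Assumed example content; the interpreter path is illustrative only.
windows_script = b"#!/usr/bin/python3\r\nprint('hello')\r\n"
unix_script = to_unix_line_endings(windows_script)
print(unix_script)
```

Working on bytes rather than text avoids any newline translation by Python's text mode while reading or writing the file.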
This prevents the "thinking" animation from appearing on top of the
submit button under certain circumstances:
* When someone connects to the KoboldAI server while the model is
generating (occurs after generation finishes)
* Occasionally, the browser may suddenly disconnect from and reconnect to
Flask-SocketIO during generation, which causes the same problem