In 1.16 we had significantly faster loading speeds because we did not do as much memory conservation; it's time to give users the choice. If you want the original, faster behavior and have the memory, run KoboldAI as usual. Otherwise, run play-lowmem.bat or aiserver.py with --lowmem. For Colab this remains the default behavior, to avoid breaking models that would otherwise load fine.
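For reference, a minimal sketch of how such a flag might gate the loading path, assuming --lowmem maps onto transformers' low_cpu_mem_usage option (the wiring and model name here are illustrative, not the actual aiserver.py code):

```python
import argparse
from transformers import AutoModelForCausalLM

parser = argparse.ArgumentParser()
parser.add_argument("--lowmem", action="store_true",
                    help="Conserve RAM while loading, at the cost of speed")
args = parser.parse_args()

# low_cpu_mem_usage=True avoids materializing a second full copy of the
# weights during loading: slower, but a much smaller memory peak.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neo-2.7B",  # illustrative model choice
    low_cpu_mem_usage=args.lowmem,
)
```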
Colab mode now also automatically activates Quiet mode to improve privacy; thanks to the redo feature we should no longer need the story output in the Colab console. Need something different for testing? Use --remote instead.
Changed the proposed --share to --unblock to make it more apparent what the feature does: it unblocks the port for external access, but does not add remote play support. For remote play support without a proxy service I have added --host.
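As a rough illustration of the distinction, here is a minimal Flask-style sketch; the bind-address logic is an assumption about what --unblock amounts to, not the actual server code:

```python
import argparse
from flask import Flask

parser = argparse.ArgumentParser()
parser.add_argument("--unblock", action="store_true",
                    help="Allow external access to the port (no remote play)")
args = parser.parse_args()

app = Flask(__name__)

# --unblock only changes the bind address: 0.0.0.0 makes the port reachable
# from other machines, while 127.0.0.1 keeps it local. No tunnel or proxy
# is involved, which is why remote play needs --host instead.
app.run(host="0.0.0.0" if args.unblock else "127.0.0.1", port=5000)
```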
Default settings for the new repetition penalty settings (better suggestions are very much welcome, since broader community testing has not been done).
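For background, a sketch of the basic CTRL-style repetition penalty that these settings build on; the function name and the 1.2 default are purely illustrative, not the new defaults themselves:

```python
import torch

def apply_repetition_penalty(logits: torch.Tensor,
                             generated_ids: torch.Tensor,
                             penalty: float = 1.2) -> torch.Tensor:
    """Penalize tokens that already appear in the context (CTRL-style)."""
    scores = logits.gather(-1, generated_ids)
    # Dividing a positive logit or multiplying a negative one both lower
    # that token's probability of being sampled again.
    scores = torch.where(scores > 0, scores / penalty, scores * penalty)
    return logits.scatter(-1, generated_ids, scores)
```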
Updated the Readme with the link to the offline installer.
Ran into issues with other modes like Chat Mode and Adventure, so the conversion was moved further down the pipeline; </s> is now converted back to \n before additional formatting is processed.
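A minimal sketch of the late-pipeline conversion described here (the function name is illustrative):

```python
def postprocess_generation(raw_text: str) -> str:
    # Models such as XGLM and Fairseq emit </s> where other models emit a
    # newline; convert it back before any mode-specific formatting
    # (Adventure, Chat Mode, ...) sees the text.
    return raw_text.replace("</s>", "\n")
```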
There is still an issue with the HTML formatting not working, but at least the AI works now.
On second thought, it is probably better not to save this. Advanced users can add it themselves, and that way newer versions of the model can override it if redownloaded.
Allows model creators to customize the welcome message using Markdown and limited HTML.
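One plausible way to render such a message safely is Markdown conversion followed by an HTML whitelist, e.g. with the markdown and bleach libraries; the tag list below is an assumption for illustration, not the actual whitelist:

```python
import markdown
import bleach

# Illustrative whitelist; the real set of allowed tags may differ.
ALLOWED_TAGS = ["p", "br", "b", "i", "em", "strong", "a", "ul", "ol", "li"]

def render_welcome(message: str) -> str:
    html = markdown.markdown(message)
    # Strip everything outside the whitelist so a model config cannot
    # inject scripts or styling into the UI.
    return bleach.clean(html, tags=ALLOWED_TAGS, attributes={"a": ["href"]})
```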
Existing United users need to run install_requirements.bat again; you can leave the existing dependencies intact.
Adds a Nobreakmodel var that allows Breakmodel to be turned off. This can be done through the command line or a model config (in case Neo is used by the model's config without it being a true Neo model that is compatible with breakmodel).
In addition, I removed the args.colab check for breakmodel support and instead made args.colab activate nobreakmodel. I have also added a new check so that breakmodel is not even attempted if you launch a model from the command line without specifying the layers.
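Roughly, the resulting gating looks like this (a sketch of the described logic with assumed names, not the actual code):

```python
def should_use_breakmodel(args, model_config: dict, launched_from_cli: bool) -> bool:
    # Colab implies nobreakmodel; a model config may also set it, e.g. for
    # models that use the Neo config without being true Neo models.
    if getattr(args, "nobreakmodel", False) or args.colab \
            or model_config.get("nobreakmodel", False):
        return False
    # From the command line, only attempt breakmodel when the layer split
    # was explicitly specified (flag name assumed for illustration).
    if launched_from_cli and args.breakmodel_layers is None:
        return False
    return True
```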
Changed the model VRAM requirements to what you'd need to comfortably run the model rather than barely run it (as in the manual). Will probably revise this in a later commit.
More importantly, it now supports models that use </s>, which will be required to support XGLM and Fairseq models.