To prevent confusion among users who have not used KoboldAI for a while, or who are following old tutorials, I have added a disclaimer informing people that most Colab links should not be used with this feature and should instead be opened in the browser.
The `dynamic_processor_wrap` makes it so that the repetition penalty is
read directly from `vars`, but this only works if the initial repetition
penalty sent to `generator` is not equal to 1. So we now force the
initial repetition penalty to be something other than 1.
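A minimal sketch of the workaround, assuming the wrapped processor re-reads its penalty from `vars` on each call (the class and helper below are illustrative, not the actual implementation):

```python
# Hypothetical sketch: the dynamically wrapped processor only takes
# effect when the value initially passed to the generator differs from
# the no-op default of 1, so we nudge the starting value away from 1.
class Vars:
    rep_pen = 1.0  # user-adjustable at runtime

vars = Vars()

def initial_rep_pen():
    # Force something other than 1 so the dynamic wrapper is active;
    # the live value is read from vars during generation anyway.
    return vars.rep_pen if vars.rep_pen != 1.0 else 1.1
```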
This commit exposes antemplates to the model config; this lets authors specify what kind of author's note template they would like to use for their model. Users can still change it if they desire.
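A hedged sketch of what reading such a key could look like; the file name, key name, and default template below are assumptions, not the repo's actual layout:

```python
# Hypothetical sketch: read an "antemplate" key from a model's config
# file, falling back to a default template when the model ships none.
import json
import os

def load_antemplate(model_dir, default="[Author's note: <|>]"):
    path = os.path.join(model_dir, "config.json")  # assumed location
    if os.path.isfile(path):
        with open(path) as f:
            return json.load(f).get("antemplate", default)
    return default
```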
Blank lines appear often in Chat Mode, so it is best played with blank line removal turned on; this is now forced. It's not compatible with Adventure mode, so the two modes now turn each other off.
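A minimal sketch of the mutual exclusion (the attribute names are assumptions):

```python
# Hypothetical sketch: enabling one mode disables the other and forces
# the formatting option Chat Mode depends on.
def set_chatmode(vars, enabled):
    vars.chatmode = enabled
    if enabled:
        vars.adventure = False          # modes are mutually exclusive
        vars.remove_blank_lines = True  # chat transcripts need this on

def set_adventure(vars, enabled):
    vars.adventure = enabled
    if enabled:
        vars.chatmode = False
```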
Added more models to the menu; all the popular community models are now easily accessible. I also re-ordered the menu from large to small so it makes a bit more sense.
* Error messages are now shown when memory, author's note, etc. exceeds
the budget by itself
* Formatting options no longer break if there are empty chunks in the
story (although there shouldn't be any in the first place)
* The number of generated tokens is now tracked from Python
* Removed `vars.model_orig`
* `requirex()` in bridge.lua now maintains a separate module cache for each
userscript instead of using the same cache for all userscripts (see the
sketch after this list)
* `vars.lua_deleted` and `vars.lua_edited` are now erased right before the
input modifiers run instead of right before each run of the generation
modifiers
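A rough Python analogue of the per-userscript cache idea behind `requirex()`; the actual Lua internals in bridge.lua are not shown, and the names here are illustrative:

```python
# Hypothetical analogue of requirex()'s behaviour: every userscript
# gets its own module cache, so one script's required modules never
# leak into another's.
module_caches = {}

def requirex(script_id, name, loader):
    cache = module_caches.setdefault(script_id, {})
    if name not in cache:
        cache[name] = loader(name)  # load once per script, then reuse
    return cache[name]
```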
The initial commit for Chat Mode. The nickname part of the UI is missing; other than that it should be fully functional. To use Chat Mode effectively, first input a small dialogue (around 6 lines: 3 of your own inputs and 3 of the character's) formatted as `Name :`; it will then automate the actions needed to chat properly. During this mode, single line mode is forced on, and Trim Incomplete Sentences is forced off.
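For illustration, a seed dialogue could look like this (the names and lines are placeholders):

```
You : Hi, who are you?
Bot : I am a traveller passing through these lands.
You : Where are you headed?
Bot : To the mountains in the north.
You : Can I come along?
Bot : If you can keep up, certainly.
```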
Futureproofing for future tokenizers; for now this is not needed since everything uses GPT2, but when that changes we want to be prepared. Not all models have a proper tokenizer config, so if we can't find one we fall back to GPT2.
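A hedged sketch of the fallback using the Hugging Face transformers API; the exact call sites and error handling in the codebase are assumptions:

```python
# Hypothetical sketch: prefer the model's own tokenizer config, and
# fall back to GPT2's tokenizer when the model doesn't ship one.
from transformers import AutoTokenizer, GPT2Tokenizer

def load_tokenizer(model_name):
    try:
        return AutoTokenizer.from_pretrained(model_name)
    except Exception:
        # No usable tokenizer config found; fall back to GPT2.
        return GPT2Tokenizer.from_pretrained("gpt2")
```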
First batch; there will be more. We will also need to update the other VRAM displays with the changes that have happened; that will happen depending on how the 8-bit stuff goes.