Attempting to use transformers 4.11.0's experimental `low_cpu_mem_usage`
feature with GPT-2 models usually results in the output repeating a
token over and over or otherwise containing an incoherent response.
* `print()` and `warn()` now work correctly with `nil` arguments
* Typo: `gpt-neo-1.3M` has been corrected to `gpt-neo-1.3B`
* Regeneration is no longer triggered when writing to `keysecondary` of
a non-selective key
* Handle `genamt` changes in generation modifier properly
* Writing to `kobold.settings.numseqs` from a generation modifier no
longer affects
* Formatting options in `kobold.settings` have been fixed
* Added aliases for setting names
* Fix behaviour of editing story chunks from a generation modifier
* Warnings are now yellow instead of red
* kobold.logits is now the raw logits prior to being filtered, like
the documentation says, rather than after being filtered
* Some erroneous comments and error messages have been corrected
* These parts of the API have now been implemented properly:
* `compute_context()` methods
* `kobold.authorsnote`
* `kobold.restart_generation()`
Allow people to enter a prompt without generating anything by the AI. Combined with the always add prompt this is a very useful feature that allows people to write world information first, and then do a specific action. This mimics the behavior previously seen in AI Dungeon forks where it prompts for world information and then asks an action and can be particularly useful for people who want the prompt to always be part of the generation.
Automatically converts Huggingface cache models to full models on (down)load.
WARNING: Does wipe old cache/ dir inside the KoboldAI folder, make a backup before you run these models if you are bandwith constraint.
This seems to be related to the model config files, because only certain
models have this problem, and replacing ALL configuration files of a
"bad" model with those of a "good" model of the same type would fix the
problem.
Shouldn't be required anymore.