There have been many reports from newer users who experience AI breakdown because not all models properly handle 2048 max tokens. 1024 is the only value that all models support, and it was the original value KoboldAI used. This commit reverts the decision to increase the default to 2048; existing configurations are not affected. Users who wish to increase the max tokens can do so themselves. Most models handle up to 1900 well (the GPT2 models are excluded), and many can go all the way to 2048. (It is not yet known why some finetunes cause a decrease in max token support.)
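A minimal sketch of how a default can change without touching saved configurations, assuming settings are stored as a dictionary. The key name `max_length` and the function `load_max_tokens` are hypothetical and are not taken from the KoboldAI source.

```python
# Hypothetical names: DEFAULT_MAX_TOKENS, load_max_tokens and the
# "max_length" key are illustrative assumptions, not KoboldAI's actual code.
DEFAULT_MAX_TOKENS = 1024  # the value the commit message says all models support


def load_max_tokens(saved_settings):
    """Return the user's saved max tokens if present, else the default."""
    # A saved value always wins, so existing configurations keep 2048
    # (or whatever the user chose) after the default is lowered.
    return saved_settings.get("max_length", DEFAULT_MAX_TOKENS)


fresh_install = load_max_tokens({})
existing_config = load_max_tokens({"max_length": 2048})
```

Only installs without a saved value pick up the new default, which matches the behavior described above.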
In addition, this commit contains a request for more consistent slider behavior: all sliders can now be changed at 0.01 intervals, instead of some sliders being limited to 0.05 intervals.
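To illustrate what the step size changes, here is a small sketch of snapping a slider value to its interval. The function `snap_to_step` and the example range are assumptions made for illustration; the real sliders are configured in the UI, not by this code.

```python
def snap_to_step(value, minimum, maximum, step=0.01):
    """Clamp a value to [minimum, maximum] and snap it to the slider step."""
    clamped = min(max(value, minimum), maximum)
    # Count whole steps from the minimum, then rebuild the value.
    steps = round((clamped - minimum) / step)
    # Final round() removes floating point noise such as 0.7300000000000001.
    return round(minimum + steps * step, 10)


fine = snap_to_step(0.734, 0.1, 2.0, step=0.01)    # finer 0.01 interval
coarse = snap_to_step(0.734, 0.1, 2.0, step=0.05)  # older 0.05 interval
```

With the same input, the 0.01 step keeps the value close to what the user asked for, while the 0.05 step moves it to the nearest coarser position.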
More settings reordering, so that similar settings are on the same rows now that we have more settings for the repetition penalty. Amount to generate is now at the top left, so some muscle memory for the temperature slider may be lost. But the settings that control AI randomness are now on the same row, and the repetition-related settings are next to each other as well.
A user gave positive feedback after trying a repetition penalty higher than 2 on some models, so let's allow people the freedom to do so. If there is a demonstrable benefit to running higher than 3, I am open to raising it again.
The initial commit for Chat Mode. The nickname part of the UI is missing; other than that, it should be fully functional. To use Chat Mode effectively, you first input a small dialogue (around 6 lines is enough: 3 of your own inputs and 3 of the character's) formatted as Name : and it will then automate the actions needed to chat properly. During this mode, Single Line mode is forced on and Trim Incomplete Sentences is forced off.
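A rough sketch of the kind of automation described above, assuming it amounts to prefixing each user input with the speaker's name in the same Name : format as the seed dialogue. The functions `format_chat_line` and `append_user_turn` are hypothetical and are not the actual Chat Mode implementation.

```python
def format_chat_line(name, text):
    """Format one line of dialogue as 'Name : text' (assumed layout)."""
    return f"{name} : {text.strip()}"


def append_user_turn(story, name, text):
    """Add the user's input to the story as a new named dialogue line."""
    return story.rstrip("\n") + "\n" + format_chat_line(name, text)


# Seed dialogue written by hand, as the commit message suggests.
seed = "You : Hello there.\nBot : Hi! How can I help?"
story = append_user_turn(seed, "You", "  What is your name?  ")
```

Because every turn follows the same pattern as the seed lines, the model can continue with the character's reply on the next line.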
Allow people to enter a prompt without having the AI generate anything. Combined with the always add prompt setting, this is a very useful feature that allows people to write world information first and then do a specific action. This mimics the behavior previously seen in AI Dungeon forks, where it prompts for world information and then asks for an action, and it can be particularly useful for people who want the prompt to always be part of the generation.
Adds Single Line mode, optimized for things like chatbot testing and other cases where you want to have control over what happens after a paragraph.
This can also be used as a foundation for a chatbot optimized interface mode.
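One way to picture Single Line mode is as a post-processing step that keeps only the first line of the generated text. The sketch below assumes that behavior; `keep_first_line` is a hypothetical helper and not the function KoboldAI uses.

```python
def keep_first_line(generated):
    """Return only the first non-empty line of the model output (assumed behavior)."""
    # Drop leading newlines so an output that starts with a blank line
    # does not collapse to an empty string.
    trimmed = generated.lstrip("\n")
    # Everything after the first newline is discarded.
    return trimmed.split("\n", 1)[0]


reply = keep_first_line("Bot : Nice to meet you.\nYou : I think that")
```

In a chatbot setting this stops the model from writing the user's next line, leaving control over what follows with the user.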
It works smoothly on the TPU colab, so let's allow it. People should not turn this all the way up unless they have the hardware, but we want to allow this for those who do.