Commit Graph

227 Commits

Author SHA1 Message Date
Gnome Ann
683bcb824f Merge branch 'united' into world-info 2021-12-05 13:06:32 -05:00
Gnome Ann
6d8517e224 Fix some minor coding errors 2021-12-05 11:39:59 -05:00
Gnome Ann
150ce033c9 TPU backend no longer needs to recompile after changing softprompt 2021-12-05 02:49:15 -05:00
Gnome Ann
b99ac92a52 WI folders and WI drag-and-drop 2021-12-04 23:59:28 -05:00
henk717
44d8068bab Ngrok Support
Not recommended for home users due to DDoS risks, but might make Colab tunnels more reliable.
2021-11-29 18:11:14 +01:00
Gnome Ann
9f51c42dd4 Allow bad words filter to ban <|endoftext|> token
The official transformers bad words filter doesn't allow this by
default; Finetune's version, however, does.
2021-11-27 11:42:06 -05:00
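A hedged sketch of the workaround this implies (class and names are illustrative, not KoboldAI's actual code): since the stock bad-words filter doesn't allow banning the EOS token by default, a small custom logits processor can forbid the `<|endoftext|>` token id unconditionally.

```python
import torch
from transformers import LogitsProcessor

class BanTokenLogitsProcessor(LogitsProcessor):
    """Illustrative processor that forbids a single token id outright."""

    def __init__(self, token_id: int):
        self.token_id = token_id

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        scores[:, self.token_id] = -float("inf")  # can never be sampled
        return scores
```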
henk717
2bc93ba37a
Whitelist 6B in breakmodel
Now that we properly support it, allow the menu option to use breakmodel
2021-11-27 10:09:54 +01:00
henk717
b56ee07ffa
Fix for CPU mode
Recent optimizations caused the CPU version to load in an incompatible format; we now convert it back to the correct format after loading it efficiently first.
2021-11-27 05:34:29 +01:00
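In Hugging Face transformers terms, the fix plausibly reduces to something like this sketch (function name invented for illustration):

```python
import torch

def prepare_for_device(model: torch.nn.Module) -> torch.nn.Module:
    # The weights may have been loaded in fp16 for speed and memory, but
    # fp16 kernels are generally unavailable on CPU, so convert back.
    if not torch.cuda.is_available():
        model = model.float()  # back to float32 for CPU inference
    return model
```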
Gnome Ann
e5e2fb088a Remember to actually import GPTJModel 2021-11-26 12:38:52 -05:00
Gnome Ann
871ed65570 Remove an unnecessary **maybe_low_cpu_mem_usage() 2021-11-26 11:42:04 -05:00
Gnome Ann
a93a76eb01 Load model directly in fp16 if using GPU or breakmodel 2021-11-26 10:55:52 -05:00
Gnome Ann
32e1d4a7a8 Enable low_cpu_mem_usage 2021-11-25 18:09:25 -05:00
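Together with the fp16 commit above, the loading path presumably looks roughly like this in Hugging Face transformers (model id illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# Load weights directly in half precision and avoid materializing a
# second full copy of the state dict in system RAM while loading.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",      # illustrative model id
    torch_dtype=torch.float16,  # only when a GPU or breakmodel runs it
    low_cpu_mem_usage=True,
)
```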
Gnome Ann
25c9be5d02 Breakmodel support for GPTJModel 2021-11-25 18:09:16 -05:00
Gnome Ann
f8bcc3411b In breakmodel mode, move layers to GPU as soon as model loads
Rather than during the first generation.
2021-11-25 11:44:41 -05:00
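A sketch of the idea (attribute names assume a GPT-2/GPT-J style model; the real breakmodel logic is more involved):

```python
import torch

def distribute_layers(model: torch.nn.Module, gpu_blocks: int) -> None:
    # Pin the first `gpu_blocks` transformer blocks to the GPU right
    # after loading, instead of lazily during the first generation.
    for i, block in enumerate(model.transformer.h):
        block.to("cuda:0" if i < gpu_blocks else "cpu")
```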
Gnome Ann
cbb6efb656 Move TFS warper code into aiserver.py 2021-11-24 13:36:54 -05:00
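For reference, a tail free sampling (TFS) warper in this spirit looks roughly like the following; this is a sketch, not the exact code that was moved, and it subclasses `LogitsProcessor` for simplicity:

```python
import torch
from transformers import LogitsProcessor

class TailFreeLogitsProcessor(LogitsProcessor):
    """Illustrative tail free sampling filter."""

    def __init__(self, tfs: float, filter_value: float = -float("inf")):
        self.tfs = tfs
        self.filter_value = filter_value

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        sorted_logits, sorted_indices = torch.sort(scores, descending=True)
        probs = sorted_logits.softmax(dim=-1)
        # Curvature of the sorted probability curve (second finite
        # difference), normalized so it sums to 1.
        d2 = probs.diff(dim=-1).diff(dim=-1).abs()
        d2 = d2 / d2.sum(dim=-1, keepdim=True)
        # Drop the flat tail once cumulative curvature passes the threshold.
        sorted_remove = d2.cumsum(dim=-1) > self.tfs
        # diff() shrank the last dim by two: always keep the top token and
        # always drop the very last one.
        sorted_remove = torch.cat(
            (torch.zeros_like(sorted_remove[..., :1]),
             sorted_remove,
             torch.ones_like(sorted_remove[..., :1])),
            dim=-1,
        )
        remove = sorted_remove.scatter(1, sorted_indices, sorted_remove)
        return scores.masked_fill(remove, self.filter_value)
```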
henk717
a2c82bbcc8 num_layers fixes
As requested by VE-FORBRYDERNE. (Possibly implemented it in too many places; it needs testing, but since the other one is already broken I am committing it first so I can more easily test.)
2021-11-24 03:44:11 +01:00
henk717
d7a2424d2d No Half on CPU
Should fix CPU execution
2021-11-23 17:14:01 +01:00
Gnome Ann
be0881a8d0 Use model.config.n_layer if model.config.num_layers doesn't exist 2021-11-23 10:09:24 -05:00
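The fallback presumably amounts to a one-liner like this (helper name invented for illustration):

```python
def get_num_layers(config) -> int:
    # Model configs are inconsistent about where they store the layer count.
    return config.num_layers if hasattr(config, "num_layers") else config.n_layer
```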
Gnome Ann
9b8bcb5516 Always convert soft prompt to float32 if using TPU backend
TPUs do not support float16. Attempting to use a float16 soft prompt
throws an error.
2021-11-21 18:22:10 -05:00
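A sketch of the coercion (the real code path may use torch or JAX arrays rather than NumPy):

```python
import numpy as np

def coerce_soft_prompt(tensor: np.ndarray) -> np.ndarray:
    # TPUs have no float16 support, so always hand the backend float32.
    if tensor.dtype == np.float16:
        tensor = tensor.astype(np.float32)
    return tensor
```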
Gnome Ann
e068aa9f26 Add soft prompt support to TPU backend 2021-11-21 18:08:04 -05:00
Gnome Ann
df2768b745 Simplify the comment regex 2021-11-21 01:09:19 -05:00
Gnome Ann
7ab0d96b8a Change the comment regex again to use fixed-length lookbehind 2021-11-21 01:06:31 -05:00
Gnome Ann
624cfbd5a4 Use a smarter regex for comments
If the beginning of the comment is at the beginning of a line AND the
end of the comment is at the end of a line, an additional newline will
now be ignored so that the AI doesn't see a blank line where the comment
was.

For example, consider the following message:
```
Hello
<|This is
  a comment|>
World
```

The AI will now see this:
```
Hello
World
```

instead of this:
```
Hello

World
```
2021-11-21 00:42:57 -05:00
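A regex in this spirit (illustrative; the actual pattern in aiserver.py may differ, e.g. by using the fixed-length lookbehind mentioned above instead of a multiline anchor):

```python
import re

# When a comment starts at the beginning of a line and its closing |> is
# immediately followed by a newline, that newline is consumed too, so no
# blank line is left behind.
comment_pattern = re.compile(r"(?m)^<\|(?:.|\n)*?\|>\n?|<\|(?:.|\n)*?\|>")

def strip_comments(text: str) -> str:
    return comment_pattern.sub("", text)

print(strip_comments("Hello\n<|This is\n  a comment|>\nWorld"))
# -> Hello\nWorld
```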
Gnome Ann
a51f88aeb3 Also apply comment formatting to prompt in refresh_story() 2021-11-21 00:26:45 -05:00
Gnome Ann
1968be82bb Remove comments from prompt in WI processor and InferKit mode 2021-11-20 22:23:06 -05:00
Gnome Ann
8ce8e621ce Fix typo (one of the comregex_ui should be comregex_ai) 2021-11-20 22:19:12 -05:00
Gnome Ann
c2ed31de28 Add syntax for comments <|...|> 2021-11-20 01:27:57 -05:00
Gnome Ann
a65c4de840 Integrate TPU backend
This commit puts the TPU backend code directly into the KoboldAI code
to make it easier to modify.
2021-11-19 18:06:57 -05:00
henk717
4a678deaa5
Merge branch 'KoboldAI:main' into united 2021-11-18 06:51:44 +01:00
henk717
b25c54cf91 Polishing and Optimizations
Multiple things have changed. For now, models default to half mode even on the official transformers to make sure it's as efficient on the GPU as finetune's. GPU selection is streamlined, and cache files are now stored inside the KoboldAI folder (for the most part). A command line parameter to force the models to run at their full size still needs to be added for the few users who would want a quality bump at the cost of RAM.
2021-11-18 00:06:57 +01:00
henk717
27ee45b9cc
Merge pull request #31 from VE-FORBRYDERNE/cpu
Fix gen_in device logic in generate()
2021-11-17 22:42:31 +01:00
Gnome Ann
2f0b673b28 Fix gen_in device logic in generate() 2021-11-17 16:37:37 -05:00
henk717
e71271933a
Merge pull request #29 from VE-FORBRYDERNE/hidden-size
Fix hidden size detection for GPTJForCausalLM
2021-11-17 22:30:24 +01:00
Gnome Ann
a1bc10246c Support for multiple gens per action with dynamic scan 2021-11-17 16:17:59 -05:00
Gnome Ann
98a72e34a4 Replace slashes in model name with underscores 2021-11-17 15:36:36 -05:00
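The change is presumably along these lines (model id illustrative): slashes in Hugging Face model ids are invalid in the directory names used for caching.

```python
# Flatten "org/model" ids into filesystem-safe names.
safe_name = "KoboldAI/GPT-Neo-2.7B-Horni".replace("/", "_")
print(safe_name)  # KoboldAI_GPT-Neo-2.7B-Horni
```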
Gnome Ann
ab1a65f13a Fix hidden size detection for GPTJForCausalLM 2021-11-15 11:56:02 -05:00
Gnome Ann
17d07b280a Correct gpu_layers to gpu_blocks 2021-11-14 21:08:49 -05:00
Gnome Ann
805cb0c8b9 Make sure device_config() still works with all layers on CPU 2021-11-14 18:46:00 -05:00
Gnome Ann
80aee07816 Use old GPU-only generation if all layers are on the same GPU
Apparently, this mode uses less RAM than breakmodel does.
2021-11-14 18:42:18 -05:00
Gnome Ann
b0ab30cec4 Re-enable GPU-only generation option 2021-11-14 18:24:51 -05:00
henk717
3e38b462c6 Hidden Size fix for GPT2 Custom
Replaced the JS Hidden Size load with the newer function to fix these models
2021-11-14 16:40:04 +01:00
henk717
ecea169553 Improved Unix Support
Changes the line-endings to the Unix format and sets KoboldAI to launch with Python3 if executed directly.

(cherry picked from commit 5b0977ceb6807c0f80ce6717891ef5e23c8eeb77)
2021-11-13 21:54:32 -05:00
henk717
1596a238f7 Breakmodel automation
The only change is a small addition to the breakmodel section: GPU0 is automatically chosen if the CLI options are used without specifying breakmodel. Line endings have been changed to Linux formatting for compatibility reasons.
2021-11-14 03:13:52 +01:00
henk717
8a916116e3
Remove device=0 because of incompatibility
Device=0 breaks some of the PyTorch implementations; it has been removed to restore hardware compatibility to 0.16 levels.
2021-11-14 02:33:27 +01:00
henk717
4bcffc614e
Allow directly running KoboldAI from CLI in Linux
It's made for Python3, so we assume python3 is installed in its usual location. If it isn't, you can always run it yourself with whatever command you used prior to this change.
2021-11-14 01:57:43 +01:00
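Direct execution on Linux presumably comes down to a shebang line plus the executable bit, something like the following (the exact interpreter path is an assumption):

```python
#!/usr/bin/env python3
# First line of aiserver.py (illustrative). After `chmod +x aiserver.py`,
# the script can be launched directly as ./aiserver.py.
```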
henk717
21ae45e9ab
Merge branch 'KoboldAI:main' into united 2021-11-11 17:05:39 +01:00
Gnome Ann
1fadcbe1e3 Send allowsp command on connect instead of on startup 2021-11-11 00:18:46 -05:00
Gnome Ann
2fe815e092 Don't broadcast emit calls inside do_connect()
This prevents the "thinking" animation from appearing on top of the
submit button under certain circumstances:

* When someone connects to the KoboldAI server while the model is
  generating (occurs after generation finishes)
* Occasionally, the browser may suddenly disconnect from and reconnect
  to Flask-SocketIO during generation, which causes the same problem
2021-11-11 00:14:12 -05:00
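In Flask-SocketIO, the distinction looks like this sketch (event and payload names assumed):

```python
from flask import Flask
from flask_socketio import SocketIO, emit

app = Flask(__name__)
socketio = SocketIO(app)

@socketio.on("connect")
def do_connect():
    # Without broadcast=True, emit() targets only the client that just
    # connected, so other clients never see a spurious "thinking" state.
    emit("from_server", {"cmd": "connected"})
```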
Gnome Ann
11b0291bc4 Use model.transformer.embed_dim if model.transformer.hidden_size doesn't exist 2021-11-10 17:47:14 -05:00
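A sketch mirroring the fallback (helper name invented):

```python
def get_model_dim(model) -> int:
    # Some architectures expose the embedding width as hidden_size,
    # others (e.g. GPT-J's transformer module) as embed_dim.
    t = model.transformer
    return t.hidden_size if hasattr(t, "hidden_size") else t.embed_dim
```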
Gnome Ann
752e19a2bb Fix vars.modeldim not always being set 2021-11-10 17:38:30 -05:00