Commit Graph

4188 Commits

Author SHA1 Message Date
Henk
e932364a1e RWKV support 2023-05-11 14:56:12 +02:00
Henk
84e4cb0f4a Update Transformers 2023-05-11 13:44:53 +02:00
somebody
546ba84723 Fix memory->genre bug in context viewer bar tooltip
Crazy change I know
2023-05-10 19:10:23 -05:00
ebolam
71aee4dbd8 First concept of model plugins with a conceptual UI.
Completely breaks UI2 model loading.
2023-05-10 16:30:46 -04:00
0cc4m
a2d01bb9e4 Update to GPTQ module 0.0.2, add support for upstream cuda quantizations, automatic detection 2023-05-09 22:20:35 +02:00
Henk
702f59b2db Downgrade ROCM properly 2023-05-09 22:10:01 +02:00
Henk
9fdc2f73a6 ROCM Downgrade for stability 2023-05-09 20:59:10 +02:00
henk717
a649bd8d3c Merge pull request #357 from one-some/tpu-api-fix
Fix TPU API errors
2023-05-09 01:05:20 +02:00
somebody
a9e342ca64 Fix TPU API errors 2023-05-08 17:34:59 -05:00
0cc4m
6121598142 Fix multigpu loading without lazy-loader 2023-05-08 22:57:09 +02:00
0cc4m
4f94247910 Fix chat mode empty generation error 2023-05-08 22:56:17 +02:00
0cc4m
e55a9d31c2 Update readme, clean up gitmodules file 2023-05-08 22:55:59 +02:00
henk717
0f9129859f Merge pull request #353 from Zurnaz/llama_tpu_tokenizer_fix
fix: tpu tokenizers errors
2023-05-08 19:41:33 +02:00
Bogdan Drema
d53726bed6 fix: tpu tokenizers errors 2023-05-08 18:24:34 +01:00
henk717
cb4af7e56e Update requirements_mtj.txt 2023-05-08 17:23:49 +02:00
0cc4m
6b4d3218d6 Fix OOM when loading large model split across GPUs 2023-05-07 06:55:51 +02:00
0cc4m
51e6dcdcd4 Revert accidental install_requirements change 2023-05-07 06:42:32 +02:00
0cc4m
9ec50c9972 Fix 4-bit mpt 2023-05-06 21:58:23 +02:00
0cc4m
a9fa199c49 Rename gptq module, pull fix 2023-05-06 21:30:33 +02:00
0cc4m
4a14c6a446 Merge pull request #10 from 0cc4m/model-structure-update
Model structure update
2023-05-06 20:55:16 +02:00
0cc4m
2f7856f0d1 Use GPTQ python module, add MPT quantized support 2023-05-06 20:52:42 +02:00
Henk
bb206f598e Don't load peft when unused 2023-05-06 18:55:26 +02:00
henk717
19092827aa Merge pull request #351 from one-some/peft
Change PEFT directory structure to be inside model
2023-05-06 18:43:57 +02:00
somebody
b7db709c47 PEFT: Change directory structure to be inside model 2023-05-06 11:16:09 -05:00
henk717
472c2c8cbc Merge pull request #348 from one-some/peft
Basic PEFT support
2023-05-06 17:53:51 +02:00
somebody
f02ddab7c7 Merge branch 'united' of https://github.com/henk717/KoboldAI into peft 2023-05-06 10:47:14 -05:00
henk717
04592e5086 Merge pull request #349 from Zurnaz/llama_config
feat: llama config and updated mtj requirement
2023-05-06 16:52:22 +02:00
Henk
2730879c61 Better warning until something more robust is in 2023-05-05 21:28:06 +02:00
Henk
dedf2afeb3 More max_context_length flexibility 2023-05-05 20:09:51 +02:00
0cc4m
43b0afc7a8 Add safe MPT support 2023-05-05 20:07:10 +02:00
Henk
d508b4a319 More max_context_length flexibility 2023-05-05 19:50:56 +02:00
Henk
33969b5845 Basic HF code execution support 2023-05-05 17:23:01 +02:00
Henk
b1722081a5 AMD Pytorch 2.0 2023-05-05 15:12:59 +02:00
Henk
33745669dd Pytorch 2.0 2023-05-05 13:14:58 +02:00
0cc4m
4180620999 Remove unnecessary changes, move gptq detection function to 4bit.py 2023-05-04 19:52:56 +02:00
0cc4m
d48fedcbfb Fix llama 4-bit loading error 2023-05-04 18:31:37 +02:00
0cc4m
ef358fdf5a Merge remote-tracking branch 'origin/united' into model-structure-update 2023-05-04 07:31:13 +02:00
0cc4m
1166c07bc3 Merge latestgptq, fix conflicts 2023-05-04 07:30:49 +02:00
Bogdan Drema
91463a4d97 feat: llama config and updated mtj requirement 2023-05-04 01:47:41 +01:00
somebody
35b56117e6 Basic PEFT support 2023-05-03 18:51:01 -05:00
somebody
a9ef475142 Lock safetensors in version jail
Let's have breaking changes when we expect them
2023-05-03 17:57:38 -05:00
Henk
a87d5d6f23 Remove HF's llama workaround 2023-05-03 20:18:40 +02:00
henk717
7f5242db17 Merge pull request #344 from pi6am/fix/llama-tokens
Fix/llama tokens
2023-05-03 19:07:47 +02:00
Llama
35d344b951 Remove torch dependency and more generic dim0 workaround
Remove torch dependency from hf.py
Make workaround for dimension zero values of token_ids more generic to handle every token, not just newlines.
2023-05-03 09:48:16 -07:00
henk717
11a9f562a2 Merge pull request #346 from ebolam/united
More UI2 paste error fixes
2023-05-03 18:33:58 +02:00
0cc4m
58f0a336cb Merge upstream changes, fix conflict 2023-05-03 18:33:11 +02:00
ebolam
0c9537e910 Potential fix for putting pasted text in wrong action 2023-05-03 12:04:05 -04:00
ebolam
fa3611b994 Update to United 2023-05-03 10:54:17 -04:00
henk717
f958f086f1 Merge pull request #343 from LostRuins/united
Added v27 of Embedded Kobold Lite, which will now be usable locally.
2023-05-03 12:30:39 +02:00
Llama
3768848548 Fix tokenization and whitespace issues with llama-derived models
Work around the 'soft' prefix space behavior of sentencepiece.
Override encode to restore the deleted HF support for decode_with_prefix_space.
Override decode to skip the soft space and return true decoded tokens.
Allow submitting chat messages with embedded newlines.
Split sentences between punctuation and whitespace, rather than after whitespace.
Also include trailing quotes and brackets after sentence stoppers.
This avoids splitting ." and .) into two tokens, for instance.
Insert whitespace at the beginning of the author's note, since sentences are split with leading whitespace.
Remove spurious newlines at the end of chat responses.
2023-05-03 01:27:11 -07:00
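
The tokenizer commit above describes working around SentencePiece's habit of swallowing a leading space when llama-family tokens are decoded. Below is a minimal illustrative sketch of that behavior and one common decode-side workaround (decode behind an anchor token, then strip the anchor's text). It is not code from this repository; the model id, anchor choice, and exact behavior (which varies by tokenizer version) are assumptions.

    # Illustrative sketch only -- not code from this repository.
    # Shows the SentencePiece "soft prefix space" behavior on llama-family
    # tokenizers and a decode-side workaround that preserves leading whitespace.
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # assumed llama-family tokenizer

    ids = tok.encode(" world", add_special_tokens=False)
    print(repr(tok.decode(ids)))        # may print 'world': the leading space is swallowed

    def decode_with_prefix_space(token_ids):
        """Decode token_ids while keeping any leading whitespace."""
        anchor = tok.encode("x", add_special_tokens=False)   # arbitrary anchor token(s)
        decoded = tok.decode(anchor + list(token_ids))
        return decoded[len(tok.decode(anchor)):]              # cut the anchor text back off

    print(repr(decode_with_prefix_space(ids)))  # ' world' with the space kept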