Commit Graph

4510 Commits

Author SHA1 Message Date
Henk
33969b5845 Basic HF code execution support 2023-05-05 17:23:01 +02:00
Henk
b1722081a5 AMD Pytorch 2.0 2023-05-05 15:12:59 +02:00
Henk
33745669dd Pytorch 2.0 2023-05-05 13:14:58 +02:00
0cc4m
4180620999 Remove unnecessary changes, move gptq detection function to 4bit.py 2023-05-04 19:52:56 +02:00
0cc4m
d48fedcbfb Fix llama 4-bit loading error 2023-05-04 18:31:37 +02:00
0cc4m
ef358fdf5a Merge remote-tracking branch 'origin/united' into model-structure-update 2023-05-04 07:31:13 +02:00
0cc4m
1166c07bc3 Merge latestgptq, fix conflicts 2023-05-04 07:30:49 +02:00
Bogdan Drema
91463a4d97 feat: llama config and updated mtj requirement 2023-05-04 01:47:41 +01:00
somebody
35b56117e6 Basic PEFT support 2023-05-03 18:51:01 -05:00
somebody
a9ef475142 Lock safetensors in version jail
Let's have breaking changes when we expect them
2023-05-03 17:57:38 -05:00
Henk
a87d5d6f23 Remove HF's llama workaround 2023-05-03 20:18:40 +02:00
henk717
7f5242db17 Merge pull request #344 from pi6am/fix/llama-tokens
Fix/llama tokens
2023-05-03 19:07:47 +02:00
Llama
35d344b951 Remove torch dependency and more generic dim0 workaround
Remove torch dependency from hf.py
Make workaround for dimension zero values of token_ids
more generic to handle every token, not just newlines.
2023-05-03 09:48:16 -07:00
henk717
11a9f562a2 Merge pull request #346 from ebolam/united
More UI2 paste error fixes
2023-05-03 18:33:58 +02:00
0cc4m
58f0a336cb Merge upstream changes, fix conflict 2023-05-03 18:33:11 +02:00
ebolam
0c9537e910 Potential fix for putting pasted text in wrong action 2023-05-03 12:04:05 -04:00
ebolam
fa3611b994 Update to United
2023-05-03 10:54:17 -04:00
henk717
f958f086f1 Merge pull request #343 from LostRuins/united
Added v27 of Embedded Kobold Lite, which will now be usable locally.
2023-05-03 12:30:39 +02:00
Llama
3768848548 Fix tokenization and whitespace issues with llama-derived models
Work around the 'soft' prefix space behavior of sentencepiece.
Override encode to restore the deleted HF support for decode_with_prefix_space.
Override decode to skip the soft space and return true decoded tokens.
Allow submitting chat messages with embedded newlines.
Split sentences between punctuation and whitespace, rather than after whitespace.
Also include trailing quotes and brackets after sentence stoppers.
This avoids splitting ." and .) into two tokens, for instance.
Insert whitespace at the beginning of the author's note, since sentences are
split with leading whitespace.
Remove spurious newlines at the end of chat responses.
2023-05-03 01:27:11 -07:00
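The "soft" prefix-space behavior this commit works around comes from sentencepiece marking word boundaries with "▁" (U+2581): decoding a token in isolation drops that marker, so per-token decodes lose leading spaces unless the decode override restores them. A minimal sketch of the idea, using hypothetical token pieces rather than the real llama vocabulary or the repo's actual override:

```python
PREFIX = "\u2581"  # sentencepiece word-boundary marker ("▁")

def naive_decode(piece: str) -> str:
    # What a plain per-token decode effectively does: strip the marker,
    # losing the word boundary entirely.
    return piece.lstrip(PREFIX)

def decode_with_prefix_space(piece: str) -> str:
    # The behavior the commit restores: keep the boundary as a real space.
    return piece.replace(PREFIX, " ")

pieces = ["\u2581Hello", "\u2581world"]
print("".join(naive_decode(p) for p in pieces))              # Helloworld
print("".join(decode_with_prefix_space(p) for p in pieces))  # " Hello world"
```

The first join runs the words together, which is why chat messages and mid-action pastes came out glued to the preceding text before this fix.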
Concedo
063131a2e6 Added v27 of Embedded Kobold Lite, which will now be usable locally.
Avoid modifying this file directly since it will be overwritten in future versions - submit changes to the Lite repo instead.
2023-05-03 14:53:06 +08:00
Llama
507da6fcf7 Merge pull request #30 from henk717/united
Merge large refactor from united.
2023-05-02 21:25:47 -07:00
Henk
5d1ee39250 Fix loadmodelsettings 2023-05-03 04:21:37 +02:00
henk717
724ba43dc1 Merge pull request #342 from one-some/model-structure-and-maybe-rwkv
Move overrides to better places
2023-05-03 03:34:17 +02:00
somebody
4b3b240bce Move loadmodelsettings 2023-05-02 20:33:37 -05:00
somebody
a0f4ab5c6a Move bad token grabber until after newlinemode has been deduced 2023-05-02 20:23:36 -05:00
somebody
efe268df60 Move overrides to better places 2023-05-02 20:18:33 -05:00
Henk
480919a2a7 Nicer way of serving lite 2023-05-03 01:16:02 +02:00
Henk
03e10bed82 /lite (Not functional yet) 2023-05-03 01:04:51 +02:00
Henk
de7b760048 Typo Fix 2023-05-03 01:02:50 +02:00
0cc4m
dd6644aaf0 Pytorch 2.0 (#18)
* Update huggingface.yml to Pytorch 2.0 and CUDA 11.8

* Update github docs pip wheel hub

Update ROCm requirements

* Add rocm wheel
2023-05-02 22:11:28 +02:00
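The CUDA 11.8 and ROCm wheel updates in this PR correspond to installing PyTorch 2.0 from the version-specific wheel indexes. An illustrative fragment only — the repo's actual pins live in huggingface.yml and the ROCm requirements file and may differ:

```shell
# PyTorch 2.0 built against CUDA 11.8:
pip install torch==2.0.0 --index-url https://download.pytorch.org/whl/cu118

# ROCm build (AMD), as added in the follow-up "AMD Pytorch 2.0" commit:
pip install torch==2.0.0 --index-url https://download.pytorch.org/whl/rocm5.4.2
```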
0cc4m
9c3d578d6c Work on model download support 2023-05-02 21:32:20 +02:00
henk717
50c9ed3af1 Merge pull request #299 from one-some/model-structure-and-maybe-rwkv
Structure changes
2023-05-02 18:07:09 +02:00
somebody
111028642e Fix tokenizer fallback for llama 2023-05-01 19:42:52 -05:00
somebody
f6b5548131 Support safetensors in get_sharded_checkpoint_num_tensors 2023-05-01 19:15:27 -05:00
somebody
97e84928ba Download all shards correctly on aria2 and raise on bad load key 2023-05-01 18:53:36 -05:00
somebody
933dbd634a HFInferenceModel: Make badwordsids not unique to torch 2023-05-01 17:13:33 -05:00
somebody
c95be636a4 Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv 2023-05-01 17:08:20 -05:00
somebody
ce3d465972 Remove some debug 2023-05-01 17:03:34 -05:00
ebolam
5a32159e58 Remove debug prints 2023-05-01 10:53:02 -04:00
ebolam
137d056cb3 Fix for pasting text in the middle of an action 2023-05-01 10:48:45 -04:00
0cc4m
f83a0aa122 Merge latest changes, fix conflict 2023-05-01 08:01:54 +02:00
Llama
eb4e89c2fa Merge pull request #29 from henk717/united
Merge united
2023-04-30 14:20:12 -07:00
0cc4m
aa67135d42 Implement new model format
Remove 4bit toggle
2023-04-30 21:59:22 +02:00
Henk
545f79086d Ban EOS token in N mode 2023-04-30 18:48:22 +02:00
0cc4m
20a5587d66 Always use offloader script, because it speeds up multi gpu 2023-04-30 18:17:43 +02:00
henk717
61511a5714 Merge pull request #341 from TinkerTankAI/united
Update KoboldAI-Horde-Bridge to the latest version
2023-04-29 15:09:15 +02:00
Tijs Zwinkels
2ad66ebcc0 Update KoboldAI-Horde-Bridge to the latest version
This version contains a timeout on http requests,
preventing a hang in my worker.
2023-04-29 15:07:32 +02:00
0cc4m
2859c67c67 Merge remote-tracking branch 'origin/united' into latestgptq 2023-04-29 13:57:34 +02:00
henk717
b19bd9c89e Merge branch 'KoboldAI:main' into united 2023-04-29 02:45:45 +02:00
Henk
1499763472 Flask fix 2023-04-29 02:44:41 +02:00