0cc4m
3d4d5df76b
Remove rocm wheel because it didn't work correctly
2023-05-13 20:33:13 +02:00
0cc4m
7f7b350741
Catch further errors during multigpu 4bit setup
2023-05-13 20:31:01 +02:00
0cc4m
266c0574f6
Fix 4bit pt loading, add traceback output to GPT2 fallback
2023-05-13 20:15:11 +02:00
0cc4m
a2d01bb9e4
Update to GPTQ module 0.0.2, add support for upstream cuda quantizations, automatic detection
2023-05-09 22:20:35 +02:00
0cc4m
6121598142
Fix multigpu loading without lazy-loader
2023-05-08 22:57:09 +02:00
0cc4m
4f94247910
Fix chat mode empty generation error
2023-05-08 22:56:17 +02:00
0cc4m
e55a9d31c2
Update readme, clean up gitmodules file
2023-05-08 22:55:59 +02:00
0cc4m
6b4d3218d6
Fix OOM when loading large model split across GPUs
2023-05-07 06:55:51 +02:00
0cc4m
51e6dcdcd4
Revert accidental install_requirements change
2023-05-07 06:42:32 +02:00
0cc4m
9ec50c9972
Fix 4-bit MPT
2023-05-06 21:58:23 +02:00
0cc4m
a9fa199c49
Rename gptq module, pull fix
2023-05-06 21:30:33 +02:00
0cc4m
4a14c6a446
Merge pull request #10 from 0cc4m/model-structure-update
Model structure update
2023-05-06 20:55:16 +02:00
0cc4m
2f7856f0d1
Use GPTQ python module, add MPT quantized support
2023-05-06 20:52:42 +02:00
Henk
dedf2afeb3
More max_context_length flexibility
2023-05-05 20:09:51 +02:00
0cc4m
43b0afc7a8
Add safe MPT support
2023-05-05 20:07:10 +02:00
0cc4m
4180620999
Remove unnecessary changes, move gptq detection function to 4bit.py
2023-05-04 19:52:56 +02:00
0cc4m
d48fedcbfb
Fix llama 4-bit loading error
2023-05-04 18:31:37 +02:00
0cc4m
ef358fdf5a
Merge remote-tracking branch 'origin/united' into model-structure-update
2023-05-04 07:31:13 +02:00
0cc4m
1166c07bc3
Merge latestgptq, fix conflicts
2023-05-04 07:30:49 +02:00
Henk
a87d5d6f23
Remove HF's llama workaround
2023-05-03 20:18:40 +02:00
henk717
7f5242db17
Merge pull request #344 from pi6am/fix/llama-tokens
Fix/llama tokens
2023-05-03 19:07:47 +02:00
Llama
35d344b951
Remove torch dependency and make dim0 workaround more generic
Remove torch dependency from hf.py
Make workaround for dimension zero values of token_ids
more generic to handle every token, not just newlines.
2023-05-03 09:48:16 -07:00
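A minimal sketch of the torch-free dim0 normalization this commit describes (hypothetical helper name; the actual code in hf.py may differ):

```python
def normalize_token_ids(token_ids):
    """Coerce token_ids to a plain list of ints without importing torch.

    Zero-dimensional tensors/arrays (which can show up for any single
    token, not just newlines) expose .item() and report ndim == 0;
    unwrap those generically instead of special-casing torch types.
    """
    if isinstance(token_ids, int):
        return [token_ids]
    if getattr(token_ids, "ndim", None) == 0:
        # dim0 tensor or numpy scalar: unwrap via the generic .item()
        return [int(token_ids.item())]
    return [int(t) for t in token_ids]
```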
henk717
11a9f562a2
Merge pull request #346 from ebolam/united
More UI2 paste error fixes
2023-05-03 18:33:58 +02:00
0cc4m
58f0a336cb
Merge upstream changes, fix conflict
2023-05-03 18:33:11 +02:00
ebolam
0c9537e910
Potential fix for putting pasted text in the wrong action
2023-05-03 12:04:05 -04:00
ebolam
fa3611b994
Update to United
2023-05-03 10:54:17 -04:00
henk717
f958f086f1
Merge pull request #343 from LostRuins/united
Added v27 of Embedded Kobold Lite, which will now be usable locally.
2023-05-03 12:30:39 +02:00
Llama
3768848548
Fix tokenization and whitespace issues with llama-derived models
Work around the 'soft' prefix space behavior of sentencepiece.
Override encode to restore the deleted HF support for decode_with_prefix_space.
Override decode to skip the soft space and return true decoded tokens.
Allow submitting chat messages with embedded newlines.
Split sentences between punctuation and whitespace, rather than after whitespace.
Also include trailing quotes and brackets after sentence stoppers.
This avoids splitting ." and .) into two tokens, for instance.
Insert whitespace at the beginning of the author's note, since sentences are
split with leading whitespace.
Remove spurious newlines at the end of chat responses.
2023-05-03 01:27:11 -07:00
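One way the soft-prefix-space workaround described above can be sketched (illustrative only; `tokenizer` is any sentencepiece-backed HF tokenizer, and the sentinel trick is an assumption rather than the exact code from this commit):

```python
def decode_true(tokenizer, token_ids):
    """Decode token_ids while neutralizing sentencepiece's 'soft' prefix
    space, which otherwise swallows or invents leading whitespace.

    Decodes a sentinel token together with the real tokens, then strips
    the sentinel's own text, so the soft space is attributed to the
    sentinel rather than to the caller's tokens.
    """
    sentinel = tokenizer.bos_token_id  # assumed present on llama tokenizers
    decoded = tokenizer.decode([sentinel] + list(token_ids))
    prefix = tokenizer.decode([sentinel])
    return decoded[len(prefix):]
```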
Concedo
063131a2e6
Added v27 of Embedded Kobold Lite, which will now be usable locally.
Avoid modifying this file directly since it will be overwritten in future versions - submit changes to the Lite repo instead.
2023-05-03 14:53:06 +08:00
Llama
507da6fcf7
Merge pull request #30 from henk717/united
Merge large refactor from united.
2023-05-02 21:25:47 -07:00
Henk
5d1ee39250
Fix loadmodelsettings
2023-05-03 04:21:37 +02:00
henk717
724ba43dc1
Merge pull request #342 from one-some/model-structure-and-maybe-rwkv
Move overrides to better places
2023-05-03 03:34:17 +02:00
somebody
4b3b240bce
Move loadmodelsettings
2023-05-02 20:33:37 -05:00
somebody
a0f4ab5c6a
Move bad token grabber until after newlinemode has been deduced
2023-05-02 20:23:36 -05:00
somebody
efe268df60
Move overrides to better places
2023-05-02 20:18:33 -05:00
Henk
480919a2a7
Nicer way of serving lite
2023-05-03 01:16:02 +02:00
Henk
03e10bed82
/lite (Not functional yet)
2023-05-03 01:04:51 +02:00
Henk
de7b760048
Typo Fix
2023-05-03 01:02:50 +02:00
0cc4m
dd6644aaf0
Pytorch 2.0 (#18)
* Update huggingface.yml to Pytorch 2.0 and CUDA 11.8
* Update github docs pip wheel hub
* Update ROCm requirements
* Add rocm wheel
2023-05-02 22:11:28 +02:00
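A quick sanity check for the toolchain bump above (illustrative, not part of the repo):

```python
import torch

# Illustrative post-upgrade check: confirm the Pytorch 2.0 / CUDA 11.8
# combination described in the commit above is actually installed.
assert torch.__version__.startswith("2."), torch.__version__
print(torch.version.cuda)  # expected to print "11.8" on the CUDA build
```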
0cc4m
9c3d578d6c
Work on model download support
2023-05-02 21:32:20 +02:00
henk717
50c9ed3af1
Merge pull request #299 from one-some/model-structure-and-maybe-rwkv
Structure changes
2023-05-02 18:07:09 +02:00
somebody
111028642e
Fix tokenizer fallback for llama
2023-05-01 19:42:52 -05:00
somebody
f6b5548131
Support safetensors in get_sharded_checkpoint_num_tensors
2023-05-01 19:15:27 -05:00
somebody
97e84928ba
Download all shards correctly on aria2 and raise on bad load key
2023-05-01 18:53:36 -05:00
somebody
933dbd634a
HFInferenceModel: Make badwordsids not unique to torch
2023-05-01 17:13:33 -05:00
somebody
c95be636a4
Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv
2023-05-01 17:08:20 -05:00
somebody
ce3d465972
Remove some debug
2023-05-01 17:03:34 -05:00
ebolam
5a32159e58
Remove debug prints
2023-05-01 10:53:02 -04:00
ebolam
137d056cb3
Fix for pasting text in the middle of an action
2023-05-01 10:48:45 -04:00
0cc4m
f83a0aa122
Merge latest changes, fix conflict
2023-05-01 08:01:54 +02:00