0cc4m
43b0afc7a8
Add safe MPT support
2023-05-05 20:07:10 +02:00
0cc4m
4180620999
Remove unnecessary changes, move gptq detection function to 4bit.py
2023-05-04 19:52:56 +02:00
0cc4m
d48fedcbfb
Fix llama 4-bit loading error
2023-05-04 18:31:37 +02:00
0cc4m
ef358fdf5a
Merge remote-tracking branch 'origin/united' into model-structure-update
2023-05-04 07:31:13 +02:00
Henk
a87d5d6f23
Remove HF's llama workaround
2023-05-03 20:18:40 +02:00
Llama
35d344b951
Remove torch dependency and generalize dim0 workaround
...
Remove torch dependency from hf.py
Make the workaround for dimension-zero values of token_ids
more generic, so it handles every token, not just newlines.
2023-05-03 09:48:16 -07:00
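
A minimal sketch of the dim0 workaround described in the commit above (the function name and body are illustrative assumptions, not the actual hf.py code):

```python
def normalize_token_ids(token_ids):
    # token_ids can arrive as a bare scalar (a "dimension zero" value)
    # rather than a sequence; normalize every such token, not just
    # newlines, without depending on torch.
    try:
        return [int(t) for t in token_ids]  # already iterable
    except TypeError:
        return [int(token_ids)]  # 0-d scalar: wrap it in a list
```
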
0cc4m
58f0a336cb
Merge upstream changes, fix conflict
2023-05-03 18:33:11 +02:00
Llama
3768848548
Fix tokenization and whitespace issues with llama-derived models
...
Work around the 'soft' prefix space behavior of sentencepiece.
Override encode to restore the deleted HF support for decode_with_prefix_space.
Override decode to skip the soft space and return true decoded tokens.
Allow submitting chat messages with embedded newlines.
Split sentences between punctuation and whitespace, rather than after whitespace.
Also include trailing quotes and brackets after sentence stoppers.
This avoids splitting ." and .) into two tokens, for instance.
Insert whitespace at the beginning of the author's note, since sentences are
split with leading whitespace.
Remove spurious newlines at the end of chat responses.
2023-05-03 01:27:11 -07:00
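
The soft prefix space workaround above, sketched under the assumption of an HF-style sentencepiece tokenizer (illustrative only, not the commit's code): decoding a token span behind a throwaway anchor token preserves the span's true leading whitespace, because the tokenizer's "soft" space attaches to the anchor instead of the first real token.

```python
def decode_preserving_space(tokenizer, token_ids):
    # Hypothetical helper: any short token works as the anchor.
    anchor_id = tokenizer.encode("x", add_special_tokens=False)[-1]
    anchor_text = tokenizer.decode([anchor_id])
    # The soft prefix space now lands on the anchor, not token_ids[0],
    # so slicing the anchor text off yields the true decoded span.
    text = tokenizer.decode([anchor_id] + list(token_ids))
    return text[len(anchor_text):]
```
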
henk717
724ba43dc1
Merge pull request #342 from one-some/model-structure-and-maybe-rwkv
...
Move overrides to better places
2023-05-03 03:34:17 +02:00
somebody
a0f4ab5c6a
Move bad token grabber until after newlinemode has been deduced
2023-05-02 20:23:36 -05:00
somebody
efe268df60
Move overrides to better places
2023-05-02 20:18:33 -05:00
Henk
de7b760048
Typo Fix
2023-05-03 01:02:50 +02:00
0cc4m
9c3d578d6c
Work on model download support
2023-05-02 21:32:20 +02:00
somebody
111028642e
Fix tokenizer fallback for llama
2023-05-01 19:42:52 -05:00
somebody
f6b5548131
Support safetensors in get_sharded_checkpoint_num_tensors
2023-05-01 19:15:27 -05:00
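
For context, a hedged sketch of what safetensors support in a shard-counting helper can look like (the helper name below is hypothetical, not the commit's code; safe_open reads only the file header, so counting is cheap):

```python
import torch
from safetensors import safe_open

def num_tensors_in_shard(path):
    if path.endswith(".safetensors"):
        with safe_open(path, framework="pt") as f:
            return len(list(f.keys()))
    # Pickle shards hold a state dict; its length is the tensor count.
    return len(torch.load(path, map_location="cpu"))
```
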
somebody
97e84928ba
Download all shards correctly on aria2 and raise on bad load key
2023-05-01 18:53:36 -05:00
somebody
933dbd634a
HFInferenceModel: Make badwordsids not unique to torch
2023-05-01 17:13:33 -05:00
somebody
ce3d465972
Remove some debug
2023-05-01 17:03:34 -05:00
0cc4m
f83a0aa122
Merge latest changes, fix conflict
2023-05-01 08:01:54 +02:00
0cc4m
aa67135d42
Implement new model format
...
Remove 4bit toggle
2023-04-30 21:59:22 +02:00
0cc4m
20a5587d66
Always use offloader script, because it speeds up multi-GPU setups
2023-04-30 18:17:43 +02:00
one-some
455b8257a9
Implement softprompt hack
2023-04-28 10:26:59 -05:00
somebody
ace4364339
Two more times
2023-04-27 21:13:26 -05:00
somebody
446f38ee9d
One more time
2023-04-27 21:07:34 -05:00
somebody
2eee535540
Actually fix decoding with soft prompts
...
it really wants a tensor
2023-04-27 21:01:12 -05:00
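
The shape of that fix, as an illustrative sketch (the helper is an assumption, not the commit's code): coerce token ids to a tensor before the decode path sees them.

```python
import torch

def as_tensor(token_ids):
    # Decoding with soft prompts "really wants a tensor", so wrap
    # plain lists before handing them over.
    if not isinstance(token_ids, torch.Tensor):
        token_ids = torch.tensor(token_ids, dtype=torch.long)
    return token_ids
```
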
somebody
ffa7b22734
Experiment
2023-04-27 20:28:04 -05:00
somebody
cd1eb97c2a
Debuuuug
2023-04-27 20:12:29 -05:00
somebody
4559112551
Potential fix
2023-04-27 19:51:10 -05:00
somebody
b256a8fbc7
Debug
2023-04-27 19:33:03 -05:00
onesome
467f2f25eb
More loading fixes
2023-04-26 16:58:33 -05:00
onesome
d4f7b60dc9
Fix for multiple paths
2023-04-26 16:49:12 -05:00
onesome
6776a71532
Add more info to custom model error
2023-04-26 16:36:52 -05:00
onesome
bbf4963d6e
Fix custmodpth handling for HF loading
2023-04-26 16:18:45 -05:00
onesome
c146ae9d84
Delete legacy gpt2 custom loader
2023-04-26 16:07:18 -05:00
onesome
9579298df7
Better fallback
2023-04-25 22:28:07 -05:00
onesome
6e3aebc1ea
Zap debug
2023-04-25 21:13:17 -05:00
onesome
d496e861f4
Undo pretty code because I haven't cracked the jax enigma yet
2023-04-25 21:11:49 -05:00
onesome
1db9d9ba61
Lazyload: Whoops
2023-04-25 18:46:54 -05:00
onesome
e28e268a2d
Use safetensors only when available
2023-04-25 18:32:37 -05:00
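
The commit title suggests the usual optional-dependency pattern; a self-contained sketch, with the loader name being a hypothetical stand-in:

```python
try:
    import safetensors.torch
    HAS_SAFETENSORS = True
except ImportError:
    HAS_SAFETENSORS = False

def load_weights(path):
    # Prefer safetensors when the package is installed and the file
    # matches; fall back to torch's pickle loader otherwise.
    if HAS_SAFETENSORS and path.endswith(".safetensors"):
        return safetensors.torch.load_file(path)
    import torch
    return torch.load(path, map_location="cpu")
```
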
onesome
0268305cfe
Change fallback notifications to warnings
2023-04-25 18:26:49 -05:00
onesome
b8bef641ff
Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv
2023-04-25 16:54:53 -05:00
0cc4m
934571857b
Fix offloading
2023-04-18 22:52:54 +02:00
0cc4m
1ef515f4c2
Fix lazy-loading on 4-bit
2023-04-17 07:21:18 +02:00
0cc4m
4d34f9b7de
Move 4-bit loading code to separate inference_model file
2023-04-16 14:20:13 +02:00
somebody
f9fb5eba89
Remove debug
2023-04-15 18:56:49 -05:00
somebody
5dd67d027a
Workaround for socketio context errors during loading
2023-04-15 18:54:21 -05:00
somebody
08b4e317ff
Fix double slashing
2023-04-15 13:30:05 -05:00
somebody
d3a73aaeba
Fix API
2023-04-15 13:17:20 -05:00
somebody
4dcf570407
Fix legacy model loading
2023-04-15 12:57:35 -05:00
one-some
1b500c7179
Merge pull request #5 from LostRuins/concedo_api
...
Added stop sequences functionality for API calls
2023-04-15 10:51:31 -05:00
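
Stop-sequence handling of the kind this PR describes typically truncates generated text at the earliest stop string; a minimal illustration (not the PR's actual implementation):

```python
def apply_stop_sequences(text, stop_sequences):
    # Cut at the first occurrence of any stop string, if present.
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```
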