KoboldAI-Client

mirror of https://github.com/KoboldAI/KoboldAI-Client.git synced 2025-06-05 21:59:24 +02:00

Author	SHA1	Message	Date
Llama	554af7b175	Modify exllama to load unrenamed gptq quantized models. Read config.json and enable exllama loading if the model has a `quantization_config` with `quant_method` of `gptq`. Note that this implementation is limited and only supports model.safetensors. That said, this supports loading popular gptq quantized models without renaming or symlinking the model file.	2023-08-27 23:56:02 -07:00
Llama	812df5ea56	Merge pull request #65 from pi6am/feat/exllama-badwords Add the eos token to exllama bad words.	2023-08-27 17:03:25 -07:00
Llama	08ff7c138c	Add the eos token to exllama bad words. The bos token was already hardcoded as a bad word id. Store badwords in a list and iterate over them during generation. Add the Llama eos token to the list of bad words. Also support "single line mode", which adds newline (13) to badwords.	2023-08-27 16:34:52 -07:00
Henk	3e0b8279f2	Rename GPTQ loading	2023-08-27 20:51:14 +02:00
Llama	0d150e412e	Merge pull request #64 from pi6am/fix/multinomial-workaround Resample to work around a bug in torch.multinomial	2023-08-26 22:42:21 -07:00
Llama	b7e38b4757	Resample to work around a bug in torch.multinomial. There is a bug in PyTorch 2.0.1 that allows torch.multinomial to sometimes choose elements that have zero probability. Since this is uncommon we can continue to use torch.multinomial as long as we verify that the results are valid. If they aren't, try again until the probability of each selected token is positive.	2023-08-26 22:26:26 -07:00
Henk	290f2ce05e	CPU only warning	2023-08-26 00:03:28 +02:00
db0	4b2d591354	avoid conflicting sys args	2023-08-25 15:05:36 +02:00
Henk	f40236c04a	Modern llama tokenizer	2023-08-25 14:27:44 +02:00
Henk	2887467eec	Safetensors 0.3.3	2023-08-24 14:30:44 +02:00
henk717	d86f61151b	Working revision support	2023-08-23 22:07:37 +02:00
Henk	39c1b39b4a	Fix markers	2023-08-23 21:42:06 +02:00
Henk	5d9f180489	Fix typo	2023-08-23 21:36:26 +02:00
Henk	85810cd3fd	AutoGPTQ for Colab	2023-08-23 21:30:58 +02:00
Henk	c20ea949d7	Fix duplicate safetensors	2023-08-23 21:02:11 +02:00
Henk	91155ed2f3	HF dependencies	2023-08-23 20:34:40 +02:00
Llama	b1895de518	Merge pull request #63 from pi6am/feat/exllama-stoppers Add stopper hooks support to exllama	2023-08-22 23:14:00 -07:00
Llama	b96d5d8646	Add stopper hooks support to exllama	2023-08-22 23:06:16 -07:00
Henk	f66173f2a0	Git gonna git	2023-08-22 20:43:44 +02:00
Henk	2b6dcbe55e	New Horde Worker	2023-08-22 20:40:04 +02:00
Henk	3f438fda53	Scribe name instead of worker name	2023-08-22 18:56:23 +02:00
Henk	e5aca6fdad	Cleaned horde	2023-08-22 18:43:29 +02:00
Henk	69c794506b	HF 4.32	2023-08-22 17:48:00 +02:00
Henk	4b482a0619	Pending trick	2023-08-22 14:58:44 +02:00
Henk	b41bf99b55	Cleanup	2023-08-22 14:00:05 +02:00
Henk	179c4ad07f	Restore UI	2023-08-22 13:58:02 +02:00
Henk	f570787077	Allow worker to stop	2023-08-22 13:38:28 +02:00
0cc4m	22fd49937a	Merge pull request #62 from pi6am/fix/exllama-eos-space Strip the eos token from exllama generations.	2023-08-22 08:06:02 +02:00
Henk	401cc1609a	Kaiemb branch	2023-08-21 19:05:35 +02:00
db0	148a7c21b8	using stop()	2023-08-21 19:02:15 +02:00
Henk	d9815d4b1f	New worker fixes	2023-08-21 17:52:21 +02:00
Henk	8abb5746f8	Add bridge back	2023-08-21 16:50:17 +02:00
Henk	7b8fba31f7	Git is stubborn	2023-08-21 16:46:02 +02:00
Henk	be8f527911	Horde URL fixes	2023-08-21 16:44:58 +02:00
Henk	a7251fa599	Bridge settings	2023-08-21 16:44:09 +02:00
Henk	e2d56db195	Fix bridge reference	2023-08-21 16:27:53 +02:00
db0	a655f8f066	adjust for stop mechanism	2023-08-21 15:56:27 +02:00
db0	45661ddc75	switch to AI Horde Worker	2023-08-21 15:52:17 +02:00
Henk	955db1567e	Keep the usual temp folder instead of ours	2023-08-21 14:29:37 +02:00
Henk	57e5f51d63	AutoGPTQ for Colab	2023-08-21 14:08:14 +02:00
Henk	5917737676	Don't disable exllama	2023-08-21 13:17:30 +02:00
Henk	8daa2f1adc	Update Optimum on Git HF	2023-08-21 02:01:34 +02:00
Henk	3dd0e91fbb	Preliminary HF GPTQ changes	2023-08-21 01:58:52 +02:00
Llama	070cfd339a	Strip the eos token from exllama generations. The end-of-sequence (</s>) token indicates the end of a generation. When a token sequence containing </s> is decoded, an extra (wrong) space is inserted at the beginning of the generation. To avoid this, strip the eos token out of the result before returning it. The eos token was getting stripped later, so this doesn't change the output except to avoid the spurious leading space.	2023-08-19 17:40:23 -07:00
Henk	6f557befa9	GPTQ --revision support	2023-08-19 15:17:29 +02:00
Henk	d93631c889	GPTQ improvements	2023-08-19 14:45:45 +02:00
Henk	13b68c67d1	Basic GPTQ Downloader	2023-08-19 13:02:50 +02:00
henk717	029e8736c0	Merge pull request #438 from one-some/another-api-fix More api fixes	2023-08-19 02:43:38 +02:00
somebody	45486a47b0	WI: Fix UID keys being str ...again	2023-08-18 19:27:02 -05:00
Henk	80e784d3ea	Polish	2023-08-19 01:39:31 +02:00
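
Commit 554af7b175 above describes detecting GPTQ models from config.json rather than from a renamed weight file. The following is a minimal sketch of that kind of check, not the code from the commit: the function name `is_gptq_model` and its error handling are assumptions made for illustration, while the `quantization_config` and `quant_method` keys and the model.safetensors file name come from the commit message.

```python
import json
from pathlib import Path


def is_gptq_model(model_dir):
    """Hypothetical helper: True if model_dir looks like a GPTQ model.

    Mirrors the rule in the commit message: config.json must contain a
    quantization_config whose quant_method is "gptq", and the weights
    must be stored as model.safetensors.
    """
    model_path = Path(model_dir)
    config_file = model_path / "config.json"
    if not config_file.is_file():
        return False
    try:
        config = json.loads(config_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # Assumption: an unreadable config is treated as "not GPTQ".
        return False
    if not isinstance(config, dict):
        return False
    quant_config = config.get("quantization_config")
    if not isinstance(quant_config, dict):
        return False
    if quant_config.get("quant_method") != "gptq":
        return False
    # The commit notes that only model.safetensors is supported.
    return (model_path / "model.safetensors").is_file()
```

A loader could call `is_gptq_model(path)` before falling back to other backends; how the real code wires this into backend selection is not shown in the log.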


4698 Commits
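
Commit b7e38b4757 above describes validating sampled tokens and resampling when a zero-probability element is chosen. The sketch below illustrates that retry pattern in plain Python. It is not the code from the commit: `random.choices` stands in for `torch.multinomial`, the names `sample_with_validation` and `default_sampler` are hypothetical, and the `max_attempts` cap is an added assumption (the commit message describes retrying until the result is valid).

```python
import random


def default_sampler(probs):
    """Stand-in sampler: draw one index with the given weights."""
    return random.choices(range(len(probs)), weights=probs)[0]


def sample_with_validation(probs, sampler=default_sampler, max_attempts=100):
    """Draw an index from sampler, retrying while it has zero probability.

    max_attempts is an assumption added here so a sampler that never
    returns a valid index cannot loop forever.
    """
    for _ in range(max_attempts):
        index = sampler(probs)
        if probs[index] > 0:
            return index
    raise RuntimeError("sampler kept returning zero-probability indices")
```

Passing the sampler as a parameter keeps the validation logic separate from the sampling call, so a deliberately faulty sampler can be substituted to exercise the retry path.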