Llama
070cfd339a
Strip the eos token from exllama generations.
...
The end-of-sequence (</s>) token marks the end of a generation.
When a token sequence containing </s> is decoded, a spurious space
is inserted at the beginning of the decoded text. To avoid this,
strip the eos token out of the sequence before decoding the result.
The eos token was already being stripped later, so this doesn't change
the output except to remove the spurious leading space.
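A minimal sketch of the fix described above (the names `tokenizer`,
`sequence`, and `eos_token_id` are illustrative, not the actual
exllama backend code):

    # Drop the eos token id before decoding, so the tokenizer never
    # sees </s> and never inserts the spurious leading space.
    eos_id = tokenizer.eos_token_id
    trimmed = [t for t in sequence if t != eos_id]
    text = tokenizer.decode(trimmed)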
2023-08-19 17:40:23 -07:00
0cc4m
973aea12ea
Only import big Python modules for GPTQ once they get used
2023-07-23 22:07:34 +02:00
0cc4m
49740aa5ab
Fix ntk alpha
2023-07-23 21:56:48 +02:00
0cc4m
31a984aa3d
Automatically install exllama module
2023-07-23 07:33:51 +02:00
0cc4m
a9aa04fd1b
Merge remote-tracking branch 'upstream/united' into 4bit-plugin
2023-07-23 07:18:58 +02:00
0cc4m
09bb1021dd
Fall back to transformers if hf_bleeding_edge is not available
2023-07-23 07:16:52 +02:00
0cc4m
748e5ef318
Add sliders for exllama context size and related methods
2023-07-23 07:11:28 +02:00
Henk
7a5d813b92
Reimplement HF workaround only for llama
2023-07-22 16:59:49 +02:00
Henk
8dd7b93a6c
HF's workaround breaks stuff
2023-07-22 16:29:55 +02:00
Henk
fa9d17b3d3
HF 4.31
2023-07-22 15:25:14 +02:00
Henk
7823da564e
Link to Lite
2023-07-22 04:04:17 +02:00
henk717
83e5c29260
Merge pull request #413 from one-some/bug-hunt
...
Fix WI comment editing
2023-07-22 00:34:46 +02:00
somebody
e68972a270
Fix WI comments
2023-07-21 16:14:13 -05:00
Henk
a17d7aae60
Easier English
2023-07-21 19:42:49 +02:00
Henk
da9b54ec1c
Don't show API link during load
2023-07-21 19:31:38 +02:00
Henk
432cdc9a08
Fix models with good pad tokens
2023-07-21 16:39:58 +02:00
Henk
ec745d8b80
Don't accidentally block pad tokens
2023-07-21 16:25:32 +02:00
henk717
dc4404f29c
Merge pull request #409 from nkpz/bnb8bit
...
Configurable quantization level, fix for broken toggles in model settings
2023-07-19 14:22:44 +02:00
Nick Perez
9581e51476
feature(load model): select control for quantization level
2023-07-19 07:58:12 -04:00
0cc4m
58908ab846
Revert aiserver.py changes
2023-07-19 07:14:03 +02:00
0cc4m
19f511dc9f
Load GPTQ module from GPTQ repo docs
2023-07-19 07:12:37 +02:00
0cc4m
1c5da2bbf3
Move pip docs from KoboldAI into GPTQ repo
2023-07-19 07:08:39 +02:00
0cc4m
7516ecf00d
Merge upstream changes, fix conflict
2023-07-19 07:02:29 +02:00
0cc4m
c84d063be8
Revert settings changes
2023-07-19 07:01:11 +02:00
0cc4m
9aa6c5fbbf
Merge upstream changes, fix conflict, adapt backends to changes
2023-07-19 06:56:09 +02:00
Nick Perez
0142913060
8 bit toggle, fix for broken toggle values
2023-07-18 23:29:38 -04:00
Henk
22e7baec52
Permit CPU layers on 4-bit (Worse than GGML)
2023-07-18 21:44:34 +02:00
henk717
5f2600d338
Merge pull request #406 from ebolam/Model_Plugins
...
Clarified message on what's required for model backend parameters
2023-07-18 02:42:23 +02:00
ebolam
66192efdb7
Clarified message on what's required for model backend parameters in the command line
2023-07-17 20:30:41 -04:00
Henk
5bbcdc47da
4-bit on Colab
2023-07-18 01:48:01 +02:00
henk717
da9226fba5
Merge pull request #401 from ebolam/Model_Plugins
...
Save the 4-bit flag to the model settings.
2023-07-18 01:19:43 +02:00
henk717
fee79928c8
Merge pull request #404 from one-some/united
...
Delete basic 4bit
2023-07-18 01:19:14 +02:00
somebody
1637760fa1
Delete basic 4bit
...
And add code to handle dangling __pycache__ directories
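A hedged sketch of that kind of cleanup (the helper name and the
staleness heuristic are assumptions, not the commit's actual code):

    import os
    import shutil

    def remove_dangling_pycache(root):
        # Collect first, then delete, so the walk isn't mutated mid-iteration.
        stale = []
        for dirpath, dirnames, filenames in os.walk(root):
            if os.path.basename(dirpath) != "__pycache__":
                continue
            parent = os.path.dirname(dirpath)
            # A cache is dangling if its parent no longer holds .py sources.
            if not any(f.endswith(".py") for f in os.listdir(parent)):
                stale.append(dirpath)
        for path in stale:
            shutil.rmtree(path, ignore_errors=True)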
2023-07-17 18:16:03 -05:00
henk717
5c3a8e295a
Merge pull request #402 from one-some/united
...
Patches: Make lazyload work with quantization
2023-07-17 23:53:14 +02:00
somebody
23b95343bd
Patches: Make lazyload work on quantized models
...
I wanna watch YouTube while my model is loading without locking up my
system >:(
2023-07-17 16:47:31 -05:00
ebolam
4acf9235db
Merge branch 'Model_Plugins' of https://github.com/ebolam/KoboldAI into Model_Plugins
2023-07-17 09:52:10 -04:00
ebolam
b9ee6e336a
Save the 4-bit flag to the model settings.
2023-07-17 09:50:03 -04:00
ebolam
66377fc09e
Save the 4-bit flag to the model settings.
2023-07-17 09:48:01 -04:00
henk717
e8d84bb787
Merge pull request #400 from ebolam/Model_Plugins
...
missed the elif
2023-07-17 15:16:34 +02:00
ebolam
eafb699bbf
missed the elif
2023-07-17 09:12:45 -04:00
henk717
a3b0c6dd60
Merge pull request #399 from ebolam/Model_Plugins
...
Update to the upload_file function
2023-07-17 15:11:40 +02:00
ebolam
bfb26ab55d
Ban uploading to the modeling directory
2023-07-17 09:05:22 -04:00
ebolam
52e061d0f9
Fix for potential jailbreak
2023-07-17 08:55:23 -04:00
henk717
f7561044c6
Merge pull request #398 from Alephrin/patch-1
...
Speeds up bnb 4bit with a custom BitsAndBytesConfig
2023-07-17 13:22:44 +02:00
Alephrin
145a43a000
Removed extra load_in_4bit.
2023-07-17 04:53:47 -06:00
Alephrin
e9913d657a
Speeds up bnb 4bit with a custom BitsAndBytesConfig
...
With this BitsAndBytesConfig I get about double the speed compared to running without it. (Tested on llama 13B with a 3090)
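For reference, a config of the kind described (the exact fields used in
the commit are an assumption; the usual source of the speedup is setting
a half-precision compute dtype instead of the float32 default):

    import torch
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,  # float32 default is much slower
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
    )
    # model = AutoModelForCausalLM.from_pretrained(path, quantization_config=quant_config)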
2023-07-17 04:43:43 -06:00
Henk
6d7e9e6771
Post4 BnB for Linux
2023-07-16 02:13:42 +02:00
Henk
8bef2e5fef
Fixes 16-bit if BnB is not installed
2023-07-16 02:02:58 +02:00
henk717
fac006125e
Merge pull request #397 from ebolam/Model_Plugins
...
Fixes for model backend UI
2023-07-15 23:58:24 +02:00
0cc4m
e78361fc8f
Pull upstream changes, fix conflicts
2023-07-15 23:01:52 +02:00