Commit Graph

210 Commits

Author SHA1 Message Date
0cc4m
09bb1021dd Fallback to transformers if hf_bleeding_edge not available 2023-07-23 07:16:52 +02:00
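The fallback in 09bb1021dd (prefer the hf_bleeding_edge fork, drop back to stock transformers when it is missing) is the standard try-the-preferred-import-first pattern. A minimal generic sketch; `import_first_available` is a hypothetical helper, not code from the repository:

```python
import importlib


def import_first_available(*names):
    """Return the first importable module from `names`, preferred first."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError(f"none of {names} could be imported")


# As in the commit: preferred fork first, stock library as the fallback.
# hf = import_first_available("hf_bleeding_edge", "transformers")
```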
0cc4m
748e5ef318 Add sliders for exllama context size and related methods 2023-07-23 07:11:28 +02:00
0cc4m
7516ecf00d Merge upstream changes, fix conflict 2023-07-19 07:02:29 +02:00
0cc4m
9aa6c5fbbf Merge upstream changes, fix conflict, adapt backends to changes 2023-07-19 06:56:09 +02:00
Henk
22e7baec52 Permit CPU layers on 4-bit (Worse than GGML) 2023-07-18 21:44:34 +02:00
Henk
5bbcdc47da 4-bit on Colab 2023-07-18 01:48:01 +02:00
henk717
da9226fba5 Merge pull request #401 from ebolam/Model_Plugins
Save the 4-bit flag to the model settings.
2023-07-18 01:19:43 +02:00
somebody
1637760fa1 Delete basic 4bit
And add code to handle dangling __pycache__s
2023-07-17 18:16:03 -05:00
somebody
23b95343bd Patches: Make lazyload work on quantized
i wanna watch youtube while my model is loading without locking up my
system >:(
2023-07-17 16:47:31 -05:00
ebolam
b9ee6e336a Save the 4-bit flag to the model settings. 2023-07-17 09:50:03 -04:00
Alephrin
145a43a000 Removed extra load_in_4bit. 2023-07-17 04:53:47 -06:00
Alephrin
e9913d657a Speeds up bnb 4bit with a custom BitsAndBytesConfig
With this BitsAndBytesConfig I get about double the speed compared to running without it. (Tested on llama 13B with a 3090)
2023-07-17 04:43:43 -06:00
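Commit e9913d657a credits the roughly 2x speedup to a custom BitsAndBytesConfig; the log does not show the exact settings, but a typical fast 4-bit configuration (nf4 quantization with bfloat16 compute and double quantization — assumptions here, not the commit's verified values) looks like this configuration fragment:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed settings; nf4 + bf16 compute is a common fast combination.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "some/model-id",  # placeholder model identifier
    quantization_config=quant_config,
)
```

Passing an explicit `bnb_4bit_compute_dtype` matters most here: without it, bitsandbytes computes in float32, which is typically much slower.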
Henk
8bef2e5fef Fixes 16-bit if BnB is not installed 2023-07-16 02:02:58 +02:00
0cc4m
e78361fc8f Pull upstream changes, fix conflicts 2023-07-15 23:01:52 +02:00
Henk
0622810bc4 Better way of doing the if statement 2023-07-15 20:00:29 +02:00
Henk
23a104a4fe Only show 4-bit toggle on valid model 2023-07-15 19:42:26 +02:00
Henk
71b6e8d6d4 Fix accidental parameters overwrite 2023-07-15 19:35:40 +02:00
Henk
c43d60772b BnB dependency check 2023-07-15 18:56:13 +02:00
Henk
160effb9ea Add 4-bit BnB toggle 2023-07-15 18:20:10 +02:00
Henk
2c50d5d092 Don't ruin breakmodel 2023-07-15 14:14:06 +02:00
Henk
1f045110a4 Basic 4-bit backend 2023-07-15 02:49:31 +02:00
onesome
afa8766ea6 Add is_valid 2023-07-14 18:01:18 -05:00
somebody
f67cb7fa05 Make basic hf independent from hf 2023-07-12 18:36:30 -05:00
somebody
d17ce8461d Use device_map="auto" 2023-07-12 17:27:48 -05:00
somebody
60473d4c23 Fix and add some documentation to basic hf backend 2023-07-12 17:16:05 -05:00
onesome
8077d6c3f9 Self-contained sampler patch (Don't merge)
Completely untested 3:00 AM code; beware! I will test and add more
documentation tomorrow.
2023-07-12 03:22:43 -05:00
somebody
20b4b4bcef Add basic hf backend 2023-07-08 17:12:16 -05:00
somebody
3928d86339 Fall back to unpatched HF 2023-07-08 14:36:45 -05:00
somebody
c2ee30af32 Add --panic to raise when loading fails 2023-07-08 14:04:46 -05:00
Henk
16240878bc Restore --peft support 2023-07-04 20:42:29 +02:00
somebody
bce1a907e5 Update aux device to depend on primary device 2023-07-03 19:36:31 -05:00
somebody
6f7e6422ef Actually get correct primary device 2023-07-03 19:04:48 -05:00
somebody
59c731f805 Fix static primary_device
and some small cleanup
2023-07-03 18:37:48 -05:00
Henk
81e72329af CPU fixes 2023-07-02 21:50:23 +02:00
0cc4m
0e4b6571d5 Fix non-tuple return from gptq function 2023-06-28 22:50:04 +02:00
0cc4m
c753671ac1 Add exllama superhot positional embeddings compression support 2023-06-27 07:39:37 +02:00
Henk
1da4580e8b Remove wrong usegpu behavior 2023-06-22 07:07:02 +02:00
somebody
5ee20bd7d6 Fix for CPU loading 2023-06-21 21:18:43 -05:00
somebody
b81f61b820 Clean debug 2023-06-21 18:35:56 -05:00
somebody
947bcc58e4 Experiments 2023-06-21 17:33:14 -05:00
somebody
0012158eac Remove old 2023-06-21 16:58:59 -05:00
somebody
6bdcf2645e Merge branch 'united' of https://github.com/henk717/KoboldAI into accelerate-offloading 2023-06-21 16:58:39 -05:00
somebody
c40649a74e Probably fix f32 2023-06-21 16:54:41 -05:00
somebody
aca2b532d7 Remove debug 2023-06-21 14:15:38 -05:00
somebody
5f224e1366 Restore choice of lazyload or not 2023-06-21 14:13:14 -05:00
somebody
0052ad401a Basic breakmodel ui support
Seems to work
2023-06-21 13:57:32 -05:00
Henk
bbecdaeedb Silently disable MTJ when Jax is not installed 2023-06-21 17:08:45 +02:00
0cc4m
e8741a1b57 Disable scaled_dot_product_attention if torch version < 2 2023-06-20 09:19:43 +02:00
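The guard in e8741a1b57 works because `torch.nn.functional.scaled_dot_product_attention` first shipped in PyTorch 2.0, so checking the major component of `torch.__version__` is sufficient. A minimal sketch of such a check (the helper name is illustrative):

```python
def supports_sdpa(torch_version: str) -> bool:
    """scaled_dot_product_attention was introduced in PyTorch 2.0."""
    major = torch_version.split(".")[0]
    return major.isdigit() and int(major) >= 2


# In practice: use_sdpa = supports_sdpa(torch.__version__)
```

Version strings like "1.13.1+cu117" carry a local build suffix, which is why only the leading major component is parsed.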
0cc4m
a191855b37 Track token generation progress 2023-06-19 19:14:26 +02:00
0cc4m
e874f0c1c2 Add token streaming support for exllama 2023-06-19 19:14:26 +02:00