Henk | 0d0a671bb9 | Better use_cache implementation | 2023-09-07 04:29:28 +02:00
Henk | 290f2ce05e | CPU only warning | 2023-08-26 00:03:28 +02:00
Henk | 3dd0e91fbb | Preliminary HF GPTQ changes | 2023-08-21 01:58:52 +02:00
somebody | b9da974eb7 | GenericHFTorch: Change use_4_bit to quantization in __init__ | 2023-08-14 00:56:40 -05:00
somebody | 906d1f2522 | Merge branch 'united' of https://github.com/henk717/KoboldAI into fixing-time | 2023-08-07 16:22:04 -05:00
Henk | 30495cf8d8 | Fix GPT2 | 2023-07-24 02:05:07 +02:00
Henk | a963c97acb | Make 4-bit the default part 2 | 2023-07-24 00:06:20 +02:00
Henk | 0f913275a9 | 4-bit as Default | 2023-07-23 23:08:11 +02:00
0cc4m | a9aa04fd1b | Merge remote-tracking branch 'upstream/united' into 4bit-plugin | 2023-07-23 07:18:58 +02:00
0cc4m | 09bb1021dd | Fallback to transformers if hf_bleeding_edge not available | 2023-07-23 07:16:52 +02:00
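The fallback in commit 09bb1021dd can be expressed as a try/except around the import. The helper below is a hypothetical sketch of that pattern, not the actual KoboldAI code; the module names in the usage comment are the ones the commit message mentions.

```python
import importlib


def import_with_fallback(preferred: str, fallback: str):
    """Import `preferred` if it is installed, otherwise import `fallback`."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        return importlib.import_module(fallback)


# Hypothetical usage matching the commit's intent:
# hf = import_with_fallback("hf_bleeding_edge", "transformers")
```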
Henk | 7a5d813b92 | Reimplement HF workaround only for llama | 2023-07-22 16:59:49 +02:00
Henk | 8dd7b93a6c | HF's workaround breaks stuff | 2023-07-22 16:29:55 +02:00
Henk | fa9d17b3d3 | HF 4.31 | 2023-07-22 15:25:14 +02:00
somebody | fef42a6273 | API: Fix loading | 2023-07-19 11:52:39 -05:00
Nick Perez | 9581e51476 | feature(load model): select control for quantization level | 2023-07-19 07:58:12 -04:00
0cc4m | 7516ecf00d | Merge upstream changes, fix conflict | 2023-07-19 07:02:29 +02:00
Nick Perez | 0142913060 | 8 bit toggle, fix for broken toggle values | 2023-07-18 23:29:38 -04:00
Henk | 22e7baec52 | Permit CPU layers on 4-bit (Worse than GGML) | 2023-07-18 21:44:34 +02:00
Henk | 5bbcdc47da | 4-bit on Colab | 2023-07-18 01:48:01 +02:00
henk717 | da9226fba5 | Merge pull request #401 from ebolam/Model_Plugins | 2023-07-18 01:19:43 +02:00
    Save the 4-bit flag to the model settings.
somebody | 23b95343bd | Patches: Make lazyload work on quantized | 2023-07-17 16:47:31 -05:00
    i wanna watch youtube while my model is loading without locking up my system >:(
ebolam | b9ee6e336a | Save the 4-bit flag to the model settings. | 2023-07-17 09:50:03 -04:00
Alephrin | 145a43a000 | Removed extra load_in_4bit. | 2023-07-17 04:53:47 -06:00
Alephrin | e9913d657a | Speeds up bnb 4bit with a custom BitsAndBytesConfig | 2023-07-17 04:43:43 -06:00
    With this BitsAndBytesConfig I get about double the speed compared to running without it. (Tested on llama 13B with a 3090)
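Commit e9913d657a does not list its exact parameters here; a 4-bit `BitsAndBytesConfig` of the kind it describes might look like the sketch below. The specific parameter values are assumptions (common choices for faster bitsandbytes 4-bit inference), not values taken from the commit.

```python
import torch
from transformers import BitsAndBytesConfig

# Illustrative 4-bit setup: NF4 quantization, double quantization, and
# fp16 compute. These are plausible values, not the commit's actual ones.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# Passed to the model loader, e.g.:
# AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quant_config)
```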
Henk | 8bef2e5fef | Fixes 16-bit if BnB is not installed | 2023-07-16 02:02:58 +02:00
0cc4m | e78361fc8f | Pull upstream changes, fix conflicts | 2023-07-15 23:01:52 +02:00
Henk | 0622810bc4 | Better way of doing the if statement | 2023-07-15 20:00:29 +02:00
Henk | 23a104a4fe | Only show 4-bit toggle on valid model | 2023-07-15 19:42:26 +02:00
Henk | 71b6e8d6d4 | Fix accidental parameters overwrite | 2023-07-15 19:35:40 +02:00
Henk | c43d60772b | BnB dependency check | 2023-07-15 18:56:13 +02:00
Henk | 160effb9ea | Add 4-bit BnB toggle | 2023-07-15 18:20:10 +02:00
somebody | c2ee30af32 | Add --panic to raise when loading fails | 2023-07-08 14:04:46 -05:00
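The behavior named in commit c2ee30af32 can be sketched as a boolean flag that turns a swallowed loading error into a re-raise. Everything below is a hypothetical reconstruction (the flag name comes from the commit message; `load_model` and the fallback-to-None behavior are illustrative assumptions).

```python
import argparse

parser = argparse.ArgumentParser()
# With --panic set, a failure during model loading re-raises so the
# process stops with a traceback instead of silently recovering.
parser.add_argument("--panic", action="store_true",
                    help="raise instead of recovering when model loading fails")


def load_model(loader, panic: bool):
    """Call `loader()`; on failure either re-raise (panic) or return None."""
    try:
        return loader()
    except Exception:
        if panic:
            raise
        return None  # hypothetical recovery path
```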
somebody | 0012158eac | Remove old | 2023-06-21 16:58:59 -05:00
somebody | 6bdcf2645e | Merge branch 'united' of https://github.com/henk717/KoboldAI into accelerate-offloading | 2023-06-21 16:58:39 -05:00
somebody | 0052ad401a | Basic breakmodel ui support | 2023-06-21 13:57:32 -05:00
    Seems to work
0cc4m | 05a0bfe6c4 | Don't show HF support if no HF model files are found | 2023-06-04 09:44:28 +02:00
0cc4m | eace95cc72 | Pull upstream changes, fix conflict | 2023-06-04 09:06:31 +02:00
ebolam | 339f501600 | Added model_backend_type to allow the current menu to specify a class of backends rather than a specific backend. | 2023-06-02 16:11:40 -04:00
    Added super basic hf backend (testing phase only)
ebolam | 5c4d580aac | Fix for --nobreakmodel forcing CPU | 2023-06-02 12:58:59 -04:00
    Put importing of colab packages into an if function so it doesn't error out
somebody | 24b0b32829 | Maybe works now...? | 2023-05-31 14:31:08 -05:00
0cc4m | e49d35afc9 | Add 4bit plugin | 2023-05-28 22:54:36 +02:00
somebody | ceaefa9f5e | Not quite | 2023-05-28 14:57:45 -05:00
somebody | ed0728188a | More cleaning | 2023-05-28 13:22:32 -05:00
somebody | 6f93150e4d | Work on lazyload | 2023-05-28 12:25:31 -05:00
0cc4m | d71a63fa49 | Merge ebolam's model-plugins branch | 2023-05-28 09:26:13 +02:00
somebody | 1546b9efaa | Hello its breaking breakmodel time | 2023-05-27 16:31:53 -05:00
ebolam | cce5c1932c | Fix for custom model names | 2023-05-26 21:40:39 -04:00
ebolam | 9bd445c2a8 | gpt2 fixed | 2023-05-23 20:33:55 -04:00
ebolam | 9e53bcf676 | Fix for breakmodel loading to CPU when set to GPU | 2023-05-22 20:24:57 -04:00
ebolam | 06f59a7b7b | Moved model backends to separate folders | 2023-05-18 20:14:33 -04:00
    added some model backend settings save/load