0cc4m | 09bb1021dd | 2023-07-23 07:16:52 +02:00
    Fallback to transformers if hf_bleeding_edge not available
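
The fallback here is an ordinary guarded import; a minimal sketch of the pattern, assuming hf_bleeding_edge re-exports the same class names as stock transformers:

```python
# Prefer the hf_bleeding_edge wrapper when it is installed;
# otherwise fall back to stock transformers (assumes matching class names).
try:
    from hf_bleeding_edge import AutoModelForCausalLM
except ImportError:
    from transformers import AutoModelForCausalLM
```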

0cc4m | 748e5ef318 | 2023-07-23 07:11:28 +02:00
    Add sliders for exllama context size and related methods

0cc4m | 7516ecf00d | 2023-07-19 07:02:29 +02:00
    Merge upstream changes, fix conflict

0cc4m | 9aa6c5fbbf | 2023-07-19 06:56:09 +02:00
    Merge upstream changes, fix conflict, adapt backends to changes

Henk | 22e7baec52 | 2023-07-18 21:44:34 +02:00
    Permit CPU layers on 4-bit (Worse than GGML)

Henk | 5bbcdc47da | 2023-07-18 01:48:01 +02:00
    4-bit on Colab

henk717 | da9226fba5 | 2023-07-18 01:19:43 +02:00
    Merge pull request #401 from ebolam/Model_Plugins
    Save the 4-bit flag to the model settings.

somebody | 1637760fa1 | 2023-07-17 18:16:03 -05:00
    Delete basic 4bit
    And add code to handle dangling __pycache__s

somebody | 23b95343bd | 2023-07-17 16:47:31 -05:00
    Patches: Make lazyload work on quantized
    i wanna watch youtube while my model is loading without locking up my system >:(
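
KoboldAI's lazy loader exists so the machine stays usable while weights stream in; this commit extends it to quantized models. As a general illustration of deferred weight allocation (not this repo's actual patch), accelerate can build a model skeleton on the meta device without spending any RAM on weights:

```python
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

# Example model id; any causal LM config works for the illustration.
config = AutoConfig.from_pretrained("huggyllama/llama-13b")

# Build the module tree on the meta device: no memory is allocated for
# weights, so the system stays responsive while real tensors load later.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

assert model.lm_head.weight.device.type == "meta"
```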

ebolam | b9ee6e336a | 2023-07-17 09:50:03 -04:00
    Save the 4-bit flag to the model settings.

Alephrin | 145a43a000 | 2023-07-17 04:53:47 -06:00
    Removed extra load_in_4bit.

Alephrin | e9913d657a | 2023-07-17 04:43:43 -06:00
    Speeds up bnb 4bit with a custom BitsAndBytesConfig
    With this BitsAndBytesConfig I get about double the speed compared to running without it. (Tested on llama 13B with a 3090)
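
The exact settings aren't in the log, but the usual levers for faster bnb 4-bit inference are the compute dtype and quant type; a sketch with those values as assumptions (fp16 compute instead of the fp32 default is what typically buys the large speedup):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed settings: fp16 compute for speed, nf4 + double quantization
# to trim memory. The commit does not record the author's exact values.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b",  # example model; the commit tested llama 13B
    quantization_config=quant_config,
    device_map="auto",
)
```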

Henk | 8bef2e5fef | 2023-07-16 02:02:58 +02:00
    Fixes 16-bit if BnB is not installed

0cc4m | e78361fc8f | 2023-07-15 23:01:52 +02:00
    Pull upstream changes, fix conflicts

Henk | 0622810bc4 | 2023-07-15 20:00:29 +02:00
    Better way of doing the if statement

Henk | 23a104a4fe | 2023-07-15 19:42:26 +02:00
    Only show 4-bit toggle on valid model

Henk | 71b6e8d6d4 | 2023-07-15 19:35:40 +02:00
    Fix accidental parameters overwrite

Henk | c43d60772b | 2023-07-15 18:56:13 +02:00
    BnB dependency check
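
A dependency check like this can be as simple as probing whether the module is importable, so the 4-bit UI options are only exposed when bitsandbytes is actually present; a sketch, not necessarily the commit's exact code:

```python
import importlib.util

# bitsandbytes is an optional dependency; gate 4-bit features on it.
HAS_BNB = importlib.util.find_spec("bitsandbytes") is not None
```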

Henk | 160effb9ea | 2023-07-15 18:20:10 +02:00
    Add 4-bit BnB toggle

Henk | 2c50d5d092 | 2023-07-15 14:14:06 +02:00
    Don't ruin breakmodel

Henk | 1f045110a4 | 2023-07-15 02:49:31 +02:00
    Basic 4-bit backend

onesome | afa8766ea6 | 2023-07-14 18:01:18 -05:00
    Add is_valid

somebody | f67cb7fa05 | 2023-07-12 18:36:30 -05:00
    Make basic hf independent from hf

somebody | d17ce8461d | 2023-07-12 17:27:48 -05:00
    Use device_map="auto"
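
device_map="auto" hands layer placement to accelerate, which plans a per-module split across GPUs, CPU RAM, and disk instead of the backend doing a manual breakmodel-style split; roughly:

```python
from transformers import AutoModelForCausalLM

# accelerate inspects available VRAM/RAM and assigns each module a device;
# layers that don't fit on the GPU spill to CPU (and then disk) automatically.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b",  # example model
    device_map="auto",
)
print(model.hf_device_map)  # e.g. {"model.embed_tokens": 0, ..., "lm_head": "cpu"}
```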

somebody | 60473d4c23 | 2023-07-12 17:16:05 -05:00
    Fix and add some documentation to basic hf backend

onesome | 8077d6c3f9 | 2023-07-12 03:22:43 -05:00
    Self-contained sampler patch (Don't merge)
    Completely untested 3:00 AM code; beware! I will test and add more documentation tomorrow.

somebody | 20b4b4bcef | 2023-07-08 17:12:16 -05:00
    Add basic hf backend

somebody | 3928d86339 | 2023-07-08 14:36:45 -05:00
    Fall back to unpatched HF

somebody | c2ee30af32 | 2023-07-08 14:04:46 -05:00
    Add --panic to raise when loading fails
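
The flag turns load-time fallbacks into hard failures so the real traceback surfaces during debugging; a sketch of the pattern with hypothetical loader names, not the repo's actual code:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--panic",
    action="store_true",
    help="Re-raise exceptions during model loading instead of falling back.",
)
args = parser.parse_args()

def load_model(loader, fallback, path):
    """Try the preferred loader; with --panic, fail loudly instead of falling back."""
    try:
        return loader(path)
    except Exception:
        if args.panic:
            raise  # surface the real traceback instead of masking it
        return fallback(path)
```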

Henk | 16240878bc | 2023-07-04 20:42:29 +02:00
    Restore --peft support

somebody | bce1a907e5 | 2023-07-03 19:36:31 -05:00
    Update aux device to depend on primary device

somebody | 6f7e6422ef | 2023-07-03 19:04:48 -05:00
    Actually get correct primary device

somebody | 59c731f805 | 2023-07-03 18:37:48 -05:00
    Fix static primary_device
    and some small cleanup

Henk | 81e72329af | 2023-07-02 21:50:23 +02:00
    CPU fixes

0cc4m | 0e4b6571d5 | 2023-06-28 22:50:04 +02:00
    Fix non-tuple return from gptq function

0cc4m | c753671ac1 | 2023-06-27 07:39:37 +02:00
    Add exllama superhot positional embeddings compression support
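
SuperHOT-style context extension is linear position interpolation: rotary position indices are divided by a compression factor so, for example, an 8K sequence maps into the 2K positions the base model was trained on. A generic sketch of the math, independent of exllama's internals:

```python
import torch

def rope_angles(seq_len, dim, compress_pos_emb=4.0, base=10000.0):
    """RoPE angles with SuperHOT-style linear position interpolation.

    compress_pos_emb=4.0 squeezes positions 0..8191 into the 0..2047
    range that a 2K-context base model was trained on.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(seq_len).float() / compress_pos_emb
    return torch.outer(positions, inv_freq)  # shape: (seq_len, dim // 2)
```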

Henk | 1da4580e8b | 2023-06-22 07:07:02 +02:00
    Remove wrong usegpu behavior

somebody | 5ee20bd7d6 | 2023-06-21 21:18:43 -05:00
    Fix for CPU loading

somebody | b81f61b820 | 2023-06-21 18:35:56 -05:00
    Clean debug

somebody | 947bcc58e4 | 2023-06-21 17:33:14 -05:00
    Experiments

somebody | 0012158eac | 2023-06-21 16:58:59 -05:00
    Remove old

somebody | 6bdcf2645e | 2023-06-21 16:58:39 -05:00
    Merge branch 'united' of https://github.com/henk717/KoboldAI into accelerate-offloading

somebody | c40649a74e | 2023-06-21 16:54:41 -05:00
    Probably fix f32

somebody | aca2b532d7 | 2023-06-21 14:15:38 -05:00
    Remove debug

somebody | 5f224e1366 | 2023-06-21 14:13:14 -05:00
    Restore choice of lazyload or not

somebody | 0052ad401a | 2023-06-21 13:57:32 -05:00
    Basic breakmodel ui support
    Seems to work

Henk | bbecdaeedb | 2023-06-21 17:08:45 +02:00
    Silently disable MTJ when Jax is not installed

0cc4m | e8741a1b57 | 2023-06-20 09:19:43 +02:00
    Disable scaled_dot_product_attention if torch version < 2
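
torch.nn.functional.scaled_dot_product_attention only exists from PyTorch 2.0 onward, so its use has to be gated; probing for the function directly is one way, sketched here:

```python
import torch

# scaled_dot_product_attention was added in torch 2.0; probe for it
# directly rather than parsing version strings.
HAS_SDPA = hasattr(torch.nn.functional, "scaled_dot_product_attention")

def attention(q, k, v):
    if HAS_SDPA:
        return torch.nn.functional.scaled_dot_product_attention(q, k, v)
    # Fallback for torch < 2: naive softmax attention.
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v
```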

0cc4m | a191855b37 | 2023-06-19 19:14:26 +02:00
    Track token generation progress

0cc4m | e874f0c1c2 | 2023-06-19 19:14:26 +02:00
    Add token streaming support for exllama
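
These two commits round out streaming: tokens go to the UI as they are sampled rather than after the whole completion finishes, with a progress callback alongside. A generic sketch of the loop shape (the generate_step callable is hypothetical, standing in for exllama's sampler):

```python
def stream_tokens(generate_step, max_new_tokens, on_progress=None):
    """Yield decoded tokens one at a time as they are sampled.

    generate_step: hypothetical callable returning (token_text, is_eos)
                   per step; stands in for the backend's sampling call.
    """
    for i in range(max_new_tokens):
        token_text, is_eos = generate_step()
        if on_progress:
            on_progress(i + 1, max_new_tokens)  # track generation progress
        yield token_text  # hand the token to the UI immediately
        if is_eos:
            break
```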