Commit Graph

4399 Commits

Author | SHA1 | Message | Date
0cc4m | 49740aa5ab | Fix ntk alpha | 2023-07-23 21:56:48 +02:00
0cc4m | 31a984aa3d | Automatically install exllama module | 2023-07-23 07:33:51 +02:00
0cc4m | a9aa04fd1b | Merge remote-tracking branch 'upstream/united' into 4bit-plugin | 2023-07-23 07:18:58 +02:00
0cc4m | 09bb1021dd | Fallback to transformers if hf_bleeding_edge not available | 2023-07-23 07:16:52 +02:00
0cc4m | 748e5ef318 | Add sliders for exllama context size and related methods | 2023-07-23 07:11:28 +02:00
Henk | 7a5d813b92 | Reimplement HF workaround only for llama | 2023-07-22 16:59:49 +02:00
Henk | 8dd7b93a6c | HF's workaround breaks stuff | 2023-07-22 16:29:55 +02:00
Henk | fa9d17b3d3 | HF 4.31 | 2023-07-22 15:25:14 +02:00
Henk | 7823da564e | Link to Lite | 2023-07-22 04:04:17 +02:00
henk717 | 83e5c29260 | Merge pull request #413 from one-some/bug-hunt: Fix WI comment editing | 2023-07-22 00:34:46 +02:00
somebody | e68972a270 | Fix WI comments | 2023-07-21 16:14:13 -05:00
Henk | a17d7aae60 | Easier english | 2023-07-21 19:42:49 +02:00
Henk | da9b54ec1c | Don't show API link during load | 2023-07-21 19:31:38 +02:00
Henk | 432cdc9a08 | Fix models with good pad tokens | 2023-07-21 16:39:58 +02:00
Henk | ec745d8b80 | Dont accidentally block pad tokens | 2023-07-21 16:25:32 +02:00
henk717 | dc4404f29c | Merge pull request #409 from nkpz/bnb8bit: Configurable quantization level, fix for broken toggles in model settings | 2023-07-19 14:22:44 +02:00
Nick Perez | 9581e51476 | feature(load model): select control for quantization level | 2023-07-19 07:58:12 -04:00
0cc4m | 58908ab846 | Revert aiserver.py changes | 2023-07-19 07:14:03 +02:00
0cc4m | 19f511dc9f | Load GPTQ module from GPTQ repo docs | 2023-07-19 07:12:37 +02:00
0cc4m | 1c5da2bbf3 | Move pip docs from KoboldAI into GPTQ repo | 2023-07-19 07:08:39 +02:00
0cc4m | 7516ecf00d | Merge upstream changes, fix conflict | 2023-07-19 07:02:29 +02:00
0cc4m | c84d063be8 | Revert settings changes | 2023-07-19 07:01:11 +02:00
0cc4m | 9aa6c5fbbf | Merge upstream changes, fix conflict, adapt backends to changes | 2023-07-19 06:56:09 +02:00
Nick Perez | 0142913060 | 8 bit toggle, fix for broken toggle values | 2023-07-18 23:29:38 -04:00
Henk | 22e7baec52 | Permit CPU layers on 4-bit (Worse than GGML) | 2023-07-18 21:44:34 +02:00
henk717 | 5f2600d338 | Merge pull request #406 from ebolam/Model_Plugins: Clarified message on what's required for model backend parameters | 2023-07-18 02:42:23 +02:00
ebolam | 66192efdb7 | Clarified message on what's required for model backend parameters in the command line | 2023-07-17 20:30:41 -04:00
Henk | 5bbcdc47da | 4-bit on Colab | 2023-07-18 01:48:01 +02:00
henk717 | da9226fba5 | Merge pull request #401 from ebolam/Model_Plugins: Save the 4-bit flag to the model settings. | 2023-07-18 01:19:43 +02:00
henk717 | fee79928c8 | Merge pull request #404 from one-some/united: Delete basic 4bit | 2023-07-18 01:19:14 +02:00
somebody | 1637760fa1 | Delete basic 4bit: And add code to handle dangling __pycache__s | 2023-07-17 18:16:03 -05:00
henk717 | 5c3a8e295a | Merge pull request #402 from one-some/united: Patches: Make lazyload work with quantization | 2023-07-17 23:53:14 +02:00
somebody | 23b95343bd | Patches: Make lazyload work on quantized: i wanna watch youtube while my model is loading without locking up my system >:( | 2023-07-17 16:47:31 -05:00
ebolam | 4acf9235db | Merge branch 'Model_Plugins' of https://github.com/ebolam/KoboldAI into Model_Plugins | 2023-07-17 09:52:10 -04:00
ebolam | b9ee6e336a | Save the 4-bit flag to the model settings. | 2023-07-17 09:50:03 -04:00
ebolam | 66377fc09e | Save the 4-bit flag to the model settings. | 2023-07-17 09:48:01 -04:00
henk717 | e8d84bb787 | Merge pull request #400 from ebolam/Model_Plugins: missed the elif | 2023-07-17 15:16:34 +02:00
ebolam | eafb699bbf | missed the elif | 2023-07-17 09:12:45 -04:00
henk717 | a3b0c6dd60 | Merge pull request #399 from ebolam/Model_Plugins: Update to the upload_file function | 2023-07-17 15:11:40 +02:00
ebolam | bfb26ab55d | Ban uploading to the modeling directory | 2023-07-17 09:05:22 -04:00
ebolam | 52e061d0f9 | Fix for potential jailbreak | 2023-07-17 08:55:23 -04:00
henk717 | f7561044c6 | Merge pull request #398 from Alephrin/patch-1: Speeds up bnb 4bit with a custom BitsAndBytesConfig | 2023-07-17 13:22:44 +02:00
Alephrin | 145a43a000 | Removed extra load_in_4bit. | 2023-07-17 04:53:47 -06:00
Alephrin | e9913d657a | Speeds up bnb 4bit with a custom BitsAndBytesConfig: With this BitsAndBytesConfig I get about double the speed compared to running without it. (Tested on llama 13B with a 3090) | 2023-07-17 04:43:43 -06:00
Henk | 6d7e9e6771 | Post4 BnB for Linux | 2023-07-16 02:13:42 +02:00
Henk | 8bef2e5fef | Fixes 16-bit if BnB is not installed | 2023-07-16 02:02:58 +02:00
henk717 | fac006125e | Merge pull request #397 from ebolam/Model_Plugins: Fixes for model backend UI | 2023-07-15 23:58:24 +02:00
0cc4m | e78361fc8f | Pull upstream changes, fix conflicts | 2023-07-15 23:01:52 +02:00
0cc4m | ed7ad00b59 | Move GPTQ readme changes to separate file | 2023-07-15 22:55:17 +02:00
ebolam | 869bcadd03 | Fix for toggles showing as check boxes in model loading: Fix for resubmit_model_info loosing selected model backend | 2023-07-15 15:48:31 -04:00