biscober
35f908e147
Update install_requirements.bat (#7)
* Update install_requirements.bat
move command to dismount temp B drive to after pip install command which requires B drive to still be mounted
* Update install_requirements.bat
cmd /k not necessary
* Update install_requirements.bat
add quotes (probably not required but w/e)
2023-04-11 04:37:48 +02:00
0cc4m
687d107d20
Update README, remove steps that are no longer required
2023-04-10 22:46:12 +02:00
0cc4m
b628aec719
Automatic installation of the quant_cuda module during install_requirements
Kepler (K40+) and Maxwell support
2023-04-10 22:37:16 +02:00
0cc4m
7efd314428
Improve guide
2023-04-07 20:10:24 +02:00
0cc4m
ccf34a5edc
Fix merge issues with upstream, merge changes
2023-04-07 19:51:07 +02:00
0cc4m
636c4e5a52
Update gptq repo
2023-04-07 11:48:57 +02:00
0cc4m
40092cc9fa
Improve guide formatting
2023-04-05 21:49:13 +02:00
0cc4m
8b4375307c
Update file formatting section in guide
2023-04-05 21:10:40 +02:00
0cc4m
e4f8a9344c
Merge pull request #1 from Digitous/patch-1
Add install instructions
2023-04-05 21:08:14 +02:00
Henk
80e4b9e536
Merge branch 'main' into united
2023-04-05 00:22:30 +02:00
Henk
4b71da1714
Horde settings in the UI
2023-04-04 17:20:43 +02:00
0cc4m
ce6761e744
Fix issue causing expected scalar type Float but found Half RuntimeErrors
2023-04-04 07:46:53 +02:00
Henk
8bf533da9a
Pin Accelerate Version
2023-04-04 01:47:59 +02:00
0cc4m
b9df9b6f59
Improve CPU offloading speed significantly when offloading less than half of the layers
2023-04-03 20:27:17 +02:00
0cc4m
5abdecad2c
Merge pull request #5 from 0cc4m/cpu-offload-1
CPU Offloading Support
2023-04-03 06:52:48 +02:00
0cc4m
ec4177a6d6
Remove cudatoolkit-dev and gcc/gxx 9 from conda env because they didn't resolve on Windows
2023-04-03 06:50:36 +02:00
0cc4m
c8d00b7a10
Add CPU offloading support for GPT-NeoX, GPT-J and OPT
2023-04-02 18:36:31 +02:00
0cc4m
e742083703
Fix multi-gpu-offloading
2023-04-02 11:17:29 +02:00
0cc4m
2729b77640
Add offload.py adapted from llama_inference_offload.py, with multi-gpu support and some improvements. Not yet functional, and still just supports Llama
2023-04-02 10:32:19 +02:00
Henk
4a8b099888
Model loading fix
2023-04-02 00:29:56 +02:00
0cc4m
110f8229c5
Add cudatoolkit-dev for compilation, compatible gcc 9 and update transformers to fix error in gptq
2023-04-01 21:33:05 +02:00
0cc4m
bf0c999412
Update GPTQ to support AMD
2023-04-01 14:19:51 +02:00
0cc4m
d3a5ca6505
Update gptq submodule to latest
2023-04-01 08:52:08 +00:00
0cc4m
6eae457479
Fix 4bit groupsize param letter
Use g instead of b for groupsize name, for example 4bit-128g.safetensors
2023-03-31 15:36:03 +02:00
henk717
72b4669563
Fix the chex dependency
2023-03-30 23:41:35 +02:00
0cc4m
aa2292b3a4
Enable multi-gpu support
2023-03-30 19:40:49 +02:00
0cc4m
61b13604b6
Fix bug in 4-bit load fallback
2023-03-30 10:57:04 +02:00
henk717
943d0fe68a
Merge branch 'KoboldAI:main' into united
2023-03-30 00:51:17 +02:00
henk717
ab779efe0e
Merge pull request #276 from YellowRoseCx/stable-branch
Update README and remove unavailable model from gpu.ipynb
2023-03-30 00:50:15 +02:00
YellowRoseCx
3c48a77a52
Update README.md
changed Colab GPU models listed to their higher quality counterparts
2023-03-29 17:44:44 -05:00
YellowRoseCx
f826930c02
Update GPU.ipynb
removed litv2-6B-rev3
2023-03-29 17:41:01 -05:00
0cc4m
9d0477f5f7
Fix bug where it picks old model despite new one available
2023-03-29 22:05:44 +00:00
0cc4m
73d5ec0e5d
Pull latest gptq-changes
2023-03-29 20:07:26 +00:00
0cc4m
a0bc770426
Add basic groupsize support
Write groupsize into filename, for example 4bit-128b.safetensors for groupsize 128
2023-03-29 19:49:05 +00:00
0cc4m
f6f7687cc0
Add 4bit safetensor support, improve loading code
2023-03-29 14:47:59 +00:00
0cc4m
8d008b87a6
Add OPT support
2023-03-29 13:27:11 +00:00
Digitous
e698f22706
Update README.md
2023-03-28 19:14:46 -04:00
0cc4m
ef6fe680a9
Fix high VRAM usage caused by workaround for scalar type error
2023-03-28 06:30:02 +00:00
henk717
66264d38c4
Add Mixes
2023-03-28 00:23:10 +02:00
0cc4m
0f1fc46078
Fix errors during inference
2023-03-27 21:30:43 +00:00
0cc4m
d1a2005a27
Add support for old and new 4-bit format. Old one needs 4bit-old.pt file to launch
2023-03-27 20:45:21 +00:00
0cc4m
2e7a8a1a66
Adapt KoboldAI to latest gptq changes
2023-03-27 04:48:21 +00:00
henk717
37c3fd00b9
Merge pull request #315 from jojorne/jojorne-patch-enable-renaming-deleting-wi-root-folder
Enable renaming/deleting wi root folder by creating a new one
2023-03-26 16:30:41 +02:00
henk717
bbb554efd3
Merge branch 'KoboldAI:main' into united
2023-03-26 01:45:52 +01:00
0cc4m
9dcba38978
Pin transformers to a working Llama-compatible version
2023-03-24 19:07:28 +00:00
0cc4m
026eb3205e
Fix 4-bit loading error when not loading in 4-bit
2023-03-22 22:12:06 +00:00
0cc4m
8941428c66
Fix Kobold loading to CPU in 4-bit, causing CUDA ASSERT error
2023-03-22 06:22:34 +00:00
0cc4m
c7edc764b9
Fix llama loading
2023-03-21 21:58:31 +00:00
0cc4m
ecd065a881
Overhaul 4-bit support to load with a toggle
2023-03-21 21:40:59 +00:00
0cc4m
4cfc1219d4
Add gptq as submodule
2023-03-20 19:13:46 +00:00