Author | Commit | Message | Date
0cc4m | 687d107d20 | Update README, remove steps that are no longer required | 2023-04-10 22:46:12 +02:00
0cc4m | b628aec719 | Automatic installation of the quant_cuda module during install_requirements; Kepler (K40+) and Maxwell support | 2023-04-10 22:37:16 +02:00
henk717 | 2385a34098 | Merge pull request #325 from YellowRoseCx/patch-1: Add IP Whitelisting to --host | 2023-04-10 14:08:09 +02:00
somebody | 334c09606b | Fix for tokenizer stuff on pythia | 2023-04-09 18:23:58 -05:00
somebody | 3e8e3a18b0 | Fix for custom gpt2 | 2023-04-09 18:23:52 -05:00
somebody | f73a8bb808 | Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv | 2023-04-09 14:38:09 -05:00
somebody | fedbffd07b | Small fixes: Typos galore! | 2023-04-09 13:35:28 -05:00
0cc4m | 7efd314428 | Improve guide | 2023-04-07 20:10:24 +02:00
0cc4m | ccf34a5edc | Fix merge issues with upstream, merge changes | 2023-04-07 19:51:07 +02:00
0cc4m | 636c4e5a52 | Update gptq repo | 2023-04-07 11:48:57 +02:00
YellowRoseCx | ac98cd6dd1 | add IP_whitelisting to koboldai_settings.py | 2023-04-05 21:27:59 -05:00
YellowRoseCx | 71e5d23a5b | Add IP whitelisting to --host | 2023-04-05 21:23:24 -05:00
0cc4m | 40092cc9fa | Improve guide formatting | 2023-04-05 21:49:13 +02:00
0cc4m | 8b4375307c | Update file formatting section in guide | 2023-04-05 21:10:40 +02:00
0cc4m | e4f8a9344c | Merge pull request #1 from Digitous/patch-1: Add install instructions | 2023-04-05 21:08:14 +02:00
Henk | 80e4b9e536 | Merge branch 'main' into united | 2023-04-05 00:22:30 +02:00
henk717 | 29c2d4b7a6 | Removing Pygmalion from the TPU colab to get it unbanned | 2023-04-04 19:51:18 +02:00
henk717 | fd12214091 | Clean the description of the GPU colab | 2023-04-04 19:40:22 +02:00
henk717 | bb51127bbf | We no longer support Pygmalion on Colab due to Google's Pygmalion ban | 2023-04-04 19:37:15 +02:00
Henk | 4b71da1714 | Horde settings in the UI | 2023-04-04 17:20:43 +02:00
0cc4m | ce6761e744 | Fix issue causing "expected scalar type Float but found Half" RuntimeErrors | 2023-04-04 07:46:53 +02:00
Henk | 8bf533da9a | Pin Accelerate Version | 2023-04-04 01:47:59 +02:00
somebody | 8412f83ce5 | Breakmodel: Fix typo | 2023-04-03 18:41:18 -05:00
0cc4m | b9df9b6f59 | Improve CPU offloading speed significantly when offloading less than half of the layers | 2023-04-03 20:27:17 +02:00
0cc4m | 5abdecad2c | Merge pull request #5 from 0cc4m/cpu-offload-1: CPU Offloading Support | 2023-04-03 06:52:48 +02:00
0cc4m | ec4177a6d6 | Remove cudatoolkit-dev and gcc/gxx 9 from conda env because they didn't resolve on Windows | 2023-04-03 06:50:36 +02:00
somebody | 4230fe4229 | Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv | 2023-04-02 16:41:21 -05:00
somebody | 77f0797b1a | Model fix | 2023-04-02 15:47:52 -05:00
somebody | 9d70646e4d | Lazyload: Safetensors | 2023-04-02 15:40:34 -05:00
0cc4m | c8d00b7a10 | Add CPU offloading support for GPT-NeoX, GPT-J and OPT | 2023-04-02 18:36:31 +02:00
0cc4m | e742083703 | Fix multi-gpu offloading | 2023-04-02 11:17:29 +02:00
0cc4m | 2729b77640 | Add offload.py adapted from llama_inference_offload.py, with multi-gpu support and some improvements. Not yet functional, and still just supports Llama | 2023-04-02 10:32:19 +02:00
Henk | 4a8b099888 | Model loading fix | 2023-04-02 00:29:56 +02:00
0cc4m | 110f8229c5 | Add cudatoolkit-dev and a compatible gcc 9 for compilation, and update transformers to fix an error in gptq | 2023-04-01 21:33:05 +02:00
0cc4m | bf0c999412 | Update GPTQ to support AMD | 2023-04-01 14:19:51 +02:00
0cc4m | d3a5ca6505 | Update gptq submodule to latest | 2023-04-01 08:52:08 +00:00
0cc4m | 6eae457479 | Fix 4bit groupsize param letter: Use g instead of b for groupsize name, for example 4bit-128g.safetensors | 2023-03-31 15:36:03 +02:00
henk717 | 72b4669563 | Fix the chex dependency | 2023-03-30 23:41:35 +02:00
0cc4m | aa2292b3a4 | Enable multi-gpu support | 2023-03-30 19:40:49 +02:00
0cc4m | 61b13604b6 | Fix bug in 4-bit load fallback | 2023-03-30 10:57:04 +02:00
henk717 | 943d0fe68a | Merge branch 'KoboldAI:main' into united | 2023-03-30 00:51:17 +02:00
henk717 | ab779efe0e | Merge pull request #276 from YellowRoseCx/stable-branch: Update README and remove unavailable model from gpu.ipynb | 2023-03-30 00:50:15 +02:00
YellowRoseCx | 3c48a77a52 | Update README.md: changed Colab GPU models listed to their higher-quality counterparts | 2023-03-29 17:44:44 -05:00
YellowRoseCx | f826930c02 | Update GPU.ipynb: removed litv2-6B-rev3 | 2023-03-29 17:41:01 -05:00
0cc4m | 9d0477f5f7 | Fix bug where it picks old model despite new one available | 2023-03-29 22:05:44 +00:00
0cc4m | 73d5ec0e5d | Pull latest gptq changes | 2023-03-29 20:07:26 +00:00
0cc4m | a0bc770426 | Add basic groupsize support: Write groupsize into filename, for example 4bit-128b.safetensors for groupsize 128 | 2023-03-29 19:49:05 +00:00
0cc4m | f6f7687cc0 | Add 4bit safetensor support, improve loading code | 2023-03-29 14:47:59 +00:00
0cc4m | 8d008b87a6 | Add OPT support | 2023-03-29 13:27:11 +00:00
Digitous | e698f22706 | Update README.md | 2023-03-28 19:14:46 -04:00