Gnome Ann | 95aff61781 | Don't pin CPU layers after running out of pinned memory | 2021-11-26 10:31:15 -05:00
Gnome Ann | 25c9be5d02 | Breakmodel support for GPTJModel | 2021-11-25 18:09:16 -05:00
Gnome Ann | f8bcc3411b | In breakmodel mode, move layers to GPU as soon as model loads, rather than during the first generation. | 2021-11-25 11:44:41 -05:00
Gnome Ann | 3649ba9fa4 | Breakmodel's CUDA stream should be on primary device | 2021-10-06 12:04:56 -04:00
Gnome Ann | f9e6a6da17 | Slightly increased performance in breakmodel mode. Commit a283d34b27 made breakmodel mode slower. Performance has been restored to how it was before that commit. | 2021-10-05 10:25:06 -04:00
Gnome Ann | a283d34b27 | Multiple GPU support | 2021-10-05 09:38:57 -04:00
Gnome Ann | 0937bb33e7 | Clarify licensing for breakmodel.py | 2021-10-02 12:19:37 -04:00
Gnome Ann | 4d9eab3785 | K80 test | 2021-09-23 20:57:18 -04:00
Gnome Ann | b5c28f4e07 | Fix for when breakmodel layers is 0 | 2021-08-28 02:19:51 -04:00
Gnome Ann | 8bfcf86a8b | Fix for non-rotary models without "rotary" in config.json | 2021-08-20 13:00:53 -04:00
Gnome Ann | eef0db8dee | Specifically import torch.cuda.comm in breakmodel.py | 2021-08-20 10:47:54 -04:00
Gnome Ann | b1c13f832a | Implement arrmansa's low VRAM patch | 2021-08-20 10:25:03 -04:00