KoboldAI-Client

Author	SHA1	Message	Date
Gnome Ann	95aff61781	Don't pin CPU layers after running out of pinned memory	2021-11-26 10:31:15 -05:00
Gnome Ann	25c9be5d02	Breakmodel support for GPTJModel	2021-11-25 18:09:16 -05:00
Gnome Ann	f8bcc3411b	In breakmodel mode, move layers to GPU as soon as model loads, rather than during the first generation.	2021-11-25 11:44:41 -05:00
Gnome Ann	3649ba9fa4	Breakmodel's CUDA stream should be on primary device	2021-10-06 12:04:56 -04:00
Gnome Ann	f9e6a6da17	Slightly increased performance in breakmodel mode. Commit `a283d34b27` made breakmodel mode slower. Performance has been restored to how it was before that commit.	2021-10-05 10:25:06 -04:00
Gnome Ann	a283d34b27	Multiple GPU support	2021-10-05 09:38:57 -04:00
Gnome Ann	0937bb33e7	Clarify licensing for breakmodel.py	2021-10-02 12:19:37 -04:00
Gnome Ann	4d9eab3785	K80 test	2021-09-23 20:57:18 -04:00
Gnome Ann	b5c28f4e07	Fix for when breakmodel layers is 0	2021-08-28 02:19:51 -04:00
Gnome Ann	8bfcf86a8b	Fix for non-rotary models without "rotary" in config.json	2021-08-20 13:00:53 -04:00
Gnome Ann	eef0db8dee	Specifically import torch.cuda.comm in breakmodel.py	2021-08-20 10:47:54 -04:00
Gnome Ann	b1c13f832a	Implement arrmansa's low VRAM patch	2021-08-20 10:25:03 -04:00