Commit Graph

12 Commits

Author  SHA1        Message                                                        Date
0cc4m   748e5ef318  Add sliders for exllama context size and related methods      2023-07-23 07:11:28 +02:00
0cc4m   9aa6c5fbbf  Merge upstream changes, fix conflict, adapt backends to changes  2023-07-19 06:56:09 +02:00
0cc4m   0e4b6571d5  Fix non-tuple return from gptq function                        2023-06-28 22:50:04 +02:00
0cc4m   c753671ac1  Add exllama superhot positional embeddings compression support  2023-06-27 07:39:37 +02:00
0cc4m   e8741a1b57  Disable scaled_dot_product_attention if torch version < 2     2023-06-20 09:19:43 +02:00
0cc4m   a191855b37  Track token generation progress                                2023-06-19 19:14:26 +02:00
0cc4m   e874f0c1c2  Add token streaming support for exllama                        2023-06-19 19:14:26 +02:00
0cc4m   0c7eaefb1a  Fix AMD ROCm exllama inference                                 2023-06-13 10:11:29 +02:00
0cc4m   47b371b9d3  Fix multigpu                                                   2023-06-06 19:51:38 +02:00
0cc4m   39dfb18455  Replace exllama samplers with kobold's inbuilt ones            2023-06-06 19:21:34 +02:00
0cc4m   94520d5c80  Fix exllama model unload                                       2023-06-05 18:43:57 +02:00
0cc4m   b35f61e987  Basic exllama plugin                                           2023-06-04 15:40:12 +02:00