748e5ef318  2023-07-23 07:11:28 +02:00  0cc4m  Add sliders for exllama context size and related methods
9aa6c5fbbf  2023-07-19 06:56:09 +02:00  0cc4m  Merge upstream changes, fix conflict, adapt backends to changes
0e4b6571d5  2023-06-28 22:50:04 +02:00  0cc4m  Fix non-tuple return from gptq function
c753671ac1  2023-06-27 07:39:37 +02:00  0cc4m  Add exllama superhot positional embeddings compression support
e8741a1b57  2023-06-20 09:19:43 +02:00  0cc4m  Disable scaled_dot_product_attention if torch version < 2
a191855b37  2023-06-19 19:14:26 +02:00  0cc4m  Track token generation progress
e874f0c1c2  2023-06-19 19:14:26 +02:00  0cc4m  Add token streaming support for exllama
0c7eaefb1a  2023-06-13 10:11:29 +02:00  0cc4m  Fix AMD ROCm exllama inference
47b371b9d3  2023-06-06 19:51:38 +02:00  0cc4m  Fix multigpu
39dfb18455  2023-06-06 19:21:34 +02:00  0cc4m  Replace exllama samplers with kobold's inbuilt ones
94520d5c80  2023-06-05 18:43:57 +02:00  0cc4m  Fix exllama model unload
b35f61e987  2023-06-04 15:40:12 +02:00  0cc4m  Basic exllama plugin