Llama
2c48e05f7c
Add exllama dependency back to requirements.
2023-08-28 09:52:31 -07:00
Llama
6151cbd053
Merge branch 'united' into merge/united-exllama
2023-08-28 09:32:19 -07:00
Llama
5229987ab7
Merge pull request #66 from pi6am/feat/exllama-config
...
Modify exllama to load unrenamed gptq quantized models
2023-08-28 00:09:50 -07:00
Llama
554af7b175
Modify exllama to load unrenamed gptq quantized models
...
Read config.json and enable exllama loading if the model has a
`quantization_config` with `quant_method` of `gptq`. Note that this
implementation is limited and only supports model.safetensors.
That said, this supports loading popular gptq quantized models
without renaming or symlinking the model file.
2023-08-27 23:56:02 -07:00
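A minimal sketch of the detection described in this commit, assuming the standard Hugging Face config.json layout; the helper name `uses_exllama` is hypothetical and the actual KoboldAI loader may differ:

```python
import json
import os

def uses_exllama(model_path: str) -> bool:
    """Hypothetical check: enable exllama loading for gptq-quantized models."""
    config_file = os.path.join(model_path, "config.json")
    if not os.path.isfile(config_file):
        return False
    with open(config_file, "r", encoding="utf-8") as handle:
        config = json.load(handle)
    quant_config = config.get("quantization_config") or {}
    if quant_config.get("quant_method") != "gptq":
        return False
    # The limited implementation only supports a single model.safetensors file,
    # which still covers popular gptq models without renaming or symlinking.
    return os.path.isfile(os.path.join(model_path, "model.safetensors"))
```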
Llama
812df5ea56
Merge pull request #65 from pi6am/feat/exllama-badwords
...
Add the eos token to exllama bad words.
2023-08-27 17:03:25 -07:00
Llama
08ff7c138c
Add the eos token to exllama bad words.
...
The bos token was already hardcoded as a bad word id.
Store badwords in a list and iterate over them during generation.
Add the Llama eos token to the list of bad words.
Also support "single line mode", which adds newline (13) to badwords.
2023-08-27 16:34:52 -07:00
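A rough illustration of the bad-words handling this commit describes, assuming the usual Llama token ids (bos=1, eos=2, newline=13); function names and wiring are hypothetical:

```python
def build_bad_words_ids(single_line_mode: bool) -> list[int]:
    """Collect banned token ids: bos (already hardcoded), the Llama eos token,
    and the newline token when single line mode is enabled."""
    bad_words_ids = [1]           # bos token, previously hardcoded as a bad word
    bad_words_ids.append(2)       # Llama eos token
    if single_line_mode:
        bad_words_ids.append(13)  # "single line mode" also bans the newline token
    return bad_words_ids

def mask_bad_words(logits, bad_words_ids):
    """Iterate over the bad words during generation and suppress them."""
    for token_id in bad_words_ids:
        logits[token_id] = float("-inf")
    return logits
```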
Llama
0d150e412e
Merge pull request #64 from pi6am/fix/multinomial-workaround
...
Resample to work around a bug in torch.multinomial
2023-08-26 22:42:21 -07:00
Llama
b7e38b4757
Resample to work around a bug in torch.multinomial
...
There is a bug in PyTorch 2.0.1 that allows torch.multinomial to
sometimes choose elements that have zero probability. Since
this is uncommon we can continue to use torch.multinomial as
long as we verify that the results are valid. If they aren't,
try again until the probability of each selected token is positive.
2023-08-26 22:26:26 -07:00
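A sketch of the resampling workaround described above; the helper name is hypothetical and the real code lives in the exllama generation path:

```python
import torch

def multinomial_workaround(probs: torch.Tensor, num_samples: int = 1) -> torch.Tensor:
    """Sample with torch.multinomial, retrying whenever a zero-probability
    element is returned (a rare PyTorch 2.0.1 bug)."""
    while True:
        samples = torch.multinomial(probs, num_samples)
        # Accept the draw only if every selected token has positive probability.
        if bool((probs.gather(-1, samples) > 0).all()):
            return samples
```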
Llama
b1895de518
Merge pull request #63 from pi6am/feat/exllama-stoppers
...
Add stopper hooks support to exllama
2023-08-22 23:14:00 -07:00
Llama
b96d5d8646
Add stopper hooks support to exllama
2023-08-22 23:06:16 -07:00
0cc4m
22fd49937a
Merge pull request #62 from pi6am/fix/exllama-eos-space
...
Strip the eos token from exllama generations.
2023-08-22 08:06:02 +02:00
Llama
070cfd339a
Strip the eos token from exllama generations.
...
The end-of-sequence (</s>) token indicates the end of a generation.
When a token sequence containing </s> is decoded, an extra (wrong)
space is inserted at the beginning of the generation. To avoid this,
strip the eos token out of the result before returning it.
The eos token was getting stripped later, so this doesn't change
the output except to avoid the spurious leading space.
2023-08-19 17:40:23 -07:00
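A minimal sketch of the eos stripping this commit describes, assuming the usual Llama eos id of 2; the function name is illustrative:

```python
def strip_eos(token_ids: list[int], eos_token_id: int = 2) -> list[int]:
    """Remove the eos (</s>) token before decoding so the tokenizer does not
    insert a spurious leading space at the start of the generation."""
    return [t for t in token_ids if t != eos_token_id]
```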
Llama
dda5acd5d5
Merge pull request #33 from henk717/united
...
Merge united.
2023-08-18 13:19:24 -07:00
Henk
87934ee393
AutoGPTQ Exllama compile
2023-08-18 21:49:05 +02:00
henk717
3f28503b87
Update huggingface.yml
2023-08-15 22:19:18 +02:00
henk717
1c65528dbf
Fix BnB on Colab
2023-08-14 03:28:29 +02:00
Llama
d8d9890f46
Merge pull request #32 from henk717/united
...
Merge united
2023-08-13 14:10:35 -07:00
Henk
e90903946d
AutoGPTQ updates
2023-08-13 17:36:17 +02:00
Henk
116a88b46c
Better stable diffusion
2023-08-13 16:45:31 +02:00
henk717
89a805a0cc
Merge pull request #411 from one-some/fixing-time
...
Fix most of the API
2023-08-13 13:43:13 +02:00
henk717
dae9a6eb5a
Merge branch 'KoboldAI:main' into united
2023-08-12 02:18:11 +02:00
henk717
ee93fe6e4a
Add model cleaner
2023-08-11 22:39:49 +02:00
Henk
1e87c05e68
Fix discord link
2023-08-11 17:36:41 +02:00
henk717
a9dbe2837e
Merge branch 'KoboldAI:main' into united
2023-08-11 00:36:09 +02:00
henk717
9cb93d6b4c
Add some 13B's for easier beta testing
2023-08-10 23:56:44 +02:00
henk717
2938a9993a
Merge pull request #434 from one-some/united
...
UI: Change mobile aspect ratio threshold from 7/5 to 5/6
2023-08-10 19:57:56 +02:00
henk717
5f1be7c482
Merge pull request #435 from one-some/token-stream-newline-fix
...
UI: Fix token streaming gobbling trailing whitespace
2023-08-10 19:55:50 +02:00
Henk
2628726e1c
Don't use exllama on fail
2023-08-10 19:34:08 +02:00
Henk
9c7ebe3b04
Better AutoGPTQ fallback
2023-08-10 18:10:48 +02:00
Henk
f2d7ef3aca
AutoGPTQ breakmodel
2023-08-10 17:41:31 +02:00
Henk
54addfc234
AutoGPTQ fallback
2023-08-10 17:18:53 +02:00
Henk
1b253ce95f
4-bit dependency fixes
2023-08-10 17:08:48 +02:00
Henk
6143071b27
Make settings folder early
2023-08-08 14:51:15 +02:00
somebody
9704c86aee
Actually do pre-wrap instead
...
plain `pre` keeps long texts without whitespace from wrapping
2023-08-07 21:13:01 -05:00
somebody
906d1f2522
Merge branch 'united' of https://github.com/henk717/KoboldAI into fixing-time
2023-08-07 16:22:04 -05:00
somebody
7f2085ffe8
UI: Fix token streaming gobbling trailing whitespace
...
which ended up being mostly newlines
2023-08-07 16:00:49 -05:00
somebody
1632f3c684
UI: Change mobile aspect ratio threshold from 7/5 to 5/6
2023-08-07 13:59:57 -05:00
Henk
824050471b
Default to new UI
2023-08-07 20:03:09 +02:00
henk717
0f8cf0dc2c
Merge pull request #433 from LostRuins/concedo_united
...
updated lite to v54
2023-08-07 18:37:50 +02:00
Concedo
06d6364b6b
updated lite to v54
2023-08-07 23:43:27 +08:00
henk717
4f0945e5dc
Merge pull request #426 from one-some/small-shift-fix
...
UI: Replace shift_down code with builtin event.shiftKey
2023-08-06 23:48:41 +02:00
Henk
6e47215e84
Modern Defaults
2023-08-04 22:34:18 +02:00
Henk
87382f0adf
BnB 41
2023-08-04 16:40:20 +02:00
Henk
fe0c391e8f
Only show stopped if started
2023-08-02 10:50:00 +02:00
Henk
c066494c70
No safetensors for TPU
2023-08-02 10:01:35 +02:00
henk717
16017a3afc
Merge pull request #416 from one-some/wi-fixes
...
(mostly) wi fixes and polish
2023-07-31 20:52:48 +02:00
somebody
3950620ce9
Merge branch 'united' of https://github.com/henk717/KoboldAI into wi-fixes
2023-07-31 12:59:29 -05:00
somebody
d4001186df
UI: Hold shift to skip confirmation dialog
...
idea stolen from discord, who likely stole it from somebody else
2023-07-31 12:41:19 -05:00
somebody
23e54b6658
WI: Workaround for Chrome order weirdness
...
Chrome fires `blur()` before deleting nodes, meaning the -1 WI was
getting sent after being deleted, resulting in two
`delete_new_world_info_entry` packets being sent to the browser.
Really, it would be better to not do this full WI reset/sync cycle and
just send state changes and update accordingly. That would probably
stop all the WI weirdness.
2023-07-31 12:30:37 -05:00
henk717
d8ae72a509
Merge pull request #432 from one-some/lock-fix
...
UI: Fix the thingey modal
2023-07-30 23:18:46 +02:00