Commit Graph

4551 Commits

Author SHA1 Message Date
Llama
2c48e05f7c Add exllama dependency back to requirements. 2023-08-28 09:52:31 -07:00
Llama
6151cbd053 Merge branch 'united' into merge/united-exllama 2023-08-28 09:32:19 -07:00
Llama
5229987ab7 Merge pull request #66 from pi6am/feat/exllama-config
Modify exllama to load unrenamed gptq quantized models
2023-08-28 00:09:50 -07:00
Llama
554af7b175 Modify exllama to load unrenamed gptq quantized models
Read config.json and enable exllama loading if the model has a
`quantization_config` with `quant_method` of `gptq`. Note that this
implementation is limited and only supports model.safetensors.
That said, this supports loading popular gptq quantized models
without renaming or symlinking the model file.
2023-08-27 23:56:02 -07:00
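The config check described in the commit above can be sketched as follows. This is a hypothetical helper, not the repository's actual code; the function name is illustrative, but the `quantization_config`/`quant_method` keys match the Hugging Face config.json convention the commit refers to.

```python
import json
from pathlib import Path

def should_use_exllama(model_dir: str) -> bool:
    """Enable exllama loading only when config.json declares a GPTQ
    quantization_config (hypothetical helper; names are illustrative)."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return False
    config = json.loads(config_path.read_text())
    quant = config.get("quantization_config")
    # The commit notes this path only supports model.safetensors,
    # so also require that file to be present.
    has_weights = (Path(model_dir) / "model.safetensors").is_file()
    return bool(quant) and quant.get("quant_method") == "gptq" and has_weights
```

This lets popular GPTQ-quantized checkpoints load without renaming or symlinking the weight file, as the commit message describes.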
Llama
812df5ea56 Merge pull request #65 from pi6am/feat/exllama-badwords
Add the eos token to exllama bad words.
2023-08-27 17:03:25 -07:00
Llama
08ff7c138c Add the eos token to exllama bad words.
The bos token was already hardcoded as a bad word id.
Store badwords in a list and iterate over them during generation.
Add the Llama eos token to the list of bad words.
Also support "single line mode", which adds newline (13) to badwords.
2023-08-27 16:34:52 -07:00
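The bad-words handling described above can be sketched like this. The helpers and constants are illustrative, not the repository's actual code; the token ids follow the Llama tokenizer the commit mentions (bos=1, eos=2, and 13 decoding to a newline).

```python
# Illustrative Llama token ids (assumption based on the commit message).
BOS_TOKEN_ID = 1
EOS_TOKEN_ID = 2
NEWLINE_TOKEN_ID = 13

def build_bad_words(single_line_mode: bool = False) -> list[int]:
    """Collect token ids to suppress (hypothetical helper). The eos token
    joins the previously hardcoded bos token in one iterable list."""
    bad_words = [BOS_TOKEN_ID, EOS_TOKEN_ID]
    if single_line_mode:
        # "Single line mode" adds the newline token to the bad words.
        bad_words.append(NEWLINE_TOKEN_ID)
    return bad_words

def suppress_bad_words(logits: list[float], bad_words: list[int]) -> list[float]:
    """Iterate over the bad-word ids during generation and mask them out."""
    masked = list(logits)
    for token_id in bad_words:
        masked[token_id] = float("-inf")
    return masked
```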
Llama
0d150e412e Merge pull request #64 from pi6am/fix/multinomial-workaround
Resample to work around a bug in torch.multinomial
2023-08-26 22:42:21 -07:00
Llama
b7e38b4757 Resample to work around a bug in torch.multinomial
There is a bug in PyTorch 2.0.1 that allows torch.multinomial to
sometimes choose elements that have zero probability. Since
this is uncommon we can continue to use torch.multinomial as
long as we verify that the results are valid. If they aren't,
try again until the probability of each selected token is positive.
2023-08-26 22:26:26 -07:00
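The resampling workaround above can be sketched in pure Python. Here `random.choices` stands in for `torch.multinomial`; the function name and retry limit are illustrative assumptions, not the repository's actual code.

```python
import random

def safe_sample(probs: list[float], max_retries: int = 10) -> int:
    """Sample an index, then verify the result is valid: if the chosen
    element has zero probability (the rare torch.multinomial bug the
    commit describes), try again. Hypothetical helper."""
    for _ in range(max_retries):
        # Stand-in for torch.multinomial(probs, num_samples=1).
        idx = random.choices(range(len(probs)), weights=probs, k=1)[0]
        if probs[idx] > 0:  # accept only tokens with positive probability
            return idx
    raise RuntimeError("sampler kept returning zero-probability tokens")
```

Since invalid draws are uncommon, the loop almost always succeeds on the first iteration, so the common-case cost is unchanged.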
Llama
b1895de518 Merge pull request #63 from pi6am/feat/exllama-stoppers
Add stopper hooks support to exllama
2023-08-22 23:14:00 -07:00
Llama
b96d5d8646 Add stopper hooks support to exllama 2023-08-22 23:06:16 -07:00
0cc4m
22fd49937a Merge pull request #62 from pi6am/fix/exllama-eos-space
Strip the eos token from exllama generations.
2023-08-22 08:06:02 +02:00
Llama
070cfd339a Strip the eos token from exllama generations.
The end-of-sequence (</s>) token indicates the end of a generation.
When a token sequence containing </s> is decoded, an extra (wrong)
space is inserted at the beginning of the generation. To avoid this,
strip the eos token out of the result before returning it.
The eos token was getting stripped later, so this doesn't change
the output except to avoid the spurious leading space.
2023-08-19 17:40:23 -07:00
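The fix described above amounts to removing the eos token id from the sequence before decoding. A minimal sketch, with an illustrative token id and helper name (not the repository's actual code):

```python
EOS_TOKEN_ID = 2  # Llama's </s>; illustrative assumption

def strip_eos(token_ids: list[int], eos_id: int = EOS_TOKEN_ID) -> list[int]:
    """Drop the end-of-sequence token before decoding (hypothetical helper).
    Decoding a sequence that still contains </s> can insert a spurious
    space at the start of the text, so the token is stripped first."""
    return [t for t in token_ids if t != eos_id]
```

As the commit notes, the eos token was stripped later anyway, so this changes nothing except removing the unwanted leading space.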
Llama
dda5acd5d5 Merge pull request #33 from henk717/united
Merge united.
2023-08-18 13:19:24 -07:00
Henk
87934ee393 AutoGPTQ Exllama compile 2023-08-18 21:49:05 +02:00
henk717
3f28503b87 Update huggingface.yml 2023-08-15 22:19:18 +02:00
henk717
1c65528dbf Fix BnB on Colab 2023-08-14 03:28:29 +02:00
Llama
d8d9890f46 Merge pull request #32 from henk717/united
Merge united
2023-08-13 14:10:35 -07:00
Henk
e90903946d AutoGPTQ updates 2023-08-13 17:36:17 +02:00
Henk
116a88b46c Better stable diffusion 2023-08-13 16:45:31 +02:00
henk717
89a805a0cc Merge pull request #411 from one-some/fixing-time
Fix most of the API
2023-08-13 13:43:13 +02:00
henk717
dae9a6eb5a Merge branch 'KoboldAI:main' into united 2023-08-12 02:18:11 +02:00
henk717
ee93fe6e4a Add model cleaner 2023-08-11 22:39:49 +02:00
Henk
1e87c05e68 Fix discord link 2023-08-11 17:36:41 +02:00
henk717
a9dbe2837e Merge branch 'KoboldAI:main' into united 2023-08-11 00:36:09 +02:00
henk717
9cb93d6b4c Add some 13B's for easier beta testing 2023-08-10 23:56:44 +02:00
henk717
2938a9993a Merge pull request #434 from one-some/united
UI: Change mobile aspect ratio threshold from 7/5 to 5/6
2023-08-10 19:57:56 +02:00
henk717
5f1be7c482 Merge pull request #435 from one-some/token-stream-newline-fix
UI: Fix token streaming gobbling trailing whitespace
2023-08-10 19:55:50 +02:00
Henk
2628726e1c Dont use exllama on fail 2023-08-10 19:34:08 +02:00
Henk
9c7ebe3b04 Better AutoGPTQ fallback 2023-08-10 18:10:48 +02:00
Henk
f2d7ef3aca AutoGPTQ breakmodel 2023-08-10 17:41:31 +02:00
Henk
54addfc234 AutoGPTQ fallback 2023-08-10 17:18:53 +02:00
Henk
1b253ce95f 4-bit dependency fixes 2023-08-10 17:08:48 +02:00
Henk
6143071b27 Make settings folder early 2023-08-08 14:51:15 +02:00
somebody
9704c86aee Actually do pre-wrap instead
`pre` alone makes long texts without whitespace fail to wrap
2023-08-07 21:13:01 -05:00
somebody
906d1f2522 Merge branch 'united' of https://github.com/henk717/KoboldAI into fixing-time 2023-08-07 16:22:04 -05:00
somebody
7f2085ffe8 UI: Fix token streaming gobbling trailing whitespace
which ended up being mostly newlines
2023-08-07 16:00:49 -05:00
somebody
1632f3c684 UI: Change mobile aspect ratio threshold from 7/5 to 5/6 2023-08-07 13:59:57 -05:00
Henk
824050471b Default to new UI 2023-08-07 20:03:09 +02:00
henk717
0f8cf0dc2c Merge pull request #433 from LostRuins/concedo_united
updated lite to v54
2023-08-07 18:37:50 +02:00
Concedo
06d6364b6b updated lite to v54 2023-08-07 23:43:27 +08:00
henk717
4f0945e5dc Merge pull request #426 from one-some/small-shift-fix
UI: Replace shift_down code with builtin event.shiftKey
2023-08-06 23:48:41 +02:00
Henk
6e47215e84 Modern Defaults 2023-08-04 22:34:18 +02:00
Henk
87382f0adf BnB 41 2023-08-04 16:40:20 +02:00
Henk
fe0c391e8f Only show stopped if started 2023-08-02 10:50:00 +02:00
Henk
c066494c70 No safetensors for TPU 2023-08-02 10:01:35 +02:00
henk717
16017a3afc Merge pull request #416 from one-some/wi-fixes
(mostly) wi fixes and polish
2023-07-31 20:52:48 +02:00
somebody
3950620ce9 Merge branch 'united' of https://github.com/henk717/KoboldAI into wi-fixes 2023-07-31 12:59:29 -05:00
somebody
d4001186df UI: Hold shift to skip confirmation dialog
idea stolen from discord, who likely stole it from somebody else
2023-07-31 12:41:19 -05:00
somebody
23e54b6658 WI: Workaround for Chrome order weirdness
Chrome fires `blur()` before deleting nodes, meaning the -1 WI was
getting sent after being deleted, resulting in two
`delete_new_world_info_entry` packets being sent to the browser.

Really, it would be better to not do this full WI reset/sync cycle and
just send state changes and update accordingly. That would stop all the
WI weirdness probably.
2023-07-31 12:30:37 -05:00
henk717
d8ae72a509 Merge pull request #432 from one-some/lock-fix
UI: Fix the thingy modal
2023-07-30 23:18:46 +02:00