somebody
c16336f646
Add traceback to debug log on fallback
2023-05-11 17:10:19 -05:00
Henk
e932364a1e
RWKV support
2023-05-11 14:56:12 +02:00
Bogdan Drema
d53726bed6
fix: tpu tokenizers errors
2023-05-08 18:24:34 +01:00
Henk
bb206f598e
Don't load peft when unused
2023-05-06 18:55:26 +02:00
somebody
b7db709c47
PEFT: Change directory structure to be inside model
2023-05-06 11:16:09 -05:00
somebody
f02ddab7c7
Merge branch 'united' of https://github.com/henk717/KoboldAI into peft
2023-05-06 10:47:14 -05:00
Henk
33969b5845
Basic HF code execution support
2023-05-05 17:23:01 +02:00
somebody
35b56117e6
Basic PEFT support
2023-05-03 18:51:01 -05:00
Henk
a87d5d6f23
Remove HF's llama workaround
2023-05-03 20:18:40 +02:00
Llama
35d344b951
Remove torch dependency and more generic dim0 workaround
...
Remove torch dependency from hf.py
Make the workaround for dimension-zero values of token_ids more
generic so it handles every token, not just newlines.
2023-05-03 09:48:16 -07:00
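A minimal sketch of what the dim-0 normalization described above could look like, assuming numpy-style values (the helper name and shapes are illustrative, not the actual hf.py code):

```python
import numpy as np

# Illustrative only: some encode paths hand back a zero-dimensional array or a
# bare scalar for a single token; normalize both so callers always receive a
# flat list of ints, for every token rather than just newlines.
def ensure_token_id_list(token_ids):
    if isinstance(token_ids, np.ndarray) and token_ids.ndim == 0:
        return [int(token_ids)]          # 0-d array -> one-element list
    if np.isscalar(token_ids):
        return [int(token_ids)]          # plain int / numpy scalar
    return [int(t) for t in token_ids]   # already a sequence
```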
Llama
3768848548
Fix tokenization and whitespace issues with llama-derived models
...
Work around the 'soft' prefix space behavior of sentencepiece.
Override encode to restore the deleted HF support for decode_with_prefix_space.
Override decode to skip the soft space and return true decoded tokens.
Allow submitting chat messages with embedded newlines.
Split sentences between punctuation and whitespace, rather than after whitespace.
Also include trailing quotes and brackets after sentence stoppers.
This avoids splitting `."` and `.)` into two tokens, for instance.
Insert whitespace at the beginning of the author's note, since sentences are
split with leading whitespace.
Remove spurious newlines at the end of chat responses.
2023-05-03 01:27:11 -07:00
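One piece of the description above, the sentence-split rule, can be illustrated with a small regex sketch (this is not the project's actual code; the pattern and names are assumptions):

```python
import re

# Split between the sentence stopper (plus any trailing quote/bracket) and the
# whitespace that follows, so '."' and '.)' stay attached to the sentence that
# ends with them and the next sentence keeps its leading whitespace.
# Requires Python 3.7+ (zero-width split).
SENTENCE_BOUNDARY = re.compile(r"(?:(?<=[.!?]['\")\]])|(?<=[.!?]))(?=\s)")

print(SENTENCE_BOUNDARY.split('He said "stop." Then he left.'))
# -> ['He said "stop."', ' Then he left.']
```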
somebody
a0f4ab5c6a
Move bad token grabber until after newlinemode has been deduced
2023-05-02 20:23:36 -05:00
somebody
efe268df60
Move overrides to better places
2023-05-02 20:18:33 -05:00
somebody
f6b5548131
Support safetensors in get_sharded_checkpoint_num_tensors
2023-05-01 19:15:27 -05:00
somebody
97e84928ba
Download all shards correctly on aria2 and raise on bad load key
2023-05-01 18:53:36 -05:00
somebody
933dbd634a
HFInferenceModel: Make badwordsids not unique to torch
2023-05-01 17:13:33 -05:00
somebody
ce3d465972
Remove some debug
2023-05-01 17:03:34 -05:00
onesome
467f2f25eb
More loading fixes
2023-04-26 16:58:33 -05:00
onesome
d4f7b60dc9
Fix for multiple paths
2023-04-26 16:49:12 -05:00
onesome
6776a71532
Add more info to custom model error
2023-04-26 16:36:52 -05:00
onesome
bbf4963d6e
Fix custmodpth stuff for hf loading
2023-04-26 16:18:45 -05:00
onesome
c146ae9d84
Delete legacy gpt2 custom loader
2023-04-26 16:07:18 -05:00
onesome
9579298df7
Better fallback
2023-04-25 22:28:07 -05:00
onesome
6e3aebc1ea
Zap debug
2023-04-25 21:13:17 -05:00
onesome
0268305cfe
Change fallback notifications to warnings
2023-04-25 18:26:49 -05:00
onesome
b8bef641ff
Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv
2023-04-25 16:54:53 -05:00
somebody
f9fb5eba89
Remove debug
2023-04-15 18:56:49 -05:00
somebody
5dd67d027a
Workaround for socketio context errors for loading
2023-04-15 18:54:21 -05:00
somebody
08b4e317ff
Fix double slashing
2023-04-15 13:30:05 -05:00
somebody
d3a73aaeba
Fix api
2023-04-15 13:17:20 -05:00
somebody
4dcf570407
Fix legacy model loading
2023-04-15 12:57:35 -05:00
one-some
1b500c7179
Merge pull request #5 from LostRuins/concedo_api
...
Added stop sequences functionality for API calls
2023-04-15 10:51:31 -05:00
somebody
2b950f08d3
Remove legacy no accelerate fallback code
...
Was causing issues with disk cache: the old code had an
`and not utils.HAS_ACCELERATE` check preceding it (a variable which no
longer exists), and since disk cache is accelerate-only, there was no
disk handling code in here. Anyway, it's bad, so blast it.
2023-04-15 10:47:31 -05:00
Concedo
9705b7b79c
increase API version (+1 squashed commits)
...
Squashed commits:
[c168c08] Added stop sequences functionality for API calls
2023-04-15 18:09:53 +08:00
somebody
ea8df4c0d3
Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv
2023-04-14 20:38:56 -05:00
somebody
38c53191d3
possible fix for cache dl thing
2023-04-14 20:25:03 -05:00
somebody
8412f83ce5
Breakmodel: Fix typo
2023-04-03 18:41:18 -05:00
somebody
77f0797b1a
Model fix
2023-04-02 15:47:52 -05:00
somebody
9d70646e4d
Lazyload: Safetensors
2023-04-02 15:40:34 -05:00
somebody
91bb433b5f
GenericTokenizer: Fall back to defined tokenizer
...
Shouldn't be relied on for model-agnostic code, but for loading
processes where you know which tokenizer class is used, it should be
okie dokie.
2023-03-19 19:03:20 -05:00
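A hedged sketch of the fallback idea (class and method names are illustrative, not the actual KoboldAI wrapper): anything the wrapper does not define itself is deferred to the tokenizer it wraps.

```python
class GenericTokenizerSketch:
    """Thin wrapper exposing a uniform encode/decode surface."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def encode(self, text):
        return self.tokenizer.encode(text)

    def decode(self, token_ids):
        return self.tokenizer.decode(token_ids)

    def __getattr__(self, name):
        # Fall back to the defined (wrapped) tokenizer for anything else.
        # Fine when the loading code knows which tokenizer class it wrapped;
        # not something model-agnostic code should rely on.
        return getattr(self.tokenizer, name)
```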
somebody
ffe85ce8a1
Modeling: Fix logits processors (probs, biasing, lua)
2023-03-17 16:56:47 -05:00
somebody
692dbfeb37
Merge branch 'united' of https://github.com/henk717/KoboldAI into model-structure-and-maybe-rwkv
2023-03-17 16:20:13 -05:00
somebody
8d0bc404a5
Model: More Jax import fixes and formatting
2023-03-17 15:36:44 -05:00
somebody
03af06638c
Modeling: Maybe fix samplers
2023-03-13 20:42:35 -05:00
somebody
b93c339145
Model: Lazyload backends
2023-03-13 20:29:29 -05:00
somebody
938c97b75a
RWKV: Fix yet another typo
2023-03-13 19:39:19 -05:00
somebody
14b2543c7c
RWKV: Fix typo
2023-03-13 19:36:58 -05:00
somebody
b10b201701
Model: Add basic RWKV implementation
2023-03-13 19:34:38 -05:00
somebody
0320678b27
Model: WIP horde and API tests
2023-03-13 14:11:06 -05:00
somebody
cd8ccf0a5e
Modeling: Add seed parameter to raw_generate
...
Yahooo, decoupling from koboldai_vars. This makes the generation test
pass in `test_generation.py`, and makes full determinism outside of
core_generate work.
2023-03-12 21:49:10 -05:00
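A rough sketch of how a per-call seed gives that determinism, assuming a torch/transformers-style model (the function and parameter names here are assumptions, not the real raw_generate signature):

```python
import torch

def raw_generate_sketch(model, input_ids, max_new_tokens, seed=None):
    # Seeding the RNG inside the call is what lets a generation test be
    # reproducible without reaching into global application state.
    if seed is not None:
        torch.manual_seed(seed)
    return model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,
    )
```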