henk717
8ae0c8311b
GPU Colab improvements
2022-05-28 19:52:06 +02:00
Henk
0ac36fff37
Merge branch 'united' of https://github.com/henk717/koboldai into united
2022-05-28 19:46:09 +02:00
Henk
97401026dd
Nerys Description in Readme
2022-05-28 19:46:05 +02:00
henk717
bab0cd6362
Nerys Description
2022-05-28 19:45:32 +02:00
Henk
4b65ce9c76
1.18 version bump
2022-05-28 19:39:05 +02:00
henk717
6e0510e1f5
Replaced Jax models with HF models where possible
2022-05-27 01:35:15 +02:00
Henk
b30370bf4b
2048 maxtoken default
...
Almost everyone prefers 2048 max tokens because of the superior coherency. It should only be lowered due to RAM limits, but the menu already shows the optimal RAM for 2048. Negatively affected users can turn it down themselves; for everyone else, especially on rented machines or Colab, 2048 is a better default.
2022-05-27 01:23:48 +02:00
henk717
f47db6d155
Nerys 13B
2022-05-27 00:12:35 +02:00
henk717
4482e6db9a
Merge pull request #132 from VE-FORBRYDERNE/gpt2
...
Fix an error that occurs when loading GPT-2 models
2022-05-20 22:24:24 +01:00
Gnome Ann
c692987e40
Fix an error that occurs when loading GPT-2 models
...
I forgot that this new_rebuild_tensor function's first argument's type
is different when loading GPT-2 models.
2022-05-20 14:54:49 -04:00
henk717
266308b086
Merge pull request #131 from mrseeker/patch-8
...
Adding Nerys model 13B
2022-05-18 13:55:03 +02:00
Julius ter Pelkwijk
6ae7b48b69
Adding Nerys model 13B
2022-05-18 13:50:57 +02:00
henk717
348fd1c4e2
Merge pull request #130 from mrseeker/patch-8
...
Adding Nerys model 2.7B
2022-05-16 11:45:01 +02:00
Julius ter Pelkwijk
f0df3de610
Adding Nerys model 2.7B
2022-05-16 09:50:45 +02:00
henk717
24a2eb8c0b
Merge pull request #129 from VE-FORBRYDERNE/tqdm
...
Better model saving and better progress bars
2022-05-14 18:02:41 +02:00
Gnome Ann
d4e8f56789
Remove debugging code from tpu_mtj_backend.py
2022-05-14 12:00:44 -04:00
Gnome Ann
d5ab3ef5b1
Fix `no attribute get_checkpoint_shard_files`
2022-05-14 11:49:04 -04:00
Gnome Ann
6e82f205b4
Aria2 bug fix for Windows users
2022-05-14 11:44:28 -04:00
henk717
9eaa76c72b
Add OPT 13B to the models
2022-05-14 07:55:47 +02:00
Gnome Ann
1476e76cfc
Copy fp16 model files instead of resaving them
2022-05-14 00:45:43 -04:00
Gnome Ann
0c5ca5261e
Loading a sharded model will now display only one progress bar
2022-05-13 23:32:16 -04:00
Gnome Ann
f9f1a5f3a9
Make sure tqdm progress bars display properly in Colab
2022-05-13 17:37:45 -04:00
Gnome Ann
91d3672446
Proper progress bar for aria2 downloads
2022-05-13 17:00:10 -04:00
henk717
7ea0c49c1a
Merge pull request #128 from VE-FORBRYDERNE/opt
...
OPT breakmodel and TPU support
2022-05-13 18:07:02 +02:00
Gnome Ann
a051bf4397
OPT breakmodel bug fix
2022-05-13 10:45:57 -04:00
Gnome Ann
1200173386
Custom badwords for OPT
...
Generated using:
```
import transformers
# Load the slow tokenizer (use_fast=False) so .vocab covers the full vocabulary
tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/opt-350m", use_fast=False)
# Ban every token whose surface form contains <, >, [ or ]
badwordsids_opt = [[v] for k, v in tokenizer.vocab.items() if any(c in k for c in "<>[]")]
```
2022-05-13 10:45:28 -04:00
Henk
d5fa782483
NS Mode (comment fix)
2022-05-13 10:53:19 +02:00
Henk
8376f12e21
Add NS mode
...
OPT supports newlines, but it also needs some of the behavior we use in S mode. NS mode is a more limited version of S mode that still handles the </s> token, but instead of replacing it with a newline we replace it with an empty string, and newlines are not converted.
In the future, if your Fairseq-style model has newline support, use NS mode; if it needs artificially inserted newlines, use S mode. This also means that people finetuning Fairseq models to include newlines might benefit from testing their models in NS mode.
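A minimal sketch of the difference between the two modes (the function name and mode strings are illustrative, not KoboldAI's actual code):

```python
def strip_eos(text: str, mode: str) -> str:
    """Illustrative handling of the </s> token per mode."""
    if mode == "s":
        # S mode: the model has no newline support, so </s> stands in
        # for a line break and is converted to one
        return text.replace("</s>", "\n")
    if mode == "ns":
        # NS mode: the model emits real newlines, so </s> is simply
        # dropped and existing newlines pass through untouched
        return text.replace("</s>", "")
    return text
```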
2022-05-13 10:44:12 +02:00
Gnome Ann
55079f672a
Fix typo in soft prompt patching code
2022-05-13 01:51:55 -04:00
Gnome Ann
29bb3f569b
Fix a bug in OPTForCausalLM where self.lm_head is the wrong size
2022-05-13 01:37:17 -04:00
Gnome Ann
defbb53b68
OPT breakmodel
2022-05-13 01:03:38 -04:00
Gnome Ann
b1d8797a54
Allow TPU Colab to load sharded HF models
2022-05-12 23:51:40 -04:00
Gnome Ann
4fa5f1cd6a
Add TPU support for OPT-350M
...
The 350M model seems to have a different structure than the other ones ???
2022-05-12 22:21:15 -04:00
Gnome Ann
dfa2aa7314
Merge branch 'united' into opt
2022-05-12 20:11:53 -04:00
Henk
5c4a087970
Disable S mode for OPT
2022-05-13 01:47:59 +02:00
Gnome Ann
f5e689a725
Upload maps/opt.json and update requirements
2022-05-12 19:09:31 -04:00
Henk
e98cc3cb16
OPT models
2022-05-12 23:55:21 +02:00
Henk
376e76f5da
S mode for OPT
2022-05-12 02:18:14 +02:00
henk717
a1c7017ddc
Merge pull request #127 from VE-FORBRYDERNE/aria2
...
Handle aria2 properly when it exits with nonzero exit code
2022-05-11 22:57:45 +02:00
Gnome Ann
580dd0b2a3
Handle aria2 properly when it exits with nonzero exit code
2022-05-11 16:23:24 -04:00
henk717
05549de42d
Merge pull request #126 from VE-FORBRYDERNE/aria2
...
Aria2 downloader bug fixes
2022-05-11 21:58:31 +02:00
Gnome Ann
2ebba9488b
Change `force_download` back to False
...
This is to prevent fully downloaded models from being re-downloaded in
Colab.
2022-05-11 15:51:48 -04:00
Gnome Ann
6d481ca57e
Merge branch 'united' into aria2
2022-05-11 15:51:11 -04:00
Gnome Ann
c65272052a
aria2 now downloads to a different filename and renames afterwards
...
This is to match the behaviour of the original transformers downloader
in order to deal with the rare case of someone downloading a model using
aria2, cancelling before it finishes, and then attempting to resume the
download with the normal transformers downloader.
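The scheme can be sketched as follows; the temporary-name suffix and the exact aria2c invocation are assumptions for illustration, not the actual code:

```python
import os
import subprocess

def temp_name(final_path: str) -> str:
    # Hypothetical suffix for the in-progress file; the real code may differ
    return final_path + ".aria2.tmp"

def aria2_download(url: str, final_path: str) -> None:
    """Download under a temporary name, renaming only on success."""
    tmp_path = temp_name(final_path)
    # -c lets aria2 resume its own partial file under the temporary name
    subprocess.run(["aria2c", "-c", "-o", tmp_path, url], check=True)
    # Only a fully downloaded file ever appears under the final name, so the
    # transformers downloader never sees (and tries to resume) a partial file
    os.replace(tmp_path, final_path)
```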
2022-05-11 15:45:38 -04:00
Henk
6d27084e8a
Better Aria2 Defaults
...
Trunc prevents slow file allocation on Windows, and force_download=True has proven a more reliable default. Since models are converted to local formats, it does not impact local users, and because -c is used, the cost of checking that the model is correct is desirable and minimal.
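As a sketch, these defaults correspond to an aria2c flag set like the following; this is implied by the message, and the actual invocation in the code may differ:

```python
# Illustrative aria2c flags implied by these defaults
ARIA2_FLAGS = [
    "-c",                       # continue: verify/resume against existing data
    "--file-allocation=trunc",  # avoid slow full preallocation on Windows
]
# force_download=True is a transformers-side setting, passed separately
```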
2022-05-11 21:38:33 +02:00
Gnome Ann
7a3f865e3f
Prevent aria2 from resuming cancelled downloads
...
Resumed downloads tend to be very slow.
The original transformers downloader didn't allow resuming downloads
either.
2022-05-11 15:14:37 -04:00
Gnome Ann
c81f3bd084
Use `--file-allocation=trunc` instead of `--file-allocation=none`
2022-05-11 14:51:43 -04:00
Gnome Ann
f96c878d83
Use aria2 even when all model files are already in cache
...
This allows aria2 to continue downloading a pytorch_model.bin after a
cancelled download.
2022-05-11 14:43:56 -04:00
Gnome Ann
f60c7d8492
Fix the behaviour of `aria2_hook()` when using `force_download`
2022-05-11 14:41:34 -04:00
Gnome Ann
5732a8f15a
Don't use `aria2_hook()` if `force_download=True` is used
2022-05-11 14:40:31 -04:00