Commit Graph

1129 Commits

Author SHA1 Message Date
henk717 8ae0c8311b GPU Colab improvements 2022-05-28 19:52:06 +02:00
Henk 0ac36fff37 Merge branch 'united' of https://github.com/henk717/koboldai into united 2022-05-28 19:46:09 +02:00
Henk 97401026dd Nerys Description in Readme 2022-05-28 19:46:05 +02:00
henk717 bab0cd6362 Nerys Description 2022-05-28 19:45:32 +02:00
Henk 4b65ce9c76 1.18 version bump 2022-05-28 19:39:05 +02:00
henk717 6e0510e1f5 Replaced Jax models with HF models where possible 2022-05-27 01:35:15 +02:00
Henk b30370bf4b 2048 maxtoken default
Almost everyone prefers 2048 max tokens because of the superior coherency. It should only be lower due to RAM limits, but the menu already shows the optimal RAM for 2048. Negatively affected users can turn it down themselves; for everyone else, especially on rented machines or Colab, 2048 is a better default.
2022-05-27 01:23:48 +02:00
henk717 f47db6d155 Nerys 13B 2022-05-27 00:12:35 +02:00
henk717 4482e6db9a Merge pull request #132 from VE-FORBRYDERNE/gpt2
Fix an error that occurs when loading GPT-2 models
2022-05-20 22:24:24 +01:00
Gnome Ann c692987e40 Fix an error that occurs when loading GPT-2 models
I forgot that this new_rebuild_tensor function's first argument's type
is different when loading GPT-2 models.
2022-05-20 14:54:49 -04:00
henk717 266308b086 Merge pull request #131 from mrseeker/patch-8
Adding Nerys model 13B
2022-05-18 13:55:03 +02:00
Julius ter Pelkwijk 6ae7b48b69 Adding Nerys model 13B 2022-05-18 13:50:57 +02:00
henk717 348fd1c4e2 Merge pull request #130 from mrseeker/patch-8
Adding Nerys model 2.7B
2022-05-16 11:45:01 +02:00
Julius ter Pelkwijk f0df3de610 Adding Nerys model 2.7B 2022-05-16 09:50:45 +02:00
henk717 24a2eb8c0b Merge pull request #129 from VE-FORBRYDERNE/tqdm
Better model saving and better progress bars
2022-05-14 18:02:41 +02:00
Gnome Ann d4e8f56789 Remove debugging code from tpu_mtj_backend.py 2022-05-14 12:00:44 -04:00
Gnome Ann d5ab3ef5b1 Fix `no attribute get_checkpoint_shard_files` 2022-05-14 11:49:04 -04:00
Gnome Ann 6e82f205b4 Aria2 bug fix for Windows users 2022-05-14 11:44:28 -04:00
henk717 9eaa76c72b Add OPT 13B to the models 2022-05-14 07:55:47 +02:00
Gnome Ann 1476e76cfc Copy fp16 model files instead of resaving them 2022-05-14 00:45:43 -04:00
Gnome Ann 0c5ca5261e Loading a sharded model will now display only one progress bar 2022-05-13 23:32:16 -04:00
Gnome Ann f9f1a5f3a9 Make sure tqdm progress bars display properly in Colab 2022-05-13 17:37:45 -04:00
Gnome Ann 91d3672446 Proper progress bar for aria2 downloads 2022-05-13 17:00:10 -04:00
henk717 7ea0c49c1a Merge pull request #128 from VE-FORBRYDERNE/opt
OPT breakmodel and TPU support
2022-05-13 18:07:02 +02:00
Gnome Ann a051bf4397 OPT breakmodel bug fix 2022-05-13 10:45:57 -04:00
Gnome Ann 1200173386 Custom badwords for OPT
Generated using:
```
import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/opt-350m", fast=False)
badwordsids_opt = [[v] for k, v in tokenizer.vocab.items() if any(c in k for c in "<>[]")]
```
2022-05-13 10:45:28 -04:00
Henk d5fa782483 NS Mode (comment fix) 2022-05-13 10:53:19 +02:00
Henk 8376f12e21 Add NS mode
OPT supports newlines, but it also needs some of the behavior we use in S mode. NS mode is a more limited version of S mode that still handles the </s> token, but instead of replacing it with a newline we replace it with an empty string, and newlines are not converted.

In future, if your Fairseq-style model has newline support, use NS mode; if it needs artificially inserted newlines, use S mode. This also means that people finetuning Fairseq models to include newlines might benefit from testing their models in NS mode.
2022-05-13 10:44:12 +02:00
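Note on commit 8376f12e21: the difference between S mode and NS mode described above can be sketched roughly as follows. This is an illustration only, not the project's actual code. The function names `decode_output` and `encode_input` and the mode strings are hypothetical, and the assumption that S mode converts input newlines to `</s>` is inferred from the message rather than stated in it.

```python
def decode_output(text: str, mode: str) -> str:
    """Post-process generated text for a given newline mode (hypothetical helper)."""
    if mode == "s":
        # S mode: the </s> token stands in for a newline.
        return text.replace("</s>", "\n")
    if mode == "ns":
        # NS mode: the </s> token is dropped; real newlines are left alone.
        return text.replace("</s>", "")
    return text


def encode_input(text: str, mode: str) -> str:
    """Prepare prompt text for the model (hypothetical helper)."""
    if mode == "s":
        # Assumption: S mode turns newlines into </s> for models without newline support.
        return text.replace("\n", "</s>")
    # NS mode (and no mode at all): newlines are not converted.
    return text
```

For example, `decode_output("Hello</s>World", "ns")` drops the token and keeps any genuine newlines the model produced.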
Gnome Ann 55079f672a Fix typo in soft prompt patching code 2022-05-13 01:51:55 -04:00
Gnome Ann 29bb3f569b Fix a bug in OPTForCausalLM where self.lm_head is the wrong size 2022-05-13 01:37:17 -04:00
Gnome Ann defbb53b68 OPT breakmodel 2022-05-13 01:03:38 -04:00
Gnome Ann b1d8797a54 Allow TPU Colab to load sharded HF models 2022-05-12 23:51:40 -04:00
Gnome Ann 4fa5f1cd6a Add TPU support for OPT-350M
The 350M model seems to have a different structure than the other ones ???
2022-05-12 22:21:15 -04:00
Gnome Ann dfa2aa7314 Merge branch 'united' into opt 2022-05-12 20:11:53 -04:00
Henk 5c4a087970 Disable S mode for OPT 2022-05-13 01:47:59 +02:00
Gnome Ann f5e689a725 Upload maps/opt.json and update requirements 2022-05-12 19:09:31 -04:00
Henk e98cc3cb16 OPT models 2022-05-12 23:55:21 +02:00
Henk 376e76f5da S mode for OPT 2022-05-12 02:18:14 +02:00
henk717 a1c7017ddc Merge pull request #127 from VE-FORBRYDERNE/aria2
Handle aria2 properly when it exits with nonzero exit code
2022-05-11 22:57:45 +02:00
Gnome Ann 580dd0b2a3 Handle aria2 properly when it exits with nonzero exit code 2022-05-11 16:23:24 -04:00
henk717 05549de42d Merge pull request #126 from VE-FORBRYDERNE/aria2
Aria2 downloader bug fixes
2022-05-11 21:58:31 +02:00
Gnome Ann 2ebba9488b Change `force_download` back to False
This is to prevent fully downloaded models from being re-downloaded in
Colab.
2022-05-11 15:51:48 -04:00
Gnome Ann 6d481ca57e Merge branch 'united' into aria2 2022-05-11 15:51:11 -04:00
Gnome Ann c65272052a aria2 now downloads to different filename and renames afterwards
This is to match the behaviour of the original transformers downloader
in order to deal with the rare case of someone downloading a model using
aria2, cancelling before it finishes, and then attempting to resume the
download with the normal transformers downloader.
2022-05-11 15:45:38 -04:00
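Note on commit c65272052a: the download-then-rename pattern described above can be sketched roughly as follows. This is a minimal sketch, not the code from the commit. The helper `download_then_rename`, the `fetch` callback, and the `.incomplete` suffix are all assumptions made for illustration.

```python
import os
from typing import Callable


def download_then_rename(
    fetch: Callable[[str], None],
    final_path: str,
    suffix: str = ".incomplete",  # assumed temporary suffix
) -> str:
    """Download to a temporary filename, then rename it once the download succeeds."""
    temp_path = final_path + suffix
    # The downloader writes only to the temporary path. If it is cancelled or
    # raises, final_path never appears, so a partial file cannot be mistaken
    # for a finished download.
    fetch(temp_path)
    # os.replace overwrites any existing destination file.
    os.replace(temp_path, final_path)
    return final_path
```

Because the final filename only exists after a completed download, another downloader that checks for that filename will start over instead of treating a partial file as complete.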
Henk 6d27084e8a Better Aria2 Defaults
Trunc prevents slow allocation on Windows, and force_download=True has proven a more reliable default. Since models are converted to local formats, it does not impact local users. And because -c is used, checking that the model is correct is desirable and has minimal impact.
2022-05-11 21:38:33 +02:00
Gnome Ann 7a3f865e3f Prevent aria2 from resuming cancelled downloads
Resumed downloads tend to be very slow.

The original transformers downloader didn't allow resuming downloads
either.
2022-05-11 15:14:37 -04:00
Gnome Ann c81f3bd084 Use `--file-allocation=trunc` instead of `--file-allocation=none` 2022-05-11 14:51:43 -04:00
Gnome Ann f96c878d83 Use aria2 even when all model files are already in cache
This allows aria2 to continue downloading a pytorch_model.bin after a
cancelled download.
2022-05-11 14:43:56 -04:00
Gnome Ann f60c7d8492 Fix the behaviour of `aria2_hook()` when using `force_download` 2022-05-11 14:41:34 -04:00
Gnome Ann 5732a8f15a Don't use `aria2_hook()` if `force_download=True` is used 2022-05-11 14:40:31 -04:00