Compare commits
99 Commits
Author | SHA1 | Date |
---|---|---|
henk717 | f49d763e2a | |
henk717 | fd24d95981 | |
henk717 | 61a0042c66 | |
henk717 | 8b7ab2f93b | |
henk717 | 0ea758b789 | |
henk717 | 2db1812ee4 | |
anhad | 3287328fe4 | |
anhad | a92951f47e | |
henk717 | 7d39b353c0 | |
henk717 | 58b4c48fdb | |
henk717 | bf61e5ef02 | |
Henk | 386fd1f034 | |
henk717 | d86f61151b | |
henk717 | ebab774aab | |
henk717 | ee93fe6e4a | |
henk717 | 9cb93d6b4c | |
henk717 | d6b1ff513d | |
henk717 | c11a269493 | |
henk717 | 148f900324 | |
henk717 | b66110ea54 | |
henk717 | d2b399d7bc | |
henk717 | f2b643a639 | |
Henk | 1499763472 | |
SmolBleat | 692fe2e5ee | |
henk717 | c3bf89a94f | |
henk717 | 1ae1d499e8 | |
henk717 | b808f039ab | |
henk717 | d88f109073 | |
henk717 | b4cb09590f | |
henk717 | 5f0e2001a7 | |
henk717 | dddde7dbc3 | |
Bogdan Drema | 92a0bf9524 | |
henk717 | e4c15fe1f6 | |
henk717 | b432d55d99 | |
henk717 | ee6e7e9b72 | |
Syler Clayton | 860b697a70 | |
henk717 | 29c2d4b7a6 | |
henk717 | fd12214091 | |
henk717 | bb51127bbf | |
henk717 | 72b4669563 | |
henk717 | ab779efe0e | |
YellowRoseCx | 3c48a77a52 | |
YellowRoseCx | f826930c02 | |
henk717 | 66264d38c4 | |
henk717 | 94eb8ff825 | |
Henk | 219b824b9b | |
henk717 | ffa5c0bc13 | |
henk717 | 487739911a | |
Henk | 2ed6cdb411 | |
henk717 | 142cb354f9 | |
Henk | 93bf023bd7 | |
henk717 | 750cc3d2dc | |
Henk | 0e06fc371f | |
Divided by Zer0 | 6426e3ca24 | |
Divided by Zer0 | 2de9672b95 | |
henk717 | c27faf56e6 | |
henk717 | 5962a6cb4f | |
Henk | 1378fe8beb | |
waffshappen | a0d4497c95 | |
waffshappen | d026bd79cb | |
Henk | cc01ad730a | |
Henk | b58daa1ba1 | |
henk717 | 661bd5c99e | |
Henk | 257a535be5 | |
Henk | 739cccd8ed | |
henk717 | e9cf9fa6d0 | |
henk717 | 031c06347f | |
henk717 | a185cbd015 | |
henk717 | a046db4ded | |
henk717 | 47a27fa906 | |
henk717 | 24f50d6fb7 | |
henk717 | 22acde1ab7 | |
Henk | e9859cf17d | |
Henk | 307fc97b9d | |
henk717 | 4a88e41d14 | |
henk717 | 1628b789d1 | |
Henk | 857476ef6b | |
Henk | 7fc5c46c1d | |
henk717 | 1dbc987048 | |
henk717 | a04f99891f | |
henk717 | 75fecb86cc | |
Gouvernathor | a4f49c097a | |
Gouvernathor | 55cf5f2f67 | |
henk717 | 23b2d3a99e | |
somebody | 9efbe381cf | |
henk717 | 0a926e41e4 | |
vfbd | 33ba3e7e27 | |
Henk | eeb1774d42 | |
Henk | 9a8e8a0005 | |
henk717 | dd7363548c | |
henk717 | 686845cd21 | |
somebody | e6656d68a1 | |
henk717 | 55ef53f39b | |
henk717 | 0b3e22ee13 | |
Henk | d0cb463c53 | |
henk717 | e8245478d6 | |
henk717 | f72ceeadd0 | |
henk717 | 04d9172fcd | |
vfbd | 9a3f0eaab2 |
|
@ -48,7 +48,7 @@ If you would like to play KoboldAI online for free on a powerful computer you ca
|
|||
|
||||
Each edition features different models and requires different hardware to run, this means that if you are unable to obtain a TPU or a GPU you might still be able to use the other version. The models you can use are listed underneath the edition. To open a Colab click the big link featuring the editions name.
|
||||
|
||||
## [TPU Edition Model Descriptions](https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/TPU.ipynb)
|
||||
## [Models the TPU can run:](https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/TPU.ipynb)
|
||||
|
||||
| Model | Style | Description |
|
||||
| --- | --- | --- |
|
||||
|
@ -64,22 +64,26 @@ Each edition features different models and requires different hardware to run, t
|
|||
| [Fairseq Dense](https://huggingface.co/KoboldAI/fairseq-dense-13B) | Generic | Trained by Facebook Researchers this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger 20B model from EleutherAI and considered as better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |
|
||||
| [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6B) by EleutherAI | Generic | This model serves as the basis for most other 6B models (Some being based on Fairseq Dense instead). Being trained on the Pile and not biased towards anything in particular it is suitable for a variety of tasks such as writing, Q&A and coding tasks. You will likely get better result with larger generic models or finetuned models. |
|
||||
|
||||
## [GPU Edition Model Descriptions](https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/GPU.ipynb)
|
||||
## [Models the Colab GPU can run:](https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/GPU.ipynb)
|
||||
|
||||
| Model | Style | Description |
|
||||
| --- | --- | --- |
|
||||
| [Nerys](https://huggingface.co/KoboldAI/fairseq-dense-2.7B-Nerys) by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (A newer Janeway), on top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown in the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model to. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing can be done best from the first or third person. |
|
||||
| [Erebus](https://huggingface.co/KoboldAI/OPT-2.7B-Erebus) by Mr Seeker | NSFW | Erebus is our community's flagship NSFW model, being a combination of multiple large datasets that include Literotica, Shinen and erotic novels from Nerys and featuring thourough tagging support it covers the vast majority of erotic writing styles. This model is capable of replacing both the Lit and Shinen models in terms of content and style and has been well received as (one of) the best NSFW models out there. If you wish to use this model for commercial or non research usage we recommend choosing the 20B version as that one is not subject to the restrictive OPT license. |
|
||||
| [Tiefighter 13B by KoboldAI](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) | Hybrid | Tiefighter 13B is a very versitile fiction Hybrid, it can write, chat and play adventure games and can also answer regular instructions (Although we do not recommend this model for factual use due to its fictional nature). This is an excellent starting model, for the best results avoid using Second person writing in your chats unless you are wanting it to become a text adventure.|
|
||||
| [Janeway](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Janeway) by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focussed on SFW, romantic scenes might involve a degree of nudity. |
|
||||
| [Picard](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Picard) by Mr Seeker | Novel | Picard is a model trained for SFW Novels based on Neo 2.7B. It is focused on Novel style writing without the NSFW bias. While the name suggests a sci-fi model this model is designed for Novels of a variety of genre's. It is meant to be used in KoboldAI's regular mode. |
|
||||
| [AID](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-AID) by melastacho | Adventure | Also know as Adventure 2.7B this is a clone of the AI Dungeon Classic model and is best known for the epic wackey adventures that AI Dungeon Classic players love. |
|
||||
| [Horni LN](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Horni-LN) by finetune | Novel | This model is based on Horni 2.7B and retains its NSFW knowledge, but was then further biased towards SFW novel stories. If you seek a balance between a SFW Novel model and a NSFW model this model should be a good choice. |
|
||||
| [Horni](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Horni) by finetune | NSFW | This model is tuned on Literotica to produce a Novel style model biased towards NSFW content. Can still be used for SFW stories but will have a bias towards NSFW content. It is meant to be used in KoboldAI's regular mode. |
|
||||
| [Shinen](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Shinen) by Mr Seeker | NSFW | Shinen is an alternative to the Horni model designed to be more explicit. If Horni is to tame for you Shinen might produce better results. While it is a Novel model it is unsuitable for SFW stories due to its heavy NSFW bias. Shinen will not hold back. It is meant to be used in KoboldAI's regular mode. |
|
||||
| [OPT](https://huggingface.co/facebook/opt-2.7b) by Metaseq | Generic | OPT is considered one of the best base models as far as content goes, its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo duplicate and unnecessary content has been left out, while additional literature was added in similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downfall of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |
|
||||
| [Fairseq Dense](https://huggingface.co/KoboldAI/fairseq-dense-2.7B) | Generic | Trained by Facebook Researchers this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger models from EleutherAI and considered as better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |
|
||||
| [MythoMax 13B](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) by Gryphe | Roleplay | An improved, potentially even perfected variant of MythoMix, my MythoLogic-L2 and Huginn merge using a highly experimental tensor type merge technique¹. |
|
||||
| [Holomax 13B by KoboldAI](https://huggingface.co/KoboldAI/LLaMA2-13B-Holomax) | Adventure | This is an expansion merge to the well-praised MythoMax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%). The goal of this model is to enhance story-writing capabilities while preserving the desirable traits of the MythoMax model as much as possible (It does limit chat reply length). |
|
||||
| [Airoboros 13B](https://huggingface.co/jondurbin/airoboros-13b) by Jon Durbin | Generic | This is an instruction fine-tuned llama-2 model, using synthetic instructions generated by airoboros⁵. |
|
||||
| [Emerhyst 13B](https://huggingface.co/Undi95/Emerhyst-13B) by Undi | Roleplay | An attempt using BlockMerge_Gradient to get better result. In addition, LimaRP v3 was used⁷. |
|
||||
| [Chronos 13B](https://huggingface.co/elinas/chronos-13b) by Elinas | Generic | This model is primarily focused on chat, roleplay, and storywriting, but can accomplish other tasks such as simple reasoning and coding. Chronos generates very long outputs with coherent text, largely due to the human inputs it was trained on. |
|
||||
| [Spring Dragon by Henk717](https://huggingface.co/Henk717/spring-dragon) | Adventure | This model is a recreation attempt of the AI Dungeon 2 Dragon model. To achieve this, the "text_adventures.txt" dataset was used, which was bundled with the original AI Dungeon 2 GitHub release prior to the online service. It is worth noting that the same dataset file was used to create the Dragon model, where Dragon is a GPT-3 175B Davinci model from 2020. |
|
||||
| [Holodeck By KoboldAI](https://huggingface.co/KoboldAI/LLAMA2-13B-Holodeck-1) | Adventure |LLAMA2 13B-Holodeck is a finetune created using Meta's llama 2 model.The training data contains around 3000 ebooks in various genres. Most parts of the dataset have been prepended using the following text: [Genre: <genre1>, <genre2>|
|
||||
| [Neo](https://huggingface.co/EleutherAI/gpt-neo-2.7B) by EleutherAI | Generic | This is the base model for all the other 2.7B models, it is best used when you have a use case that we have no other models available for, such as writing blog articles or programming. It can also be a good basis for the experience of some of the softprompts if your softprompt is not about a subject the other models cover. |
|
||||
|
||||
| [Various 2.7b models]() by various | Various smaller models are also possible to load in GPU colab. | |
|
||||
### Styles
|
||||
|
||||
| Type | Description |
|
||||
|
@ -105,7 +109,7 @@ KoboldAI has a large number of dependencies you will need to install on your com
|
|||
|
||||
### Downloading the latest version of KoboldAI
|
||||
|
||||
KoboldAI is a rolling release on our github, the code you see is also the game. You can the software by clicking on the green Code button at the top of the page and clicking Download ZIP.
|
||||
KoboldAI is a rolling release on our github, the code you see is also the game. You can download the software by clicking on the green Code button at the top of the page and clicking Download ZIP, or use the `git clone` command instead. Then, on Windows you need to you run install_requirements.bat (using admin mode is recommanded to avoid errors), and once it's done, or if you're on Linux, either play.bat/sh or remote-play.bat/sh to run it.
|
||||
|
||||
The easiest way for Windows users is to use the [offline installer](https://sourceforge.net/projects/koboldai/files/latest/download) below.
|
||||
|
||||
|
@ -228,4 +232,4 @@ Did we miss your contribution? Feel free to issue a commit adding your name to t
|
|||
|
||||
KoboldAI is licensed with a AGPL license, in short this means that it can be used by anyone for any purpose. However, if you decide to make a publicly available instance your users are entitled to a copy of the source code including all modifications that you have made (which needs to be available trough an interface such as a button on your website), you may also not distribute this project in a form that does not contain the source code (Such as compiling / encrypting the code and distributing this version without also distributing the source code that includes the changes that you made. You are allowed to distribute this in a closed form if you also provide a separate archive with the source code.).
|
||||
|
||||
umamba.exe is bundled for convenience because we observed that many of our users had trouble with command line download methods, it is not part of our project and does not fall under the AGPL license. It is licensed under the BSD-3-Clause license. Other files with differing licenses will have a reference or embedded version of this license within the file. It has been sourced from https://anaconda.org/conda-forge/micromamba/files and its source code can be found here : https://github.com/mamba-org/mamba/tree/master/micromamba
|
||||
umamba.exe is bundled for convenience because we observed that many of our users had trouble with command line download methods, it is not part of our project and does not fall under the AGPL license. It is licensed under the BSD-3-Clause license. Other files with differing licenses will have a reference or embedded version of this license within the file. It has been sourced from https://anaconda.org/conda-forge/micromamba/files and its source code can be found here : https://github.com/mamba-org/mamba/tree/master/micromamba
|
230
aiserver.py
230
aiserver.py
|
@ -165,13 +165,16 @@ model_menu = {
|
|||
],
|
||||
'nsfwlist': [
|
||||
["Erebus 20B (NSFW)", "KoboldAI/GPT-NeoX-20B-Erebus", "64GB", False],
|
||||
["Nerybus 13B (NSFW)", "KoboldAI/OPT-13B-Nerybus-Mix", "32GB", False],
|
||||
["Erebus 13B (NSFW)", "KoboldAI/OPT-13B-Erebus", "32GB", False],
|
||||
["Shinen FSD 13B (NSFW)", "KoboldAI/fairseq-dense-13B-Shinen", "32GB", False],
|
||||
["Nerybus 6.7B (NSFW)", "KoboldAI/OPT-6.7B-Nerybus-Mix", "16GB", False],
|
||||
["Erebus 6.7B (NSFW)", "KoboldAI/OPT-6.7B-Erebus", "16GB", False],
|
||||
["Shinen FSD 6.7B (NSFW)", "KoboldAI/fairseq-dense-6.7B-Shinen", "16GB", False],
|
||||
["Lit V2 6B (NSFW)", "hakurei/litv2-6B-rev3", "16GB", False],
|
||||
["Lit 6B (NSFW)", "hakurei/lit-6B", "16GB", False],
|
||||
["Shinen 6B (NSFW)", "KoboldAI/GPT-J-6B-Shinen", "16GB", False],
|
||||
["Nerybus 2.7B (NSFW)", "KoboldAI/OPT-2.7B-Nerybus-Mix", "8GB", False],
|
||||
["Erebus 2.7B (NSFW)", "KoboldAI/OPT-2.7B-Erebus", "8GB", False],
|
||||
["Horni 2.7B (NSFW)", "KoboldAI/GPT-Neo-2.7B-Horni", "8GB", False],
|
||||
["Shinen 2.7B (NSFW)", "KoboldAI/GPT-Neo-2.7B-Shinen", "8GB", False],
|
||||
|
@ -1511,7 +1514,7 @@ def get_model_info(model, directory=""):
|
|||
models_on_url = True
|
||||
url = True
|
||||
key = True
|
||||
default_url = 'https://koboldai.net'
|
||||
default_url = 'https://horde.koboldai.net'
|
||||
multi_online_models = True
|
||||
if path.exists(get_config_filename(model)):
|
||||
with open(get_config_filename(model), "r") as file:
|
||||
|
@ -1590,13 +1593,13 @@ def get_layer_count(model, directory=""):
|
|||
model = directory
|
||||
from transformers import AutoConfig
|
||||
if(os.path.isdir(model.replace('/', '_'))):
|
||||
model_config = AutoConfig.from_pretrained(model.replace('/', '_'), revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained(model.replace('/', '_'), revision=args.revision, cache_dir="cache")
|
||||
elif(os.path.isdir("models/{}".format(model.replace('/', '_')))):
|
||||
model_config = AutoConfig.from_pretrained("models/{}".format(model.replace('/', '_')), revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained("models/{}".format(model.replace('/', '_')), revision=args.revision, cache_dir="cache")
|
||||
elif(os.path.isdir(directory)):
|
||||
model_config = AutoConfig.from_pretrained(directory, revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained(directory, revision=args.revision, cache_dir="cache")
|
||||
else:
|
||||
model_config = AutoConfig.from_pretrained(model, revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained(model, revision=args.revision, cache_dir="cache")
|
||||
try:
|
||||
if ((utils.HAS_ACCELERATE and model_config.model_type != 'gpt2') or model_config.model_type in ("gpt_neo", "gptj", "xglm", "opt")) and not vars.nobreakmodel:
|
||||
return utils.num_layers(model_config)
|
||||
|
@ -1671,7 +1674,7 @@ def get_cluster_models(msg):
|
|||
# Get list of models from public cluster
|
||||
logger.init("KAI Horde Models", status="Retrieving")
|
||||
try:
|
||||
req = requests.get("{}/api/v1/models".format(url))
|
||||
req = requests.get(f"{url}/api/v2/status/models?type=text")
|
||||
except requests.exceptions.ConnectionError:
|
||||
logger.init_err("KAI Horde Models", status="Failed")
|
||||
logger.error("Provided KoboldAI Horde URL unreachable")
|
||||
|
@ -1687,10 +1690,11 @@ def get_cluster_models(msg):
|
|||
engines = req.json()
|
||||
logger.debug(engines)
|
||||
try:
|
||||
engines = [[en, en] for en in engines]
|
||||
engines = [[en["name"], en["name"]] for en in engines]
|
||||
except:
|
||||
logger.error(engines)
|
||||
raise
|
||||
logger.debug(engines)
|
||||
|
||||
online_model = ""
|
||||
changed=False
|
||||
|
@ -1936,34 +1940,26 @@ def patch_transformers():
|
|||
|
||||
from torch.nn import functional as F
|
||||
|
||||
class ProbabilityVisualizerLogitsProcessor(LogitsProcessor):
|
||||
def __init__(self):
|
||||
pass
|
||||
def visualize_probabilities(scores: torch.FloatTensor) -> None:
|
||||
assert scores.ndim == 2
|
||||
|
||||
def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
|
||||
assert scores.ndim == 2
|
||||
assert input_ids.ndim == 2
|
||||
if vars.numseqs > 1 or not vars.show_probs:
|
||||
return
|
||||
|
||||
if vars.numseqs > 1 or not vars.show_probs:
|
||||
return scores
|
||||
probs = F.softmax(scores, dim = -1).cpu().numpy()[0]
|
||||
token_prob_info = []
|
||||
for token_id, score in sorted(enumerate(probs), key=lambda x: x[1], reverse=True)[:8]:
|
||||
token_prob_info.append({
|
||||
"tokenId": token_id,
|
||||
"decoded": utils.decodenewlines(tokenizer.decode(token_id)),
|
||||
"score": float(score),
|
||||
})
|
||||
|
||||
probs = F.softmax(scores, dim = -1).cpu().numpy()[0]
|
||||
|
||||
token_prob_info = []
|
||||
for token_id, score in sorted(enumerate(probs), key=lambda x: x[1], reverse=True)[:8]:
|
||||
token_prob_info.append({
|
||||
"tokenId": token_id,
|
||||
"decoded": utils.decodenewlines(tokenizer.decode(token_id)),
|
||||
"score": float(score),
|
||||
})
|
||||
|
||||
vars.token_stream_queue.probability_buffer = token_prob_info
|
||||
return scores
|
||||
vars.token_stream_queue.probability_buffer = token_prob_info
|
||||
|
||||
def new_get_logits_processor(*args, **kwargs) -> LogitsProcessorList:
|
||||
processors = new_get_logits_processor.old_get_logits_processor(*args, **kwargs)
|
||||
processors.insert(0, LuaLogitsProcessor())
|
||||
processors.append(ProbabilityVisualizerLogitsProcessor())
|
||||
return processors
|
||||
new_get_logits_processor.old_get_logits_processor = transformers.generation_utils.GenerationMixin._get_logits_processor
|
||||
transformers.generation_utils.GenerationMixin._get_logits_processor = new_get_logits_processor
|
||||
|
@ -1985,6 +1981,7 @@ def patch_transformers():
|
|||
sampler_order = [6] + sampler_order
|
||||
for k in sampler_order:
|
||||
scores = self.__warper_list[k](input_ids, scores, *args, **kwargs)
|
||||
visualize_probabilities(scores)
|
||||
return scores
|
||||
|
||||
def new_get_logits_warper(beams: int = 1,) -> LogitsProcessorList:
|
||||
|
@ -2238,19 +2235,19 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
from transformers import AutoConfig
|
||||
if(os.path.isdir(vars.custmodpth.replace('/', '_'))):
|
||||
try:
|
||||
model_config = AutoConfig.from_pretrained(vars.custmodpth.replace('/', '_'), revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained(vars.custmodpth.replace('/', '_'), revision=args.revision, cache_dir="cache")
|
||||
vars.model_type = model_config.model_type
|
||||
except ValueError as e:
|
||||
vars.model_type = "not_found"
|
||||
elif(os.path.isdir("models/{}".format(vars.custmodpth.replace('/', '_')))):
|
||||
try:
|
||||
model_config = AutoConfig.from_pretrained("models/{}".format(vars.custmodpth.replace('/', '_')), revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained("models/{}".format(vars.custmodpth.replace('/', '_')), revision=args.revision, cache_dir="cache")
|
||||
vars.model_type = model_config.model_type
|
||||
except ValueError as e:
|
||||
vars.model_type = "not_found"
|
||||
else:
|
||||
try:
|
||||
model_config = AutoConfig.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
model_config = AutoConfig.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
vars.model_type = model_config.model_type
|
||||
except ValueError as e:
|
||||
vars.model_type = "not_found"
|
||||
|
@ -2387,6 +2384,7 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
with zipfile.ZipFile(f, "r") as z:
|
||||
try:
|
||||
last_storage_key = None
|
||||
zipfolder = os.path.basename(os.path.normpath(f)).split('.')[0]
|
||||
f = None
|
||||
current_offset = 0
|
||||
able_to_pin_layers = True
|
||||
|
@ -2398,7 +2396,10 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
last_storage_key = storage_key
|
||||
if isinstance(f, zipfile.ZipExtFile):
|
||||
f.close()
|
||||
f = z.open(f"archive/data/{storage_key}")
|
||||
try:
|
||||
f = z.open(f"archive/data/{storage_key}")
|
||||
except:
|
||||
f = z.open(f"{zipfolder}/data/{storage_key}")
|
||||
current_offset = 0
|
||||
if current_offset != model_dict[key].seek_offset:
|
||||
f.read(model_dict[key].seek_offset - current_offset)
|
||||
|
@ -2485,19 +2486,19 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
with(maybe_use_float16()):
|
||||
try:
|
||||
if os.path.exists(vars.custmodpth):
|
||||
model = GPT2LMHeadModel.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
model = GPT2LMHeadModel.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
elif os.path.exists(os.path.join("models/", vars.custmodpth)):
|
||||
model = GPT2LMHeadModel.from_pretrained(os.path.join("models/", vars.custmodpth), revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(os.path.join("models/", vars.custmodpth), revision=vars.revision, cache_dir="cache")
|
||||
model = GPT2LMHeadModel.from_pretrained(os.path.join("models/", vars.custmodpth), revision=args.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(os.path.join("models/", vars.custmodpth), revision=args.revision, cache_dir="cache")
|
||||
else:
|
||||
model = GPT2LMHeadModel.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
model = GPT2LMHeadModel.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
if("out of memory" in traceback.format_exc().lower()):
|
||||
raise RuntimeError("One of your GPUs ran out of memory when KoboldAI tried to load your model.")
|
||||
raise e
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
model.save_pretrained("models/{}".format(vars.model.replace('/', '_')), max_shard_size="500MiB")
|
||||
tokenizer.save_pretrained("models/{}".format(vars.model.replace('/', '_')))
|
||||
vars.modeldim = get_hidden_size_from_model(model)
|
||||
|
@ -2544,38 +2545,38 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
lowmem = {}
|
||||
if(os.path.isdir(vars.custmodpth)):
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache", use_fast=False)
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache", use_fast=False)
|
||||
except Exception as e:
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
try:
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
try:
|
||||
model = AutoModelForCausalLM.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache", **lowmem)
|
||||
model = AutoModelForCausalLM.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache", **lowmem)
|
||||
except Exception as e:
|
||||
if("out of memory" in traceback.format_exc().lower()):
|
||||
raise RuntimeError("One of your GPUs ran out of memory when KoboldAI tried to load your model.")
|
||||
model = GPTNeoForCausalLM.from_pretrained(vars.custmodpth, revision=vars.revision, cache_dir="cache", **lowmem)
|
||||
model = GPTNeoForCausalLM.from_pretrained(vars.custmodpth, revision=args.revision, cache_dir="cache", **lowmem)
|
||||
elif(os.path.isdir("models/{}".format(vars.model.replace('/', '_')))):
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=vars.revision, cache_dir="cache", use_fast=False)
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=args.revision, cache_dir="cache", use_fast=False)
|
||||
except Exception as e:
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
try:
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
try:
|
||||
model = AutoModelForCausalLM.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=vars.revision, cache_dir="cache", **lowmem)
|
||||
model = AutoModelForCausalLM.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=args.revision, cache_dir="cache", **lowmem)
|
||||
except Exception as e:
|
||||
if("out of memory" in traceback.format_exc().lower()):
|
||||
raise RuntimeError("One of your GPUs ran out of memory when KoboldAI tried to load your model.")
|
||||
model = GPTNeoForCausalLM.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=vars.revision, cache_dir="cache", **lowmem)
|
||||
model = GPTNeoForCausalLM.from_pretrained("models/{}".format(vars.model.replace('/', '_')), revision=args.revision, cache_dir="cache", **lowmem)
|
||||
else:
|
||||
old_rebuild_tensor = torch._utils._rebuild_tensor
|
||||
def new_rebuild_tensor(storage: Union[torch_lazy_loader.LazyTensor, torch.Storage], storage_offset, shape, stride):
|
||||
|
@ -2591,21 +2592,21 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
torch._utils._rebuild_tensor = new_rebuild_tensor
|
||||
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.model, revision=vars.revision, cache_dir="cache", use_fast=False)
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.model, revision=args.revision, cache_dir="cache", use_fast=False)
|
||||
except Exception as e:
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.model, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = AutoTokenizer.from_pretrained(vars.model, revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
try:
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.model, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained(vars.model, revision=args.revision, cache_dir="cache")
|
||||
except Exception as e:
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
try:
|
||||
model = AutoModelForCausalLM.from_pretrained(vars.model, revision=vars.revision, cache_dir="cache", **lowmem)
|
||||
model = AutoModelForCausalLM.from_pretrained(vars.model, revision=args.revision, cache_dir="cache", **lowmem)
|
||||
except Exception as e:
|
||||
if("out of memory" in traceback.format_exc().lower()):
|
||||
raise RuntimeError("One of your GPUs ran out of memory when KoboldAI tried to load your model.")
|
||||
model = GPTNeoForCausalLM.from_pretrained(vars.model, revision=vars.revision, cache_dir="cache", **lowmem)
|
||||
model = GPTNeoForCausalLM.from_pretrained(vars.model, revision=args.revision, cache_dir="cache", **lowmem)
|
||||
|
||||
torch._utils._rebuild_tensor = old_rebuild_tensor
|
||||
|
||||
|
@ -2622,10 +2623,10 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
import huggingface_hub
|
||||
legacy = packaging.version.parse(transformers_version) < packaging.version.parse("4.22.0.dev0")
|
||||
# Save the config.json
|
||||
shutil.move(os.path.realpath(huggingface_hub.hf_hub_download(vars.model, transformers.configuration_utils.CONFIG_NAME, revision=vars.revision, cache_dir="cache", local_files_only=True, legacy_cache_layout=legacy)), os.path.join("models/{}".format(vars.model.replace('/', '_')), transformers.configuration_utils.CONFIG_NAME))
|
||||
shutil.move(os.path.realpath(huggingface_hub.hf_hub_download(vars.model, transformers.configuration_utils.CONFIG_NAME, revision=args.revision, cache_dir="cache", local_files_only=True, legacy_cache_layout=legacy)), os.path.join("models/{}".format(vars.model.replace('/', '_')), transformers.configuration_utils.CONFIG_NAME))
|
||||
if(utils.num_shards is None):
|
||||
# Save the pytorch_model.bin of an unsharded model
|
||||
shutil.move(os.path.realpath(huggingface_hub.hf_hub_download(vars.model, transformers.modeling_utils.WEIGHTS_NAME, revision=vars.revision, cache_dir="cache", local_files_only=True, legacy_cache_layout=legacy)), os.path.join("models/{}".format(vars.model.replace('/', '_')), transformers.modeling_utils.WEIGHTS_NAME))
|
||||
shutil.move(os.path.realpath(huggingface_hub.hf_hub_download(vars.model, transformers.modeling_utils.WEIGHTS_NAME, revision=args.revision, cache_dir="cache", local_files_only=True, legacy_cache_layout=legacy)), os.path.join("models/{}".format(vars.model.replace('/', '_')), transformers.modeling_utils.WEIGHTS_NAME))
|
||||
else:
|
||||
with open(utils.from_pretrained_index_filename) as f:
|
||||
map_data = json.load(f)
|
||||
|
@ -2634,7 +2635,7 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
shutil.move(os.path.realpath(utils.from_pretrained_index_filename), os.path.join("models/{}".format(vars.model.replace('/', '_')), transformers.modeling_utils.WEIGHTS_INDEX_NAME))
|
||||
# Then save the pytorch_model-#####-of-#####.bin files
|
||||
for filename in filenames:
|
||||
shutil.move(os.path.realpath(huggingface_hub.hf_hub_download(vars.model, filename, revision=vars.revision, cache_dir="cache", local_files_only=True, legacy_cache_layout=legacy)), os.path.join("models/{}".format(vars.model.replace('/', '_')), filename))
|
||||
shutil.move(os.path.realpath(huggingface_hub.hf_hub_download(vars.model, filename, revision=args.revision, cache_dir="cache", local_files_only=True, legacy_cache_layout=legacy)), os.path.join("models/{}".format(vars.model.replace('/', '_')), filename))
|
||||
shutil.rmtree("cache/")
|
||||
|
||||
if(vars.badwordsids is vars.badwordsids_default and vars.model_type not in ("gpt2", "gpt_neo", "gptj")):
|
||||
|
@ -2680,7 +2681,7 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
|
||||
else:
|
||||
from transformers import GPT2Tokenizer
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
else:
|
||||
from transformers import PreTrainedModel
|
||||
from transformers import modeling_utils
|
||||
|
@ -2779,11 +2780,11 @@ def load_model(use_gpu=True, gpu_layers=None, disk_layers=None, initial_load=Fal
|
|||
# If we're running Colab or OAI, we still need a tokenizer.
|
||||
if(vars.model in ("Colab", "API", "CLUSTER")):
|
||||
from transformers import GPT2Tokenizer
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B", revision=args.revision, cache_dir="cache")
|
||||
loadsettings()
|
||||
elif(vars.model == "OAI"):
|
||||
from transformers import GPT2Tokenizer
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
loadsettings()
|
||||
# Load the TPU backend if requested
|
||||
elif(vars.use_colab_tpu or vars.model in ("TPUMeshTransformerGPTJ", "TPUMeshTransformerGPTNeoX")):
|
||||
|
@ -3040,7 +3041,7 @@ def lua_decode(tokens):
|
|||
if("tokenizer" not in globals()):
|
||||
from transformers import GPT2Tokenizer
|
||||
global tokenizer
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
return utils.decodenewlines(tokenizer.decode(tokens))
|
||||
|
||||
#==================================================================#
|
||||
|
@ -3052,7 +3053,7 @@ def lua_encode(string):
|
|||
if("tokenizer" not in globals()):
|
||||
from transformers import GPT2Tokenizer
|
||||
global tokenizer
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
return tokenizer.encode(utils.encodenewlines(string), max_length=int(4e9), truncation=True)
|
||||
|
||||
#==================================================================#
|
||||
|
@ -4201,19 +4202,19 @@ def actionsubmit(data, actionmode=0, force_submit=False, force_prompt_gen=False,
|
|||
try:
|
||||
if(os.path.isdir(tokenizer_id)):
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=args.revision, cache_dir="cache")
|
||||
except:
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=vars.revision, cache_dir="cache", use_fast=False)
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=args.revision, cache_dir="cache", use_fast=False)
|
||||
elif(os.path.isdir("models/{}".format(tokenizer_id.replace('/', '_')))):
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(tokenizer_id.replace('/', '_')), revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(tokenizer_id.replace('/', '_')), revision=args.revision, cache_dir="cache")
|
||||
except:
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(tokenizer_id.replace('/', '_')), revision=vars.revision, cache_dir="cache", use_fast=False)
|
||||
tokenizer = AutoTokenizer.from_pretrained("models/{}".format(tokenizer_id.replace('/', '_')), revision=args.revision, cache_dir="cache", use_fast=False)
|
||||
else:
|
||||
try:
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=args.revision, cache_dir="cache")
|
||||
except:
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=vars.revision, cache_dir="cache", use_fast=False)
|
||||
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id, revision=args.revision, cache_dir="cache", use_fast=False)
|
||||
except:
|
||||
logger.warning(f"Unknown tokenizer {repr(tokenizer_id)}")
|
||||
vars.api_tokenizer_id = tokenizer_id
|
||||
|
@ -4625,7 +4626,7 @@ def calcsubmitbudget(actionlen, winfo, mem, anotetxt, actions, submission=None,
|
|||
if("tokenizer" not in globals()):
|
||||
from transformers import GPT2Tokenizer
|
||||
global tokenizer
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=vars.revision, cache_dir="cache")
|
||||
tokenizer = GPT2Tokenizer.from_pretrained("gpt2", revision=args.revision, cache_dir="cache")
|
||||
|
||||
lnheader = len(tokenizer._koboldai_header)
|
||||
|
||||
|
@ -5272,15 +5273,21 @@ def sendtocluster(txt, min, max):
|
|||
cluster_metadata = {
|
||||
'prompt': txt,
|
||||
'params': reqdata,
|
||||
'api_key': vars.apikey,
|
||||
'models': vars.cluster_requested_models,
|
||||
}
|
||||
'trusted_workers': False,
|
||||
}
|
||||
client_agent = "KoboldAI:1.19.3:koboldai.org"
|
||||
cluster_headers = {
|
||||
'apikey': vars.apikey,
|
||||
"Client-Agent": client_agent
|
||||
}
|
||||
logger.debug(f"Horde Payload: {cluster_metadata}")
|
||||
try:
|
||||
# Create request
|
||||
req = requests.post(
|
||||
vars.colaburl[:-8] + "/api/v1/generate/sync",
|
||||
vars.colaburl[:-8] + "/api/v2/generate/text/async",
|
||||
json=cluster_metadata,
|
||||
headers=cluster_headers,
|
||||
)
|
||||
except requests.exceptions.ConnectionError:
|
||||
errmsg = f"Horde unavailable. Please try again later"
|
||||
|
@ -5308,13 +5315,76 @@ def sendtocluster(txt, min, max):
|
|||
emit('from_server', {'cmd': 'errmsg', 'data': errmsg}, broadcast=True)
|
||||
set_aibusy(0)
|
||||
return
|
||||
gen_servers = [(cgen['server_name'],cgen['server_id']) for cgen in js]
|
||||
logger.info(f"Generations by: {gen_servers}")
|
||||
|
||||
request_id = js["id"]
|
||||
logger.debug("Horde Request ID: {}".format(request_id))
|
||||
|
||||
cluster_agent_headers = {
|
||||
"Client-Agent": client_agent
|
||||
}
|
||||
finished = False
|
||||
|
||||
while not finished:
|
||||
try:
|
||||
req = requests.get(vars.colaburl[:-8] + "/api/v2/generate/text/status/" + request_id, headers=cluster_agent_headers)
|
||||
except requests.exceptions.ConnectionError:
|
||||
errmsg = f"Horde unavailable. Please try again later"
|
||||
logger.error(errmsg)
|
||||
emit('from_server', {'cmd': 'errmsg', 'data': errmsg}, broadcast=True)
|
||||
set_aibusy(0)
|
||||
return
|
||||
|
||||
if not req.ok:
|
||||
errmsg = f"KoboldAI API Error: Failed to get a standard reply from the Horde. Please check the console."
|
||||
logger.error(req.text)
|
||||
emit('from_server', {'cmd': 'errmsg', 'data': errmsg}, broadcast=True)
|
||||
set_aibusy(0)
|
||||
return
|
||||
|
||||
try:
|
||||
req_status = req.json()
|
||||
except requests.exceptions.JSONDecodeError:
|
||||
errmsg = f"Unexpected message received from the KoboldAI Horde: '{req.text}'"
|
||||
logger.error(errmsg)
|
||||
emit('from_server', {'cmd': 'errmsg', 'data': errmsg}, broadcast=True)
|
||||
set_aibusy(0)
|
||||
return
|
||||
|
||||
if "done" not in req_status:
|
||||
errmsg = f"Unexpected response received from the KoboldAI Horde: '{js}'"
|
||||
logger.error(errmsg)
|
||||
emit('from_server', {'cmd': 'errmsg', 'data': errmsg}, broadcast=True)
|
||||
set_aibusy(0)
|
||||
return
|
||||
|
||||
finished = req_status["done"]
|
||||
|
||||
if not finished:
|
||||
logger.debug(req_status)
|
||||
time.sleep(1)
|
||||
|
||||
logger.debug("Last Horde Status Message: {}".format(js))
|
||||
if req_status["faulted"]:
|
||||
errmsg = "Horde Text generation faulted! Please try again"
|
||||
logger.error(errmsg)
|
||||
emit('from_server', {'cmd': 'errmsg', 'data': errmsg}, broadcast=True)
|
||||
set_aibusy(0)
|
||||
return
|
||||
|
||||
generations = req_status['generations']
|
||||
gen_workers = [(cgen['worker_name'],cgen['worker_id']) for cgen in generations]
|
||||
logger.info(f"Generations by: {gen_workers}")
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
# Just in case we want to announce it to the user
|
||||
if len(js) == 1:
|
||||
warnmsg = f"Text generated by {js[0]['server_name']}"
|
||||
if len(generations) == 1:
|
||||
warnmsg = f"Text generated by {[w[0] for w in gen_workers]}"
|
||||
emit('from_server', {'cmd': 'warnmsg', 'data': warnmsg}, broadcast=True)
|
||||
genout = [cgen['text'] for cgen in js]
|
||||
genout = [cgen['text'] for cgen in generations]
|
||||
|
||||
for i in range(vars.numseqs):
|
||||
vars.lua_koboldbridge.outputs[i+1] = genout[i]
|
||||
|
|
237
colab/GPU.ipynb
237
colab/GPU.ipynb
|
@ -1,23 +1,4 @@
|
|||
{
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0,
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"name": "ColabKobold GPU",
|
||||
"private_outputs": true,
|
||||
"provenance": [],
|
||||
"collapsed_sections": [],
|
||||
"include_colab_link": true
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
},
|
||||
"accelerator": "GPU"
|
||||
},
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
|
@ -35,60 +16,99 @@
|
|||
"id": "kX9y5koxa58q"
|
||||
},
|
||||
"source": [
|
||||
"## [You can get faster generations and higher context with our Koboldcpp Notebook](https://koboldai.org/colabcpp)\n",
|
||||
"\n",
|
||||
"# Welcome to KoboldAI on Google Colab, GPU Edition!\n",
|
||||
"KoboldAI is a powerful and easy way to use a variety of AI based text generation experiences. You can use it to write stories, blog posts, play a text adventure game, use it like a chatbot and more! In some cases it might even help you with an assignment or programming task (But always make sure the information the AI mentions is correct, it loves to make stuff up).\n",
|
||||
"\n",
|
||||
"For more information about KoboldAI check our our Github readme : https://github.com/KoboldAI/KoboldAI-Client/blob/main/readme.md\n",
|
||||
"\n",
|
||||
"For the larger AI models (That are typically more coherent) check out our **[TPU edition](https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/TPU.ipynb)**!"
|
||||
"---\n",
|
||||
"## How to load KoboldAI: Everything you need to know\n",
|
||||
"1. On a phone? First put your browser in desktop mode because of a Google Colab bug. Otherwise nothing will happen when you click the play button. Then tap the play button next to \"<-- Tap This if you play on Mobile\", you will see an audio player. Keep the audio player playing so Colab does not get shut down in the background.\n",
|
||||
"2. Select the desired model, you will find a description of all the available models further down the page.\n",
|
||||
"3. Click the play button next to \"<-- Select your model below and then click this to start KoboldAI\".\n",
|
||||
"4. Got a message saying no accelerator is available? Click cancel, and try again in a few minutes. If you do not manage to get a session when you frequently try again try at a different time of day, colab can be busy or your priority may have been lowered by frequent usage.\n",
|
||||
"5. After everything is done loading you will get a link that you can use to open KoboldAI. In case of Localtunnel you will also be warned that some people are abusing Localtunnel for phishing, once you acknowledge this warning you will be taken to KoboldAI's interface. If you picked Cloudflare and get a 1033 error refresh the error page after waiting one minute.\n",
|
||||
"\n",
|
||||
"---\n",
|
||||
"\n",
|
||||
"Further down the page you can find descriptions of the models, and tips to get the most out of your Google Colab experience.\n",
|
||||
"\n",
|
||||
"Make sure to keep this page open while you are using KoboldAI, and check back regularly to see if you got a Captcha. Failure to complete the captcha's in time can result in termination of your session or a lower priority towards the TPUs.\n",
|
||||
"\n",
|
||||
"Firefox users need to disable the enhanced tracking protection or use a different browser in order to be able to use Google Colab without errors (This is not something we can do anything about, the cookie blocker breaks the Google Drive integration because it uses different domains)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "ewkXkyiFP2Hq"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#@title <-- Tap this if you play on Mobile { display-mode: \"form\" }\n",
|
||||
"%%html\n",
|
||||
"<b>Press play on the music player to keep the tab alive, then start KoboldAI below (Uses only 13MB of data)</b><br/>\n",
|
||||
"<audio src=\"https://henk.tech/colabkobold/silence.m4a\" controls>"
|
||||
],
|
||||
"execution_count": null,
|
||||
"outputs": []
|
||||
"<audio src=\"https://raw.githubusercontent.com/KoboldAI/KoboldAI-Client/main/colab/silence.m4a\" controls>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "lVftocpwCoYw",
|
||||
"cellView": "form"
|
||||
"cellView": "form",
|
||||
"id": "lVftocpwCoYw"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#@title <b><-- Select your model below and then click this to start KoboldAI</b>\n",
|
||||
"#@markdown You can find a description of the models below along with instructions on how to start KoboldAI.\n",
|
||||
"\n",
|
||||
"Model = \"Nerys 2.7B\" #@param [\"Nerys 2.7B\", \"AID 2.7B\", \"Erebus 2.7B\", \"Janeway 2.7B\", \"Picard 2.7B\", \"Horni LN 2.7B\", \"Horni 2.7B\", \"Shinen 2.7B\", \"OPT 2.7B\", \"Fairseq Dense 2.7B\", \"Neo 2.7B\"] {allow-input: true}\n",
|
||||
"Model = \"Nerys V2 6B\" #@param [\"Tiefighter 13B (United)\", \"Echidna 13B (United)\", \"HoloMax 13B (United)\", \"Emerhyst 13B (United)\", \"MythoMax 13B (United)\", \"Huginn 13B (United)\", \"Chronos 13B (United)\", \"Airoboros M2.0 13B (United)\", \"Holodeck 13B (United)\", \"Spring Dragon 13B (United)\", \"Nerys V2 6B\", \"Skein 6B\", \"Janeway 6B\", \"Adventure 6B\", \"Nerys 2.7B\", \"AID 2.7B\", \"Janeway 2.7B\", \"Picard 2.7B\", \"OPT 2.7B\", \"Fairseq Dense 2.7B\", \"Neo 2.7B\"] {allow-input: true}\n",
|
||||
"Revision = \"\" #@param [\"\"]{allow-input: true}\n",
|
||||
"Version = \"Official\" #@param [\"Official\", \"United\"] {allow-input: true}\n",
|
||||
"Provider = \"Localtunnel\" #@param [\"Localtunnel\", \"Cloudflare\"]\n",
|
||||
"use_google_drive = True #@param {type:\"boolean\"}\n",
|
||||
"Provider = \"Cloudflare\" #@param [\"Localtunnel\", \"Cloudflare\"]\n",
|
||||
"use_google_drive = True #@param {type:\"boolean\"}\n",
|
||||
"\n",
|
||||
"import os\n",
|
||||
"if not os.path.isfile(\"/opt/bin/nvidia-smi\"):\n",
|
||||
" raise RuntimeError(\"⚠️Colab did not give you a GPU due to usage limits, this can take a few hours before they let you back in. Check out https://lite.koboldai.net for a free alternative (that does not provide an API link but can load KoboldAI saves and chat cards) or subscribe to Colab Pro for immediate access.⚠️\")\n",
|
||||
"\n",
|
||||
"!nvidia-smi\n",
|
||||
"from google.colab import drive\n",
|
||||
"if use_google_drive:\n",
|
||||
" drive.mount('/content/drive/')\n",
|
||||
"else:\n",
|
||||
" import os\n",
|
||||
" if not os.path.exists(\"/content/drive\"):\n",
|
||||
" os.mkdir(\"/content/drive\")\n",
|
||||
" if not os.path.exists(\"/content/drive/MyDrive/\"):\n",
|
||||
" os.mkdir(\"/content/drive/MyDrive/\")\n",
|
||||
" drive.mount('/content/drive/')\n",
|
||||
"else:\n",
|
||||
" import os\n",
|
||||
" if not os.path.exists(\"/content/drive\"):\n",
|
||||
" os.mkdir(\"/content/drive\")\n",
|
||||
" if not os.path.exists(\"/content/drive/MyDrive/\"):\n",
|
||||
" os.mkdir(\"/content/drive/MyDrive/\")\n",
|
||||
"\n",
|
||||
"if Model == \"Nerys 2.7B\":\n",
|
||||
" Model = \"KoboldAI/fairseq-dense-2.7B-Nerys\"\n",
|
||||
"if Model == \"Nerys V2 6B\":\n",
|
||||
" Model = \"KoboldAI/OPT-6B-nerys-v2\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Erebus 2.7B\":\n",
|
||||
" Model = \"KoboldAI/OPT-2.7B-Erebus\"\n",
|
||||
"elif Model == \"Skein 6B\":\n",
|
||||
" Model = \"KoboldAI/GPT-J-6B-Skein\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Janeway 6B\":\n",
|
||||
" Model = \"KoboldAI/GPT-J-6B-Janeway\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Adventure 6B\":\n",
|
||||
" Model = \"KoboldAI/GPT-J-6B-Adventure\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Shinen 6B\":\n",
|
||||
" Model = \"KoboldAI/GPT-J-6B-Shinen\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Nerys 2.7B\":\n",
|
||||
" Model = \"KoboldAI/fairseq-dense-2.7B-Nerys\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Janeway 2.7B\":\n",
|
||||
|
@ -103,18 +123,6 @@
|
|||
" Model = \"KoboldAI/GPT-Neo-2.7B-AID\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Horni LN 2.7B\":\n",
|
||||
" Model = \"KoboldAI/GPT-Neo-2.7B-Horni-LN\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Horni 2.7B\":\n",
|
||||
" Model = \"KoboldAI/GPT-Neo-2.7B-Horni\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Shinen 2.7B\":\n",
|
||||
" Model = \"KoboldAI/GPT-Neo-2.7B-Shinen\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Fairseq Dense 2.7B\":\n",
|
||||
" Model = \"KoboldAI/fairseq-dense-2.7B\"\n",
|
||||
" path = \"\"\n",
|
||||
|
@ -127,55 +135,95 @@
|
|||
" Model = \"EleutherAI/gpt-neo-2.7B\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Tiefighter 13B (United)\":\n",
|
||||
" Model = \"KoboldAI/LLaMA2-13B-Tiefighter\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Echidna 13B (United)\":\n",
|
||||
" Model = \"NeverSleep/Echidna-13b-v0.3\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Huginn 13B (United)\":\n",
|
||||
" Model = \"The-Face-Of-Goonery/Huginn-13b-v1.2\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Chronos 13B (United)\":\n",
|
||||
" Model = \"elinas/chronos-13b-v2\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Airoboros M2.0 13B (United)\":\n",
|
||||
" Model = \"jondurbin/airoboros-l2-13b-gpt4-m2.0\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Emerhyst 13B (United)\":\n",
|
||||
" Model = \"Undi95/Emerhyst-13B\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"MythoMax 13B (United)\":\n",
|
||||
" Model = \"Gryphe/MythoMax-L2-13b\"\n",
|
||||
" Revision = \"\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Spring Dragon 13B (United)\":\n",
|
||||
" Model = \"Henk717/spring-dragon\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"Holodeck 13B (United)\":\n",
|
||||
" Model = \"KoboldAI/LLAMA2-13B-Holodeck-1\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"elif Model == \"HoloMax 13B (United)\":\n",
|
||||
" Model = \"KoboldAI/LLaMA2-13B-Holomax\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
" Version = \"United\"\n",
|
||||
"\n",
|
||||
"if Provider == \"Localtunnel\":\n",
|
||||
" tunnel = \"--localtunnel yes\"\n",
|
||||
"else:\n",
|
||||
" tunnel = \"\"\n",
|
||||
"\n",
|
||||
"!wget https://koboldai.org/ckds -O - | bash /dev/stdin -m $Model -g $Version $tunnel"
|
||||
],
|
||||
"execution_count": null,
|
||||
"outputs": []
|
||||
"!wget https://koboldai.org/ckds -O - | bash /dev/stdin -m $Model -g $Version $Revision $tunnel"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "Lrm840I33hkC"
|
||||
},
|
||||
"source": [
|
||||
"# GPU Edition Model Descriptions\n",
|
||||
"| Model | Style | Description |\n",
|
||||
"| --- | --- | --- |\n",
|
||||
"| [Nerys](https://huggingface.co/KoboldAI/fairseq-dense-2.7B-Nerys) by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (A newer Janeway), on top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown in the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model to. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing can be done best from the first or third person. |\n",
|
||||
"| [Erebus](https://huggingface.co/KoboldAI/OPT-2.7B-Erebus) by Mr Seeker | NSFW | Erebus is our community's flagship NSFW model, being a combination of multiple large datasets that include Literotica, Shinen and erotic novels from Nerys and featuring thourough tagging support it covers the vast majority of erotic writing styles. This model is capable of replacing both the Lit and Shinen models in terms of content and style and has been well received as (one of) the best NSFW models out there. If you wish to use this model for commercial or non research usage we recommend choosing the 20B version as that one is not subject to the restrictive OPT license. |\n",
|
||||
"| [Tiefighter 13B by KoboldAI](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) | Hybrid | Tiefighter 13B is a very versitile fiction Hybrid, it can write, chat and play adventure games and can also answer regular instructions (Although we do not recommend this model for factual use due to its fictional nature). This is an excellent starting model, for the best results avoid using Second person writing in your chats unless you are wanting it to become a text adventure.|\n",
|
||||
"| [Janeway](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Janeway) by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focussed on SFW, romantic scenes might involve a degree of nudity. |\n",
|
||||
"| [Picard](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Picard) by Mr Seeker | Novel | Picard is a model trained for SFW Novels based on Neo 2.7B. It is focused on Novel style writing without the NSFW bias. While the name suggests a sci-fi model this model is designed for Novels of a variety of genre's. It is meant to be used in KoboldAI's regular mode. |\n",
|
||||
"| [AID](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-AID) by melastacho | Adventure | Also know as Adventure 2.7B this is a clone of the AI Dungeon Classic model and is best known for the epic wackey adventures that AI Dungeon Classic players love. |\n",
|
||||
"| [Horni LN](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Horni-LN) by finetune | Novel | This model is based on Horni 2.7B and retains its NSFW knowledge, but was then further biased towards SFW novel stories. If you seek a balance between a SFW Novel model and a NSFW model this model should be a good choice. |\n",
|
||||
"| [Horni](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Horni) by finetune | NSFW | This model is tuned on Literotica to produce a Novel style model biased towards NSFW content. Can still be used for SFW stories but will have a bias towards NSFW content. It is meant to be used in KoboldAI's regular mode. |\n",
|
||||
"| [Shinen](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Shinen) by Mr Seeker | NSFW | Shinen is an alternative to the Horni model designed to be more explicit. If Horni is to tame for you Shinen might produce better results. While it is a Novel model it is unsuitable for SFW stories due to its heavy NSFW bias. Shinen will not hold back. It is meant to be used in KoboldAI's regular mode. |\n",
|
||||
"| [OPT](https://huggingface.co/facebook/opt-2.7b) by Metaseq | Generic | OPT is considered one of the best base models as far as content goes, its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo duplicate and unnecessary content has been left out, while additional literature was added in similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downfall of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |\n",
|
||||
"| [Fairseq Dense](https://huggingface.co/KoboldAI/fairseq-dense-2.7B) | Generic | Trained by Facebook Researchers this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger models from EleutherAI and considered as better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |\n",
|
||||
"| [MythoMax 13B](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) by Gryphe | Roleplay | An improved, potentially even perfected variant of MythoMix, my MythoLogic-L2 and Huginn merge using a highly experimental tensor type merge technique¹. |\n",
|
||||
"| [Holomax 13B by KoboldAI](https://huggingface.co/KoboldAI/LLaMA2-13B-Holomax) | Adventure | This is an expansion merge to the well-praised MythoMax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%). The goal of this model is to enhance story-writing capabilities while preserving the desirable traits of the MythoMax model as much as possible (It does limit chat reply length). |\n",
|
||||
"| [Airoboros 13B](https://huggingface.co/jondurbin/airoboros-13b) by Jon Durbin | Generic | This is an instruction fine-tuned llama-2 model, using synthetic instructions generated by airoboros⁵. |\n",
|
||||
"| [Emerhyst 13B](https://huggingface.co/Undi95/Emerhyst-13B) by Undi | Roleplay | An attempt using BlockMerge_Gradient to get better result. In addition, LimaRP v3 was used⁷. |\n",
|
||||
"| [Chronos 13B](https://huggingface.co/elinas/chronos-13b) by Elinas | Generic | This model is primarily focused on chat, roleplay, and storywriting, but can accomplish other tasks such as simple reasoning and coding. Chronos generates very long outputs with coherent text, largely due to the human inputs it was trained on. |\n",
|
||||
"| [Spring Dragon by Henk717](https://huggingface.co/Henk717/spring-dragon) | Adventure | This model is a recreation attempt of the AI Dungeon 2 Dragon model. To achieve this, the \"text_adventures.txt\" dataset was used, which was bundled with the original AI Dungeon 2 GitHub release prior to the online service. It is worth noting that the same dataset file was used to create the Dragon model, where Dragon is a GPT-3 175B Davinci model from 2020. |\n",
|
||||
"| [Holodeck By KoboldAI](https://huggingface.co/KoboldAI/LLAMA2-13B-Holodeck-1) | Adventure |LLAMA2 13B-Holodeck is a finetune created using Meta's llama 2 model.The training data contains around 3000 ebooks in various genres. Most parts of the dataset have been prepended using the following text: [Genre: <genre1>, <genre2>|\n",
|
||||
"| [Neo](https://huggingface.co/EleutherAI/gpt-neo-2.7B) by EleutherAI | Generic | This is the base model for all the other 2.7B models, it is best used when you have a use case that we have no other models available for, such as writing blog articles or programming. It can also be a good basis for the experience of some of the softprompts if your softprompt is not about a subject the other models cover. |\n",
|
||||
"\n",
|
||||
"# [TPU Edition Model Descriptions](https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/TPU.ipynb)\n",
|
||||
"\n",
|
||||
"| Model | Style | Description |\n",
|
||||
"| --- | --- | --- |\n",
|
||||
"| [Nerys](https://huggingface.co/KoboldAI/fairseq-dense-13B-Nerys) by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (A newer Janeway), on top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown in the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model to. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing can be done best from the first or third person. |\n",
|
||||
"| [Erebus](https://huggingface.co/KoboldAI/OPT-13B-Erebus) by Mr Seeker | NSFW | Erebus is our community's flagship NSFW model, being a combination of multiple large datasets that include Literotica, Shinen and erotic novels from Nerys and featuring thourough tagging support it covers the vast majority of erotic writing styles. This model is capable of replacing both the Lit and Shinen models in terms of content and style and has been well received as (one of) the best NSFW models out there. If you wish to use this model for commercial or non research usage we recommend choosing the 20B version as that one is not subject to the restrictive OPT license. |\n",
|
||||
"| [Janeway](https://huggingface.co/KoboldAI/fairseq-dense-13B-Janeway) by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focussed on SFW, romantic scenes might involve a degree of nudity. |\n",
|
||||
"| [Shinen](https://huggingface.co/KoboldAI/fairseq-dense-13B-Shinen) by Mr Seeker | NSFW | Shinen is an NSFW model trained on a variety of stories from the website Sexstories it contains many different kinks. It has been merged into the larger (and better) Erebus model. |\n",
|
||||
"| [Skein](https://huggingface.co/KoboldAI/GPT-J-6B-Skein) by VE\\_FORBRYDERNE | Adventure | Skein is best used with Adventure mode enabled, it consists of a 4 times larger adventure dataset than the Adventure model making it excellent for text adventure gaming. On top of that it also consists of light novel training further expanding its knowledge and writing capabilities. It can be used with the You filter bias if you wish to write Novels with it, but dedicated Novel models can perform better for this task. |\n",
|
||||
"| [Adventure](https://huggingface.co/KoboldAI/GPT-J-6B-Adventure) by VE\\_FORBRYDERNE | Adventure | Adventure is a 6B model designed to mimick the behavior of AI Dungeon. It is exclusively for Adventure Mode and can take you on the epic and wackey adventures that AI Dungeon players love. It also features the many tropes of AI Dungeon as it has been trained on very similar data. It must be used in second person (You). |\n",
|
||||
"| [Lit](https://huggingface.co/hakurei/lit-6B) ([V2](https://huggingface.co/hakurei/litv2-6B-rev3)) by Haru | NSFW | Lit is a great NSFW model trained by Haru on both a large set of Literotica stories and high quality novels along with tagging support. Creating a high quality model for your NSFW stories. This model is exclusively a novel model and is best used in third person. |\n",
|
||||
"| [OPT](https://huggingface.co/facebook/opt-13b) by Metaseq | Generic | OPT is considered one of the best base models as far as content goes, its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo duplicate and unnecessary content has been left out, while additional literature was added in similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downfall of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |\n",
|
||||
"| [Neo(X)](https://huggingface.co/EleutherAI/gpt-neox-20b) by EleutherAI | Generic | NeoX is the largest EleutherAI model currently available, being a generic model it is not particularly trained towards anything and can do a variety of writing, Q&A and coding tasks. 20B's performance is closely compared to the 13B models and it is worth trying both especially if you have a task that does not involve english writing. Its behavior will be similar to the GPT-J-6B model since they are trained on the same dataset but with more sensitivity towards repetition penalty and with more knowledge. |\n",
|
||||
"| [Fairseq Dense](https://huggingface.co/KoboldAI/fairseq-dense-13B) | Generic | Trained by Facebook Researchers this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger 20B model from EleutherAI and considered as better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |\n",
|
||||
"| [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6B) by EleutherAI | Generic | This model serves as the basis for most other 6B models (Some being based on Fairseq Dense instead). Being trained on the Pile and not biased towards anything in particular it is suitable for a variety of tasks such as writing, Q&A and coding tasks. You will likely get better result with larger generic models or finetuned models. |\n",
|
||||
"\n",
|
||||
"| Style | Description |\n",
|
||||
"| --------- | ------------------------------------------------------------ |\n",
|
||||
"| Novel | For regular story writing, not compatible with Adventure mode or other specialty modes. |\n",
|
||||
"| NSFW | Indicates that the model is strongly biased towards NSFW content and is not suitable for children, work environments or livestreaming. Most NSFW models are also Novel models in nature. |\n",
|
||||
"| Adventure | These models are excellent for people willing to play KoboldAI like a Text Adventure game and are meant to be used with Adventure mode enabled. Even if you wish to use it as a Novel style model you should always have Adventure mode on and set it to story. These models typically have a strong bias towards the use of the word You and without Adventure mode enabled break the story flow and write actions on your behalf. |\n",
|
||||
"| Generic | Generic models are not trained towards anything specific, typically used as a basis for other tasks and models. They can do everything the other models can do, but require much more handholding to work properly. Generic models are an ideal basis for tasks that we have no specific model for, or for experiencing a softprompt in its raw form. |\n",
|
||||
"\n",
|
||||
|
@ -191,10 +239,39 @@
|
|||
"7. As you play KoboldAI, keep this Colab tab open in the background and check occationally for Captcha's so they do not shut your instance down. If you do get shut down you can always download a copy of your gamesave in the Save menu inside KoboldAI. Stories are never lost as long as you keep KoboldAI open in your browser.\n",
|
||||
"\n",
|
||||
"Get a error message saying you do not have access to a GPU/TPU instance? Do not continue and try again later, KoboldAI will not run correctly without them."
|
||||
],
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "Lrm840I33hkC"
|
||||
}
|
||||
"cellView": "form",
|
||||
"id": "5k8fK4F6UiTs"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#@title <b>Model Cleaner</b>\n",
|
||||
"#@markdown Out of space? Run this to remove all cached models (Google Drive models are not effected).\n",
|
||||
"!rm -rf /content/KoboldAI-Client/cache/*\n"
|
||||
]
|
||||
}
|
||||
]
|
||||
],
|
||||
"metadata": {
|
||||
"accelerator": "GPU",
|
||||
"colab": {
|
||||
"name": "ColabKobold GPU",
|
||||
"private_outputs": true,
|
||||
"provenance": [],
|
||||
"include_colab_link": true
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
|
@ -46,7 +46,7 @@
|
|||
"#@title <-- Tap this if you play on Mobile { display-mode: \"form\" }\n",
|
||||
"%%html\n",
|
||||
"<b>Press play on the music player to keep the tab alive, then start KoboldAI below (Uses only 13MB of data)</b><br/>\n",
|
||||
"<audio src=\"https://henk.tech/colabkobold/silence.m4a\" controls>"
|
||||
"<audio src=\"https://raw.githubusercontent.com/KoboldAI/KoboldAI-Client/main/colab/silence.m4a\" controls>"
|
||||
],
|
||||
"metadata": {
|
||||
"id": "ZIL7itnNaw5V"
|
||||
|
@ -66,10 +66,10 @@
|
|||
"#@title <b><-- Select your model below and then click this to start KoboldAI</b>\n",
|
||||
"#@markdown You can find a description of the models below along with instructions on how to start KoboldAI.\n",
|
||||
"\n",
|
||||
"Model = \"Nerys 13B V2\" #@param [\"Nerys 13B V2\", \"Erebus 13B\", \"Janeway 13B\", \"Shinen 13B\", \"Skein 20B\", \"Erebus 20B\", \"Skein 6B\", \"Janeway 6B\", \"Adventure 6B\", \"Shinen 6B\", \"Lit V2 6B\", \"Lit 6B\", \"NeoX 20B\", \"OPT 13B\", \"Fairseq Dense 13B\", \"GPT-J-6B\"] {allow-input: true}\n",
|
||||
"Model = \"Nerys 13B V2\" #@param [\"Nerys 13B V2\", \"Janeway 13B\", \"Skein 20B\", \"Skein 6B\", \"Janeway 6B\", \"Adventure 6B\", \"NeoX 20B\", \"OPT 13B\", \"Fairseq Dense 13B\", \"GPT-J-6B\"] {allow-input: true}\n",
|
||||
"Version = \"Official\" #@param [\"Official\", \"United\"] {allow-input: true}\n",
|
||||
"Provider = \"Localtunnel\" #@param [\"Localtunnel\", \"Cloudflare\"]\n",
|
||||
"use_google_drive = True #@param {type:\"boolean\"}\n",
|
||||
"Provider = \"Cloudflare\" #@param [\"Localtunnel\", \"Cloudflare\"]\n",
|
||||
"use_google_drive = True #@param {type:\"boolean\"}\n",
|
||||
"\n",
|
||||
"import os\n",
|
||||
"try:\n",
|
||||
|
@ -81,13 +81,15 @@
|
|||
"print('Now we will need your Google Drive to store settings and saves, you must login with the same account you used for Colab.')\n",
|
||||
"from google.colab import drive\n",
|
||||
"if use_google_drive:\n",
|
||||
" drive.mount('/content/drive/')\n",
|
||||
"else:\n",
|
||||
" import os\n",
|
||||
" if not os.path.exists(\"/content/drive\"):\n",
|
||||
" os.mkdir(\"/content/drive\")\n",
|
||||
" if not os.path.exists(\"/content/drive/MyDrive/\"):\n",
|
||||
" os.mkdir(\"/content/drive/MyDrive/\")\n",
|
||||
" drive.mount('/content/drive/')\n",
|
||||
"else:\n",
|
||||
" import os\n",
|
||||
" if not os.path.exists(\"/content/drive\"):\n",
|
||||
" os.mkdir(\"/content/drive\")\n",
|
||||
" if not os.path.exists(\"/content/drive/MyDrive/\"):\n",
|
||||
" os.mkdir(\"/content/drive/MyDrive/\")\n",
|
||||
"\n",
|
||||
"Revision = \"\"\n",
|
||||
"\n",
|
||||
"if Model == \"Janeway 13B\":\n",
|
||||
" Model = \"KoboldAI/fairseq-dense-13B-Janeway\"\n",
|
||||
|
@ -97,18 +99,6 @@
|
|||
" Model = \"KoboldAI/OPT-13B-Nerys-v2\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Erebus 13B\":\n",
|
||||
" Model = \"KoboldAI/OPT-13B-Erebus\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Shinen 13B\":\n",
|
||||
" Model = \"KoboldAI/fairseq-dense-13B-Shinen\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Erebus 20B\":\n",
|
||||
" Model = \"KoboldAI/GPT-NeoX-20B-Erebus\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Skein 20B\":\n",
|
||||
" Model = \"KoboldAI/GPT-NeoX-20B-Skein\"\n",
|
||||
" path = \"\"\n",
|
||||
|
@ -129,18 +119,6 @@
|
|||
" Model = \"KoboldAI/GPT-J-6B-Adventure\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Lit V2 6B\":\n",
|
||||
" Model = \"hakurei/litv2-6B-rev3\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Lit 6B\":\n",
|
||||
" Model = \"hakurei/lit-6B\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"Shinen 6B\":\n",
|
||||
" Model = \"KoboldAI/GPT-J-6B-Shinen\"\n",
|
||||
" path = \"\"\n",
|
||||
" download = \"\"\n",
|
||||
"elif Model == \"OPT 13B\":\n",
|
||||
" Model = \"facebook/opt-13b\"\n",
|
||||
" path = \"\"\n",
|
||||
|
@ -162,7 +140,7 @@
|
|||
"else:\n",
|
||||
" tunnel = \"\"\n",
|
||||
"\n",
|
||||
"!wget https://koboldai.org/ckds -O - | bash /dev/stdin $path$download -m $Model -g $Version $tunnel"
|
||||
"!wget https://koboldai.org/ckds -O - | bash /dev/stdin $path$download -m $Model -g $Version $tunnel $Revision"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -173,12 +151,9 @@
|
|||
"| Model | Style | Description |\n",
|
||||
"| --- | --- | --- |\n",
|
||||
"| [Nerys](https://huggingface.co/KoboldAI/fairseq-dense-13B-Nerys) by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (A newer Janeway), on top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown in the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model to. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing can be done best from the first or third person. |\n",
|
||||
"| [Erebus](https://huggingface.co/KoboldAI/OPT-13B-Erebus) by Mr Seeker | NSFW | Erebus is our community's flagship NSFW model, being a combination of multiple large datasets that include Literotica, Shinen and erotic novels from Nerys and featuring thourough tagging support it covers the vast majority of erotic writing styles. This model is capable of replacing both the Lit and Shinen models in terms of content and style and has been well received as (one of) the best NSFW models out there. If you wish to use this model for commercial or non research usage we recommend choosing the 20B version as that one is not subject to the restrictive OPT license. |\n",
|
||||
"| [Janeway](https://huggingface.co/KoboldAI/fairseq-dense-13B-Janeway) by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focussed on SFW, romantic scenes might involve a degree of nudity. |\n",
|
||||
"| [Shinen](https://huggingface.co/KoboldAI/fairseq-dense-13B-Shinen) by Mr Seeker | NSFW | Shinen is an NSFW model trained on a variety of stories from the website Sexstories it contains many different kinks. It has been merged into the larger (and better) Erebus model. |\n",
|
||||
"| [Skein](https://huggingface.co/KoboldAI/GPT-J-6B-Skein) by VE\\_FORBRYDERNE | Adventure | Skein is best used with Adventure mode enabled, it consists of a 4 times larger adventure dataset than the Adventure model making it excellent for text adventure gaming. On top of that it also consists of light novel training further expanding its knowledge and writing capabilities. It can be used with the You filter bias if you wish to write Novels with it, but dedicated Novel models can perform better for this task. |\n",
|
||||
"| [Adventure](https://huggingface.co/KoboldAI/GPT-J-6B-Adventure) by VE\\_FORBRYDERNE | Adventure | Adventure is a 6B model designed to mimick the behavior of AI Dungeon. It is exclusively for Adventure Mode and can take you on the epic and wackey adventures that AI Dungeon players love. It also features the many tropes of AI Dungeon as it has been trained on very similar data. It must be used in second person (You). |\n",
|
||||
"| [Lit](https://huggingface.co/hakurei/lit-6B) ([V2](https://huggingface.co/hakurei/litv2-6B-rev3)) by Haru | NSFW | Lit is a great NSFW model trained by Haru on both a large set of Literotica stories and high quality novels along with tagging support. Creating a high quality model for your NSFW stories. This model is exclusively a novel model and is best used in third person. |\n",
|
||||
"| [OPT](https://huggingface.co/facebook/opt-13b) by Metaseq | Generic | OPT is considered one of the best base models as far as content goes, its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo duplicate and unnecessary content has been left out, while additional literature was added in similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downfall of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |\n",
|
||||
"| [Neo(X)](https://huggingface.co/EleutherAI/gpt-neox-20b) by EleutherAI | Generic | NeoX is the largest EleutherAI model currently available, being a generic model it is not particularly trained towards anything and can do a variety of writing, Q&A and coding tasks. 20B's performance is closely compared to the 13B models and it is worth trying both especially if you have a task that does not involve english writing. Its behavior will be similar to the GPT-J-6B model since they are trained on the same dataset but with more sensitivity towards repetition penalty and with more knowledge. |\n",
|
||||
"| [Fairseq Dense](https://huggingface.co/KoboldAI/fairseq-dense-13B) | Generic | Trained by Facebook Researchers this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger 20B model from EleutherAI and considered as better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |\n",
|
||||
|
@ -189,13 +164,9 @@
|
|||
"| Model | Style | Description |\n",
|
||||
"| --- | --- | --- |\n",
|
||||
"| [Nerys](https://huggingface.co/KoboldAI/fairseq-dense-2.7B-Nerys) by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (A newer Janeway), on top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown in the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model to. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing can be done best from the first or third person. |\n",
|
||||
"| [Erebus](https://huggingface.co/KoboldAI/OPT-2.7B-Erebus) by Mr Seeker | NSFW | Erebus is our community's flagship NSFW model, being a combination of multiple large datasets that include Literotica, Shinen and erotic novels from Nerys and featuring thourough tagging support it covers the vast majority of erotic writing styles. This model is capable of replacing both the Lit and Shinen models in terms of content and style and has been well received as (one of) the best NSFW models out there. If you wish to use this model for commercial or non research usage we recommend choosing the 20B version as that one is not subject to the restrictive OPT license. |\n",
|
||||
"| [Janeway](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Janeway) by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focussed on SFW, romantic scenes might involve a degree of nudity. |\n",
|
||||
"| [Picard](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Picard) by Mr Seeker | Novel | Picard is a model trained for SFW Novels based on Neo 2.7B. It is focused on Novel style writing without the NSFW bias. While the name suggests a sci-fi model this model is designed for Novels of a variety of genre's. It is meant to be used in KoboldAI's regular mode. |\n",
|
||||
"| [AID](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-AID) by melastacho | Adventure | Also know as Adventure 2.7B this is a clone of the AI Dungeon Classic model and is best known for the epic wackey adventures that AI Dungeon Classic players love. |\n",
|
||||
"| [Horni LN](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Horni-LN) by finetune | Novel | This model is based on Horni 2.7B and retains its NSFW knowledge, but was then further biased towards SFW novel stories. If you seek a balance between a SFW Novel model and a NSFW model this model should be a good choice. |\n",
|
||||
"| [Horni](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Horni) by finetune | NSFW | This model is tuned on Literotica to produce a Novel style model biased towards NSFW content. Can still be used for SFW stories but will have a bias towards NSFW content. It is meant to be used in KoboldAI's regular mode. |\n",
|
||||
"| [Shinen](https://huggingface.co/KoboldAI/GPT-Neo-2.7B-Shinen) by Mr Seeker | NSFW | Shinen is an alternative to the Horni model designed to be more explicit. If Horni is to tame for you Shinen might produce better results. While it is a Novel model it is unsuitable for SFW stories due to its heavy NSFW bias. Shinen will not hold back. It is meant to be used in KoboldAI's regular mode. |\n",
|
||||
"| [OPT](https://huggingface.co/facebook/opt-2.7b) by Metaseq | Generic | OPT is considered one of the best base models as far as content goes, its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo duplicate and unnecessary content has been left out, while additional literature was added in similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downfall of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |\n",
|
||||
"| [Fairseq Dense](https://huggingface.co/KoboldAI/fairseq-dense-2.7B) | Generic | Trained by Facebook Researchers this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger models from EleutherAI and considered as better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |\n",
|
||||
"| [Neo](https://huggingface.co/EleutherAI/gpt-neo-2.7B) by EleutherAI | Generic | This is the base model for all the other 2.7B models, it is best used when you have a use case that we have no other models available for, such as writing blog articles or programming. It can also be a good basis for the experience of some of the softprompts if your softprompt is not about a subject the other models cover. |\n",
|
||||
|
@ -204,7 +175,6 @@
|
|||
"| Style | Description |\n",
|
||||
"| --- | --- |\n",
|
||||
"| Novel | For regular story writing, not compatible with Adventure mode or other specialty modes. |\n",
|
||||
"| NSFW | Indicates that the model is strongly biased towards NSFW content and is not suitable for children, work environments or livestreaming. Most NSFW models are also Novel models in nature. |\n",
|
||||
"| Adventure | These models are excellent for people willing to play KoboldAI like a Text Adventure game and are meant to be used with Adventure mode enabled. Even if you wish to use it as a Novel style model you should always have Adventure mode on and set it to story. These models typically have a strong bias towards the use of the word You and without Adventure mode enabled break the story flow and write actions on your behalf. |\n",
|
||||
"| Generic | Generic models are not trained towards anything specific, typically used as a basis for other tasks and models. They can do everything the other models can do, but require much more handholding to work properly. Generic models are an ideal basis for tasks that we have no specific model for, or for experiencing a softprompt in its raw form. |\n",
|
||||
"\n",
|
||||
|
@ -240,7 +210,6 @@
|
|||
"name": "ColabKobold TPU",
|
||||
"provenance": [],
|
||||
"private_outputs": true,
|
||||
"collapsed_sections": [],
|
||||
"include_colab_link": true
|
||||
},
|
||||
"kernelspec": {
|
||||
|
|
Binary file not shown.
|
@ -6,4 +6,4 @@ WORKDIR /content/
|
|||
COPY env.yml /home/micromamba/env.yml
|
||||
RUN micromamba install -y -n base -f /home/micromamba/env.yml
|
||||
USER root
|
||||
RUN apt update && apt install xorg -y
|
||||
RUN apt update && apt install xorg aria2 -y
|
||||
|
|
|
@ -5,6 +5,8 @@ services:
|
|||
environment:
|
||||
- DISPLAY=${DISPLAY}
|
||||
network_mode: "host"
|
||||
security_opt:
|
||||
- label:disable
|
||||
volumes:
|
||||
- /tmp/.X11-unix:/tmp/.X11-unix
|
||||
- /etc/protocols:/etc/protocols:ro
|
||||
|
|
|
@ -3,4 +3,4 @@ WORKDIR /content/
|
|||
COPY env.yml /home/micromamba/env.yml
|
||||
RUN micromamba install -y -n base -f /home/micromamba/env.yml
|
||||
USER root
|
||||
RUN apt update && apt install xorg libsqlite3-0 -y
|
||||
RUN apt update && apt install xorg libsqlite3-0 aria2 -y
|
||||
|
|
|
@ -5,6 +5,8 @@ services:
|
|||
environment:
|
||||
- DISPLAY=${DISPLAY}
|
||||
network_mode: "host"
|
||||
security_opt:
|
||||
- label:disable
|
||||
volumes:
|
||||
- /tmp/.X11-unix:/tmp/.X11-unix
|
||||
- /etc/protocols:/etc/protocols:ro
|
||||
|
|
|
@ -9,7 +9,7 @@ if [[ ! -v KOBOLDAI_DATADIR ]];then
|
|||
fi
|
||||
|
||||
mkdir $KOBOLDAI_DATADIR/stories
|
||||
if [[ ! -v KOBOLDAI_MODELDIR ]];then
|
||||
if [[ -v KOBOLDAI_MODELDIR ]];then
|
||||
mkdir $KOBOLDAI_MODELDIR/models
|
||||
fi
|
||||
mkdir $KOBOLDAI_DATADIR/settings
|
||||
|
@ -28,7 +28,7 @@ rm -rf userscripts/
|
|||
rm softprompts
|
||||
rm -rf softprompts/
|
||||
|
||||
if [[ ! -v KOBOLDAI_MODELDIR ]];then
|
||||
if [[ -v KOBOLDAI_MODELDIR ]];then
|
||||
rm models
|
||||
rm -rf models/
|
||||
#rm cache
|
||||
|
@ -39,7 +39,7 @@ ln -s $KOBOLDAI_DATADIR/stories/ stories
|
|||
ln -s $KOBOLDAI_DATADIR/settings/ settings
|
||||
ln -s $KOBOLDAI_DATADIR/softprompts/ softprompts
|
||||
ln -s $KOBOLDAI_DATADIR/userscripts/ userscripts
|
||||
if [[ ! -v KOBOLDAI_MODELDIR ]];then
|
||||
if [[ -v KOBOLDAI_MODELDIR ]];then
|
||||
ln -s $KOBOLDAI_MODELDIR/models/ models
|
||||
#ln -s $KOBOLDAI_MODELDIR/cache/ cache
|
||||
fi
|
||||
|
|
|
@ -5,12 +5,15 @@ channels:
|
|||
- defaults
|
||||
dependencies:
|
||||
- colorama
|
||||
- flask-socketio
|
||||
- flask-session
|
||||
- flask=2.2.3
|
||||
- flask-socketio=5.3.2
|
||||
- flask-session=0.4.0
|
||||
- python-socketio=5.7.2
|
||||
- pytorch=1.11.*
|
||||
- python=3.8.*
|
||||
- cudatoolkit=11.1
|
||||
- eventlet
|
||||
- eventlet=0.33.3
|
||||
- dnspython=2.2.1
|
||||
- markdown
|
||||
- bleach=4.1.0
|
||||
- pip
|
||||
|
@ -23,10 +26,12 @@ dependencies:
|
|||
- termcolor
|
||||
- psutil
|
||||
- pip:
|
||||
- flask-cloudflared
|
||||
- flask-cloudflared==0.0.10
|
||||
- flask-ngrok
|
||||
- Werkzeug==2.3.7
|
||||
- lupa==1.10
|
||||
- transformers>=4.20.1
|
||||
- huggingface_hub>=0.10.1
|
||||
- transformers==4.24.0
|
||||
- huggingface_hub==0.12.1
|
||||
- safetensors
|
||||
- accelerate
|
||||
- git+https://github.com/VE-FORBRYDERNE/mkultra
|
||||
|
|
|
@ -4,10 +4,13 @@ channels:
|
|||
- defaults
|
||||
dependencies:
|
||||
- colorama
|
||||
- flask-socketio
|
||||
- flask-session
|
||||
- flask=2.2.3
|
||||
- flask-socketio=5.3.2
|
||||
- flask-session=0.4.0
|
||||
- python-socketio=5.7.2
|
||||
- python=3.8.*
|
||||
- eventlet
|
||||
- eventlet=0.33.3
|
||||
- dnspython=2.2.1
|
||||
- markdown
|
||||
- bleach=4.1.0
|
||||
- pip
|
||||
|
@ -21,12 +24,13 @@ dependencies:
|
|||
- psutil
|
||||
- pip:
|
||||
- --extra-index-url https://download.pytorch.org/whl/rocm5.1.1
|
||||
- torch
|
||||
- torchvision
|
||||
- flask-cloudflared
|
||||
- torch==1.12.1+rocm5.1.1
|
||||
- flask-cloudflared==0.0.10
|
||||
- flask-ngrok
|
||||
- Werkzeug==2.3.7
|
||||
- lupa==1.10
|
||||
- transformers>=4.20.1
|
||||
- huggingface_hub>=0.10.1
|
||||
- transformers==4.24.0
|
||||
- huggingface_hub==0.12.1
|
||||
- safetensors
|
||||
- accelerate
|
||||
- git+https://github.com/VE-FORBRYDERNE/mkultra
|
||||
|
|
|
@ -1,12 +1,12 @@
|
|||
#!/bin/bash
|
||||
if [[ $1 = "cuda" ]]; then
|
||||
if [[ $1 = "cuda" || $1 = "CUDA" ]]; then
|
||||
wget -qO- https://micromamba.snakepit.net/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
|
||||
bin/micromamba create -f environments/huggingface.yml -r runtime -n koboldai -y
|
||||
# Weird micromamba bug causes it to fail the first time, running it twice just to be safe, the second time is much faster
|
||||
bin/micromamba create -f environments/huggingface.yml -r runtime -n koboldai -y
|
||||
exit
|
||||
fi
|
||||
if [[ $1 = "rocm" ]]; then
|
||||
if [[ $1 = "rocm" || $1 = "ROCM" ]]; then
|
||||
wget -qO- https://micromamba.snakepit.net/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
|
||||
bin/micromamba create -f environments/rocm.yml -r runtime -n koboldai-rocm -y
|
||||
# Weird micromamba bug causes it to fail the first time, running it twice just to be safe, the second time is much faster
|
||||
|
|
|
@ -1,21 +1,25 @@
|
|||
transformers>=4.20.1
|
||||
huggingface_hub>=0.10.1
|
||||
Flask
|
||||
Flask-SocketIO
|
||||
transformers==4.24.0
|
||||
huggingface_hub==0.12.1
|
||||
Flask==2.2.3
|
||||
Flask-SocketIO==5.3.2
|
||||
Werkzeug==2.3.7
|
||||
python-socketio==5.7.2
|
||||
requests
|
||||
torch >= 1.9, < 1.13
|
||||
flask-cloudflared
|
||||
flask-cloudflared==0.0.10
|
||||
flask-ngrok
|
||||
eventlet
|
||||
eventlet==0.33.3
|
||||
dnspython==2.2.1
|
||||
lupa==1.10
|
||||
markdown
|
||||
bleach==4.1.0
|
||||
sentencepiece
|
||||
protobuf
|
||||
accelerate
|
||||
flask-session
|
||||
flask-session==0.4.0
|
||||
marshmallow>=3.13
|
||||
apispec-webframeworks
|
||||
loguru
|
||||
termcolor
|
||||
safetensors
|
||||
git+https://github.com/VE-FORBRYDERNE/mkultra
|
|
@ -2,22 +2,26 @@ torch >= 1.9, < 1.13
|
|||
numpy
|
||||
tqdm
|
||||
requests
|
||||
dm-haiku == 0.0.5
|
||||
jax == 0.2.21
|
||||
jaxlib >= 0.1.69, <= 0.3.7
|
||||
transformers >= 4.20.1
|
||||
huggingface_hub >= 0.10.1
|
||||
dm-haiku==0.0.9
|
||||
jax==0.3.25
|
||||
jaxlib==0.3.25
|
||||
chex == 0.1.5
|
||||
transformers == 4.24.0
|
||||
huggingface_hub==0.12.1
|
||||
progressbar2
|
||||
git+https://github.com/VE-FORBRYDERNE/mesh-transformer-jax@ck
|
||||
flask
|
||||
Flask-SocketIO
|
||||
flask-cloudflared >= 0.0.5
|
||||
Flask==2.2.3
|
||||
Flask-SocketIO==5.3.2
|
||||
python-socketio==5.7.2
|
||||
flask-cloudflared==0.0.10
|
||||
flask-ngrok
|
||||
eventlet
|
||||
Werkzeug==2.3.7
|
||||
eventlet==0.33.3
|
||||
dnspython==2.2.1
|
||||
lupa==1.10
|
||||
markdown
|
||||
bleach==4.1.0
|
||||
flask-session
|
||||
flask-session==0.4.0
|
||||
marshmallow>=3.13
|
||||
apispec-webframeworks
|
||||
loguru
|
||||
|
|
|
@ -3492,28 +3492,26 @@ $(document).ready(function(){
|
|||
|
||||
// Shortcuts
|
||||
$(window).keydown(function (ev) {
|
||||
// Only ctrl prefixed (for now)
|
||||
if (!ev.ctrlKey) return;
|
||||
|
||||
let handled = true;
|
||||
switch (ev.key) {
|
||||
// Ctrl+Z - Back
|
||||
case "z":
|
||||
button_actback.click();
|
||||
break;
|
||||
// Ctrl+Y - Forward
|
||||
case "y":
|
||||
button_actfwd.click();
|
||||
break;
|
||||
// Ctrl+E - Retry
|
||||
case "e":
|
||||
button_actretry.click();
|
||||
break;
|
||||
default:
|
||||
handled = false;
|
||||
if (ev.altKey)
|
||||
switch (ev.key) {
|
||||
// Alt+Z - Back
|
||||
case "z":
|
||||
button_actback.click();
|
||||
break;
|
||||
// Alt+Y - Forward
|
||||
case "y":
|
||||
button_actfwd.click();
|
||||
break;
|
||||
// Alt+R - Retry
|
||||
case "r":
|
||||
button_actretry.click();
|
||||
break;
|
||||
default:
|
||||
return;
|
||||
} else {
|
||||
return;
|
||||
}
|
||||
|
||||
if (handled) ev.preventDefault();
|
||||
ev.preventDefault();
|
||||
});
|
||||
|
||||
$("#anotetemplate").on("input", function() {
|
||||
|
@ -3796,4 +3794,4 @@ function getSelectedOptions(element) {
|
|||
output.push(item.value);
|
||||
}
|
||||
return output;
|
||||
}
|
||||
}
|
||||
|
|
|
@ -54,6 +54,7 @@ import numpy as np
|
|||
import collections
|
||||
import _codecs
|
||||
import utils
|
||||
import os
|
||||
from torch.nn import Module
|
||||
from typing import Any, Callable, Dict, Optional, Tuple, Type, Union
|
||||
|
||||
|
@ -93,12 +94,16 @@ class LazyTensor:
|
|||
def __repr__(self):
|
||||
return self.__view(repr)
|
||||
|
||||
def materialize(self, checkpoint: Union[zipfile.ZipFile, zipfile.ZipExtFile], map_location=None, no_grad=True) -> torch.Tensor:
|
||||
def materialize(self, checkpoint: Union[zipfile.ZipFile, zipfile.ZipExtFile], map_location=None, no_grad=True, filename="pytorch_model.bin") -> torch.Tensor:
|
||||
filename = os.path.basename(os.path.normpath(filename)).split('.')[0]
|
||||
size = reduce(lambda x, y: x * y, self.shape, 1)
|
||||
dtype = self.dtype
|
||||
nbytes = size if dtype is torch.bool else size * ((torch.finfo if dtype.is_floating_point else torch.iinfo)(dtype).bits >> 3)
|
||||
if isinstance(checkpoint, zipfile.ZipFile):
|
||||
f = checkpoint.open(f"archive/data/{self.key}", "r")
|
||||
try:
|
||||
f = checkpoint.open(f"archive/data/{self.key}", "r")
|
||||
except:
|
||||
f = checkpoint.open(f"{filename}/data/{self.key}", "r")
|
||||
f.read(self.seek_offset)
|
||||
else:
|
||||
f = checkpoint
|
||||
|
|
|
@ -1049,7 +1049,7 @@ def read_neox_checkpoint(state, path, config, checkpoint_shards=2):
|
|||
raise RuntimeError(error)
|
||||
|
||||
|
||||
def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpoint=False, **kwargs) -> None:
|
||||
def load_model(path: str, driver_version="tpu_driver_20221109", hf_checkpoint=False, socketio_queue=None, initial_load=False, logger=None, **kwargs) -> None:
|
||||
global thread_resources_env, seq, tokenizer, network, params, pad_token_id
|
||||
|
||||
if "pad_token_id" in kwargs:
|
||||
|
@ -1149,7 +1149,8 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
params[param] = default_params[param]
|
||||
|
||||
# Use an optimization that will allow us to avoid one extra transpose operation
|
||||
params["transposed_linear"] = True
|
||||
if hf_checkpoint:
|
||||
params["transposed_linear"] = True
|
||||
|
||||
# Load tokenizer
|
||||
if vars.model == "TPUMeshTransformerGPTNeoX":
|
||||
|
@ -1194,10 +1195,6 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
thread_resources_env = maps.ResourceEnv(maps.Mesh(devices, ('dp', 'mp')), ())
|
||||
maps.thread_resources.env = thread_resources_env
|
||||
|
||||
global shard_xmap, batch_xmap
|
||||
shard_xmap = __shard_xmap()
|
||||
batch_xmap = __batch_xmap(shard_dim=cores_per_replica)
|
||||
|
||||
global badwords
|
||||
# These are the tokens that we don't want the AI to ever write
|
||||
badwords = jnp.array(vars.badwordsids).squeeze()
|
||||
|
@ -1243,6 +1240,7 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
from tqdm.auto import tqdm
|
||||
import functools
|
||||
|
||||
|
||||
def callback(model_dict, f, **_):
|
||||
if callback.nested:
|
||||
return
|
||||
|
@ -1250,6 +1248,7 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
with zipfile.ZipFile(f, "r") as z:
|
||||
try:
|
||||
last_storage_key = None
|
||||
zipfolder = os.path.basename(os.path.normpath(f)).split('.')[0]
|
||||
f = None
|
||||
current_offset = 0
|
||||
if utils.current_shard == 0:
|
||||
|
@ -1282,7 +1281,10 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
last_storage_key = storage_key
|
||||
if isinstance(f, zipfile.ZipExtFile):
|
||||
f.close()
|
||||
f = z.open(f"archive/data/{storage_key}")
|
||||
try:
|
||||
f = z.open(f"archive/data/{storage_key}")
|
||||
except:
|
||||
f = z.open(f"{zipfolder}/data/{storage_key}")
|
||||
current_offset = 0
|
||||
if current_offset != model_dict[key].seek_offset:
|
||||
f.read(model_dict[key].seek_offset - current_offset)
|
||||
|
@ -1312,19 +1314,20 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
#if "no_transpose" not in transforms and tensor.ndim == 2:
|
||||
# tensor = tensor.T
|
||||
tensor.unsqueeze_(0)
|
||||
if tensor.dtype is torch.float16 or tensor.dtype is torch.float32:
|
||||
tensor = tensor.bfloat16()
|
||||
|
||||
|
||||
# Shard the tensor so that parts of the tensor can be used
|
||||
# on different TPU cores
|
||||
tensor = reshard_reverse(
|
||||
tensor,
|
||||
params["cores_per_replica"],
|
||||
network.state["params"][spec["module"]][spec["param"]].shape,
|
||||
)
|
||||
tensor = jnp.array(tensor.detach())
|
||||
if tensor.dtype is torch.float16 or tensor.dtype is torch.float32:
|
||||
tensor = tensor.bfloat16()
|
||||
network.state["params"][spec["module"]][spec["param"]] = move_xmap(
|
||||
jax.dlpack.from_dlpack(torch.utils.dlpack.to_dlpack(
|
||||
reshard_reverse(
|
||||
tensor,
|
||||
params["cores_per_replica"],
|
||||
network.state["params"][spec["module"]][spec["param"]].shape,
|
||||
)
|
||||
)).copy(),
|
||||
tensor,
|
||||
np.empty(params["cores_per_replica"]),
|
||||
)
|
||||
|
||||
|
@ -1411,3 +1414,6 @@ def load_model(path: str, driver_version="tpu_driver0.1_dev20210607", hf_checkpo
|
|||
model = GPTNeoForCausalLM.from_pretrained(vars.model, revision=vars.revision, cache_dir="cache")
|
||||
|
||||
#network.state = network.move_xmap(network.state, np.zeros(cores_per_replica))
|
||||
global shard_xmap, batch_xmap
|
||||
shard_xmap = __shard_xmap()
|
||||
batch_xmap = __batch_xmap(shard_dim=cores_per_replica)
|
||||
|
|
16
utils.py
16
utils.py
|
@ -261,7 +261,7 @@ def _transformers22_aria2_hook(pretrained_model_name_or_path: str, force_downloa
|
|||
if token is None:
|
||||
raise EnvironmentError("You specified use_auth_token=True, but a huggingface token was not found.")
|
||||
_cache_dir = str(cache_dir) if cache_dir is not None else transformers.TRANSFORMERS_CACHE
|
||||
_revision = revision if revision is not None else huggingface_hub.constants.DEFAULT_REVISION
|
||||
_revision = args.revision if args.revision is not None else huggingface_hub.constants.DEFAULT_REVISION
|
||||
sharded = False
|
||||
headers = {"user-agent": transformers.file_utils.http_user_agent(user_agent)}
|
||||
if use_auth_token:
|
||||
|
@ -272,7 +272,7 @@ def _transformers22_aria2_hook(pretrained_model_name_or_path: str, force_downloa
|
|||
|
||||
def is_cached(filename):
|
||||
try:
|
||||
huggingface_hub.hf_hub_download(pretrained_model_name_or_path, filename, cache_dir=cache_dir, local_files_only=True)
|
||||
huggingface_hub.hf_hub_download(pretrained_model_name_or_path, filename, cache_dir=cache_dir, local_files_only=True, revision=_revision)
|
||||
except ValueError:
|
||||
return False
|
||||
return True
|
||||
|
@ -281,7 +281,7 @@ def _transformers22_aria2_hook(pretrained_model_name_or_path: str, force_downloa
|
|||
filename = transformers.modeling_utils.WEIGHTS_INDEX_NAME if sharded else transformers.modeling_utils.WEIGHTS_NAME
|
||||
except AttributeError:
|
||||
return
|
||||
url = huggingface_hub.hf_hub_url(pretrained_model_name_or_path, filename, revision=revision)
|
||||
url = huggingface_hub.hf_hub_url(pretrained_model_name_or_path, filename, revision=_revision)
|
||||
if is_cached(filename) or requests.head(url, allow_redirects=True, proxies=proxies, headers=headers):
|
||||
break
|
||||
if sharded:
|
||||
|
@ -295,7 +295,7 @@ def _transformers22_aria2_hook(pretrained_model_name_or_path: str, force_downloa
|
|||
with open(map_filename) as f:
|
||||
map_data = json.load(f)
|
||||
filenames = set(map_data["weight_map"].values())
|
||||
urls = [huggingface_hub.hf_hub_url(pretrained_model_name_or_path, n, revision=revision) for n in filenames]
|
||||
urls = [huggingface_hub.hf_hub_url(pretrained_model_name_or_path, n, revision=_revision) for n in filenames]
|
||||
if not force_download:
|
||||
urls = [u for u, n in zip(urls, filenames) if not is_cached(n)]
|
||||
if not urls:
|
||||
|
@ -460,6 +460,7 @@ def aria2_hook(pretrained_model_name_or_path: str, force_download=False, cache_d
|
|||
import transformers
|
||||
import transformers.modeling_utils
|
||||
from huggingface_hub import HfFolder
|
||||
_revision = args.revision if args.revision is not None else huggingface_hub.constants.DEFAULT_REVISION
|
||||
if shutil.which("aria2c") is None: # Don't do anything if aria2 is not installed
|
||||
return
|
||||
if local_files_only: # If local_files_only is true, we obviously don't need to download anything
|
||||
|
@ -494,7 +495,7 @@ def aria2_hook(pretrained_model_name_or_path: str, force_download=False, cache_d
|
|||
filename = transformers.modeling_utils.WEIGHTS_INDEX_NAME if sharded else transformers.modeling_utils.WEIGHTS_NAME
|
||||
except AttributeError:
|
||||
return
|
||||
url = huggingface_hub.hf_hub_url(pretrained_model_name_or_path, filename, revision=revision)
|
||||
url = huggingface_hub.hf_hub_url(pretrained_model_name_or_path, filename, revision=_revision)
|
||||
if is_cached(url) or requests.head(url, allow_redirects=True, proxies=proxies, headers=headers):
|
||||
break
|
||||
if sharded:
|
||||
|
@ -508,7 +509,7 @@ def aria2_hook(pretrained_model_name_or_path: str, force_download=False, cache_d
|
|||
with open(map_filename) as f:
|
||||
map_data = json.load(f)
|
||||
filenames = set(map_data["weight_map"].values())
|
||||
urls = [huggingface_hub.hf_hub_url(pretrained_model_name_or_path, n, revision=revision) for n in filenames]
|
||||
urls = [huggingface_hub.hf_hub_url(pretrained_model_name_or_path, n, revision=_revision) for n in filenames]
|
||||
if not force_download:
|
||||
urls = [u for u in urls if not is_cached(u)]
|
||||
if not urls:
|
||||
|
@ -555,7 +556,8 @@ def get_num_shards(filename):
|
|||
def get_sharded_checkpoint_num_tensors(pretrained_model_name_or_path, filename, cache_dir=None, force_download=False, proxies=None, resume_download=False, local_files_only=False, use_auth_token=None, user_agent=None, revision=None, **kwargs):
|
||||
import transformers.modeling_utils
|
||||
import torch
|
||||
shard_paths, _ = transformers.modeling_utils.get_checkpoint_shard_files(pretrained_model_name_or_path, filename, cache_dir=cache_dir, force_download=force_download, proxies=proxies, resume_download=resume_download, local_files_only=local_files_only, use_auth_token=use_auth_token, user_agent=user_agent, revision=revision)
|
||||
_revision = args.revision if args.revision is not None else huggingface_hub.constants.DEFAULT_REVISION
|
||||
shard_paths, _ = transformers.modeling_utils.get_checkpoint_shard_files(pretrained_model_name_or_path, filename, cache_dir=cache_dir, force_download=force_download, proxies=proxies, resume_download=resume_download, local_files_only=local_files_only, use_auth_token=use_auth_token, user_agent=user_agent, revision=_revision)
|
||||
return list(itertools.chain(*(torch.load(p, map_location="cpu").keys() for p in shard_paths)))
|
||||
|
||||
#==================================================================#
|
||||
|
|
Loading…
Reference in New Issue