Merge pull request #65 from henk717/united

The Big United Update! - 0.16 made by the KoboldAI community
Huge thanks to everyone who contributed to this update! =)
KoboldAI Dev 2021-09-21 14:09:35 -04:00 committed by GitHub
commit a454e7547f
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
23 changed files with 2710 additions and 473 deletions

14
.gitignore vendored

@ -3,4 +3,16 @@ client.settings
# Ignore stories file except for test_story
stories/*
!stories/sample_story.json
settings/*
!stories/sample_story.json
/.project
*.bak
miniconda3/*
*.settings
__pycache__
# Ignore PyCharm project files.
.idea
# Ignore compiled Python files.
*.pyc


@ -1,3 +0,0 @@
If you use Google Colab to run your models, and you made a local copy of the Colab notebook in Google Drive instead of using the community notebook, you MUST make a new copy of the community notebook to use the new multiple-sequence generation feature. The link is below:
https://colab.research.google.com/drive/1uGe9f4ruIQog3RLxfUsoThakvLpHjIkX?usp=sharing

File diff suppressed because it is too large

489
breakmodel.py Normal file

@ -0,0 +1,489 @@
'''
This is a MODIFIED version of arrmansa's low VRAM patch.
https://github.com/arrmansa/Basic-UI-for-GPT-J-6B-with-low-vram/blob/main/GPT-J-6B-Low-Vram-UI.ipynb
Copyright 2021 arrmansa
Copyright 2021 finetuneanon
Copyright 2018 The Hugging Face team
Released under the Apache License 2.0
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''
import torch
import torch.cuda.comm
import copy
import gc

from transformers.modeling_outputs import BaseModelOutputWithPast
from transformers.utils import logging

logger = logging.get_logger(__name__)


class MaxSharedRamBlocksException(Exception):
    def __init__(self, i: int):
        self.corrected_max_shared_ram_blocks = i
        super().__init__('max_shared_ram_blocks is set too high, please set it to '+str(i))


breakmodel = True
gpu_device = 'cuda'
total_blocks = 24
ram_blocks = 7
max_shared_ram_blocks = None


def new_forward(
    self,
    input_ids=None,
    past_key_values=None,
    attention_mask=None,
    token_type_ids=None,
    position_ids=None,
    head_mask=None,
    inputs_embeds=None,
    use_cache=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
    embs=None,
):
    global max_shared_ram_blocks

    if breakmodel:
        if max_shared_ram_blocks is None:
            max_shared_ram_blocks = total_blocks

        if not hasattr(self, 'extrastorage'):
            setattr(self,"extrastorage",{})
            torch.cuda.empty_cache()

            for i in range(ram_blocks,len(self.h)):
                self.h[i].to(gpu_device)

            for i in range(ram_blocks):
                self.h[i].to("cpu")
                self.extrastorage[i] = copy.deepcopy(self.h[i])
                smalltensor = torch.tensor(0).to(gpu_device)
                for param1 in self.h[i].parameters():
                    param1.data = smalltensor
                self.h[i].to(gpu_device)

            for i in range(len(self.h)):
                for param in self.h[i].parameters():
                    param.requires_grad = False
                    param.data = param.data.detach()
                    gc.collect()
                    torch.cuda.empty_cache()

            for i in range(ram_blocks):
                for param in self.extrastorage[i].parameters():
                    param.requires_grad = False
                    if i < max_shared_ram_blocks:
                        try:
                            param.data = param.data.detach().pin_memory()
                        except:
                            raise MaxSharedRamBlocksException(i)
                    else:
                        param.data = param.data.detach()
                    gc.collect()
                    torch.cuda.empty_cache()

            if ram_blocks:
                for param1,param2 in zip(self.h[0].parameters(),self.extrastorage[0].parameters()):
                    param1.data = param2.data.to(gpu_device, non_blocking=False).detach()

                for param1,param2 in zip(self.h[ram_blocks-1].parameters(),self.extrastorage[ram_blocks-1].parameters()):
                    param1.data = param2.data.to(gpu_device, non_blocking=False).detach()
    #END MODEL BREAK EDITS

    output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
    output_hidden_states = (
        output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
    )
    use_cache = use_cache if use_cache is not None else self.config.use_cache
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    if input_ids is not None and inputs_embeds is not None:
        raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    elif input_ids is not None:
        input_shape = input_ids.size()
        input_ids = input_ids.view(-1, input_shape[-1])
        batch_size = input_ids.shape[0]
    elif inputs_embeds is not None:
        input_shape = inputs_embeds.size()[:-1]
        batch_size = inputs_embeds.shape[0]
    else:
        raise ValueError("You have to specify either input_ids or inputs_embeds")

    device = input_ids.device if input_ids is not None else inputs_embeds.device

    if token_type_ids is not None:
        token_type_ids = token_type_ids.view(-1, input_shape[-1])
    if position_ids is not None:
        position_ids = position_ids.view(-1, input_shape[-1])

    if past_key_values is None:
        past_length = 0
        past_key_values = tuple([None] * len(self.h))
    else:
        past_length = past_key_values[0][0].size(-2)

    device = input_ids.device if input_ids is not None else inputs_embeds.device
    if position_ids is None:
        position_ids = torch.arange(past_length, input_shape[-1] + past_length, dtype=torch.long, device=device)
        position_ids = position_ids.unsqueeze(0).view(-1, input_shape[-1])

    # Attention mask.
    if attention_mask is not None:
        assert batch_size > 0, "batch_size has to be defined and > 0"
        global_attention_mask = attention_mask.view(batch_size, -1)
        # We create a 3D attention mask from a 2D tensor mask.
        # Sizes are [batch_size, 1, 1, to_seq_length]
        # So we can broadcast to [batch_size, num_heads, from_seq_length, to_seq_length]
        # this attention mask is more simple than the triangular masking of causal attention
        # used in OpenAI GPT, we just need to prepare the broadcast dimension here.
        global_attention_mask = global_attention_mask[:, None, None, :]

        # Since global_attention_mask is 1.0 for positions we want to attend and 0.0 for
        # masked positions, this operation will create a tensor which is 0.0 for
        # positions we want to attend and -10000.0 for masked positions.
        # Since we are adding it to the raw scores before the softmax, this is
        # effectively the same as removing these entirely.
        global_attention_mask = global_attention_mask.to(dtype=self.dtype)  # fp16 compatibility
        global_attention_mask = (1.0 - global_attention_mask) * -10000.0
    else:
        global_attention_mask = None

    # Local causal attention mask
    batch_size, seq_length = input_shape
    full_seq_length = seq_length + past_length

    # Prepare head mask if needed
    # 1.0 in head_mask indicate we keep the head
    # attention_probs has shape bsz x num_heads x N x N
    # head_mask has shape n_layer x batch x num_heads x N x N
    head_mask = self.get_head_mask(head_mask, self.config.num_layers)

    if inputs_embeds is None:
        inputs_embeds = self.wte(input_ids)

    if embs is not None and not (use_cache is not None and use_cache and past_key_values is not None and len(past_key_values) > 0 and past_key_values[0] is not None):
        offset = 0
        for pos, emb in embs:
            pos += offset
            if len(emb.shape) == 2:
                emb = emb.repeat(input_shape[0], 1, 1)
            inputs_embeds[:, pos:pos+emb.shape[1]] = emb
            offset += emb.shape[1]

    if hasattr(self, 'rotary') and self.rotary:
        hidden_states = inputs_embeds
    else:
        position_embeds = self.wpe(position_ids)
        hidden_states = inputs_embeds + position_embeds

    if token_type_ids is not None:
        token_type_embeds = self.wte(token_type_ids)
        hidden_states = hidden_states + token_type_embeds

    hidden_states = self.drop(hidden_states)

    output_shape = input_shape + (hidden_states.size(-1),)

    presents = () if use_cache else None
    all_self_attentions = () if output_attentions else None
    all_hidden_states = () if output_hidden_states else None

    if breakmodel:
        copystream = torch.cuda.Stream(device=0,priority = -1)

    for i, (block, layer_past) in enumerate(zip(self.h, past_key_values)):

        if breakmodel:
            if i in range(ram_blocks):
                index1 = (i+1)%ram_blocks
                for param1,param2 in zip(self.h[index1].parameters(),self.h[(i-1)%ram_blocks].parameters()):
                    param1.data = param2.data
                for param1,param2 in zip(self.h[index1].parameters(),self.extrastorage[index1].parameters()):
                    with torch.cuda.stream(copystream):
                        torch.cuda.comm.broadcast(param2.data,out = [param1.data])

        attn_type = self.config.attention_layers[i]
        attn_mask = global_attention_mask

        if output_hidden_states:
            all_hidden_states = all_hidden_states + (hidden_states.cpu(),)

        if getattr(self.config, "gradient_checkpointing", False) and self.training:

            if use_cache:
                logger.warning(
                    "`use_cache=True` is incompatible with `config.gradient_checkpointing=True`. Setting "
                    "`use_cache=False`..."
                )
                use_cache = False

            def create_custom_forward(module):
                def custom_forward(*inputs):
                    # None for past_key_value
                    return module(*inputs, use_cache, output_attentions)

                return custom_forward

            outputs = torch.utils.checkpoint.checkpoint(
                create_custom_forward(block),
                hidden_states,
                None,
                attn_mask,
                head_mask[i],
            )
        else:
            outputs = block(
                hidden_states,
                layer_past=layer_past,
                attention_mask=attn_mask,
                head_mask=head_mask[i],
                use_cache=use_cache,
                output_attentions=output_attentions,
            )

        hidden_states = outputs[0]
        if use_cache is True:
            presents = presents + (outputs[1],)

        if output_attentions:
            all_self_attentions = all_self_attentions + (outputs[2 if use_cache else 1],)

        if breakmodel:
            if i in range(ram_blocks):
                torch.cuda.synchronize()
                torch.cuda.empty_cache()

    if breakmodel:
        del copystream

    torch.cuda.empty_cache()

    hidden_states = self.ln_f(hidden_states)

    hidden_states = hidden_states.view(*output_shape)
    # Add last hidden state
    if output_hidden_states:
        all_hidden_states = all_hidden_states + (hidden_states,)

    if not return_dict:
        return tuple(v for v in [hidden_states, presents, all_hidden_states, all_self_attentions] if v is not None)

    return BaseModelOutputWithPast(
        last_hidden_state=hidden_states,
        past_key_values=presents,
        hidden_states=all_hidden_states,
        attentions=all_self_attentions,
    )
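
The block-swapping loop in `new_forward` may be easier to follow as a schedule. The sketch below is a pure-Python illustration and not part of the commit: it only replays the index arithmetic from the loop (`(i+1) % ram_blocks` for the block being prefetched on the copy stream, `(i-1) % ram_blocks` for the block whose GPU tensors are reused as the destination). `prefetch_schedule` is a hypothetical helper name.

```python
def prefetch_schedule(ram_blocks):
    """Replay the index arithmetic of the breakmodel loop.

    For each RAM-resident block i being executed, return a tuple
    (i, prefetched, donor): `prefetched` is the block whose weights are
    copied to the GPU on the side stream, and `donor` is the block whose
    GPU tensors are reused as the destination buffers.
    """
    schedule = []
    for i in range(ram_blocks):
        prefetched = (i + 1) % ram_blocks
        donor = (i - 1) % ram_blocks
        schedule.append((i, prefetched, donor))
    return schedule


if __name__ == "__main__":
    # ram_blocks = 7 is the module-level default in breakmodel.py.
    for current, prefetched, donor in prefetch_schedule(7):
        print(f"run block {current}: prefetch block {prefetched} into block {donor}'s buffers")
```

Read this way, the loop behaves like a double buffer: while one RAM-resident block runs, the next one is copied into the buffers the previous block no longer needs, and the final step prefetches block 0 again, which appears to leave the model ready for the next forward pass.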

15
commandline.bat Normal file

@ -0,0 +1,15 @@
@echo off
cd %~dp0
TITLE CMD for KoboldAI Runtime
SET /P M=<loader.settings
IF %M%==1 GOTO drivemap
IF %M%==2 GOTO subfolder
:subfolder
call miniconda3\condabin\activate
cmd /k
:drivemap
subst K: miniconda3 >nul
call K:\python\condabin\activate
cmd /k


@ -7,9 +7,11 @@ dependencies:
- colorama
- flask-socketio
- pytorch
- cudatoolkit=11.1
- tensorflow-gpu
- python=3.8.*
- pip
- git
- pip:
- git+https://github.com/finetuneanon/transformers@gpt-neo-localattention3-rp-b
- git+https://github.com/finetuneanon/transformers@gpt-neo-localattention3-rp-b
- flask-cloudflared


@ -8,5 +8,10 @@ dependencies:
- colorama
- flask-socketio
- pytorch
- python=3.8.*
- cudatoolkit=11.1
- tensorflow-gpu
- transformers
- transformers
- pip
- pip:
- flask-cloudflared


@ -1,6 +1,7 @@
import tkinter as tk
from tkinter import filedialog
from os import getcwd, listdir, path
import os
import json
#==================================================================#
@ -54,19 +55,34 @@ def getdirpath(dir, title):
else:
return None
#==================================================================#
# Returns the path (as a string) to the given story by its name
#==================================================================#
def storypath(name):
return path.join(path.dirname(path.realpath(__file__)), "stories", name + ".json")
#==================================================================#
# Returns an array of dicts containing story files in /stories
#==================================================================#
def getstoryfiles():
list = []
for file in listdir(getcwd()+"/stories"):
for file in listdir(path.dirname(path.realpath(__file__))+"/stories"):
if file.endswith(".json"):
ob = {}
ob["name"] = file.replace(".json", "")
f = open(getcwd()+"/stories/"+file, "r")
js = json.load(f)
f = open(path.dirname(path.realpath(__file__))+"/stories/"+file, "r")
try:
js = json.load(f)
except:
print(f"Browser loading error: {file} is malformed or not a JSON file.")
f.close()
continue
f.close()
ob["actions"] = len(js["actions"])
try:
ob["actions"] = len(js["actions"])
except TypeError:
print(f"Browser loading error: {file} has incorrect format.")
continue
list.append(ob)
return list
@ -74,4 +90,22 @@ def getstoryfiles():
# Returns True if json file exists with requested save name
#==================================================================#
def saveexists(name):
return path.exists(getcwd()+"/stories/"+name+".json")
return path.exists(storypath(name))
#==================================================================#
# Delete save file by name; returns None if successful, or the exception if not
#==================================================================#
def deletesave(name):
try:
os.remove(storypath(name))
except Exception as e:
return e
#==================================================================#
# Rename save file; returns None if successful, or the exception if not
#==================================================================#
def renamesave(name, new_name):
try:
os.replace(storypath(name), storypath(new_name))
except Exception as e:
return e
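
`deletesave` and `renamesave` report failure by returning the exception instead of raising it, so a caller has to check the return value. The following is a self-contained sketch of that convention rather than the code from this commit: the functions are re-declared with an explicit `base_dir` parameter (an assumption, so the example can run against a temporary directory), and `demo` is a hypothetical helper.

```python
import os
import tempfile
from os import path


def storypath(base_dir, name):
    # Same layout as in the diff: <base_dir>/stories/<name>.json
    return path.join(base_dir, "stories", name + ".json")


def deletesave(base_dir, name):
    # Returns None on success, or the exception on failure.
    try:
        os.remove(storypath(base_dir, name))
    except Exception as e:
        return e


def renamesave(base_dir, name, new_name):
    # Returns None on success, or the exception on failure.
    try:
        os.replace(storypath(base_dir, name), storypath(base_dir, new_name))
    except Exception as e:
        return e


def demo():
    # Exercise both helpers against a throwaway stories folder.
    with tempfile.TemporaryDirectory() as base_dir:
        os.makedirs(path.join(base_dir, "stories"))
        with open(storypath(base_dir, "draft"), "w") as f:
            f.write('{"actions": []}')

        results = {
            "rename": renamesave(base_dir, "draft", "final"),
            "renamed_exists": path.exists(storypath(base_dir, "final")),
            "delete": deletesave(base_dir, "final"),
            "delete_missing": deletesave(base_dir, "final"),
        }
    return results


if __name__ == "__main__":
    for step, outcome in demo().items():
        print(step, "->", repr(outcome))
```

Because `os.replace` silently overwrites an existing destination, a caller that wants to protect existing saves would need to check `saveexists(new_name)` before renaming.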


@ -6,7 +6,7 @@ gensettingstf = [{
"min": 0.1,
"max": 2.0,
"step": 0.05,
"default": 1.0,
"default": 0.5,
"tooltip": "Randomness of sampling. High values can increase creativity but may make text less sensible. Lower values will make text more predictable but can become repetitious."
},
{
@ -14,11 +14,33 @@ gensettingstf = [{
"unit": "float",
"label": "Top p Sampling",
"id": "settopp",
"min": 0.1,
"min": 0.0,
"max": 1.0,
"step": 0.05,
"default": 1.0,
"tooltip": "Used to discard unlikely text in the sampling process. Lower values will make text more predictable but can become repetitious."
"default": 0.9,
"tooltip": "Used to discard unlikely text in the sampling process. Lower values will make text more predictable but can become repetitious. (Put this value on 1 to disable its effect)"
},
{
"uitype": "slider",
"unit": "int",
"label": "Top k Sampling",
"id": "settopk",
"min": 0,
"max": 100,
"step": 1,
"default": 0,
"tooltip": "Alternative sampling method, can be combined with top_p. (Put this value on 0 to disable its effect)"
},
{
"uitype": "slider",
"unit": "float",
"label": "Tail-free Sampling",
"id": "settfs",
"min": 0.0,
"max": 1.0,
"step": 0.05,
"default": 0.0,
"tooltip": "Alternative sampling method; it is recommended to disable top_p and top_k (set top_p to 1 and top_k to 0) if using this. 0.95 is thought to be a good value. (Put this value on 1 to disable its effect)"
},
{
"uitype": "slider",
@ -28,7 +50,7 @@ gensettingstf = [{
"min": 1.0,
"max": 2.0,
"step": 0.05,
"default": 1.0,
"default": 1.1,
"tooltip": "Used to penalize words that were already generated or belong to the context."
},
{
@ -39,7 +61,7 @@ gensettingstf = [{
"min": 16,
"max": 512,
"step": 2,
"default": 60,
"default": 80,
"tooltip": "Number of tokens the AI should generate. Higher numbers will take longer to generate."
},
{
@ -50,7 +72,7 @@ gensettingstf = [{
"min": 512,
"max": 2048,
"step": 8,
"default": 512,
"default": 1024,
"tooltip": "Max number of tokens of context to submit to the AI for sampling. Make sure this is higher than Amount to Generate. Higher values increase VRAM/RAM usage."
},
{
@ -72,7 +94,7 @@ gensettingstf = [{
"min": 1,
"max": 5,
"step": 1,
"default": 1,
"default": 3,
"tooltip": "Number of historic actions to scan for W Info keys."
},
{
@ -85,6 +107,17 @@ gensettingstf = [{
"step": 1,
"default": 1,
"tooltip": "Whether the prompt should be sent in the context of every action."
},
{
"uitype": "toggle",
"unit": "bool",
"label": "Adventure Mode",
"id": "setadventure",
"min": 0,
"max": 1,
"step": 1,
"default": 0,
"tooltip": "Turn this on if you are playing a Choose your Adventure model."
}]
gensettingsik =[{
@ -95,7 +128,7 @@ gensettingsik =[{
"min": 0.1,
"max": 2.0,
"step": 0.05,
"default": 1.0,
"default": 0.5,
"tooltip": "Randomness of sampling. High values can increase creativity but may make text less sensible. Lower values will make text more predictable but can become repetitious."
},
{
@ -103,12 +136,34 @@ gensettingsik =[{
"unit": "float",
"label": "Top p Sampling",
"id": "settopp",
"min": 0.1,
"min": 0.0,
"max": 1.0,
"step": 0.05,
"default": 1.0,
"default": 1.1,
"tooltip": "Used to discard unlikely text in the sampling process. Lower values will make text more predictable but can become repetitious."
},
{
"uitype": "slider",
"unit": "int",
"label": "Top k Sampling",
"id": "settopk",
"min": 0,
"max": 100,
"step": 1,
"default": 0,
"tooltip": "Alternative sampling method, can be combined with top_p."
},
{
"uitype": "slider",
"unit": "float",
"label": "Tail-free Sampling",
"id": "settfs",
"min": 0.0,
"max": 1.0,
"step": 0.05,
"default": 0.0,
"tooltip": "Alternative sampling method; it is recommended to disable (set to 0) top_p and top_k if using this. 0.95 is thought to be a good value."
},
{
"uitype": "slider",
"unit": "int",
@ -128,7 +183,7 @@ gensettingsik =[{
"min": 1,
"max": 5,
"step": 1,
"default": 1,
"default": 3,
"tooltip": "Number of historic actions to scan for W Info keys."
},
{
@ -141,6 +196,17 @@ gensettingsik =[{
"step": 1,
"default": 1,
"tooltip": "Whether the prompt should be sent in the context of every action."
},
{
"uitype": "toggle",
"unit": "bool",
"label": "Adventure Mode",
"id": "setadventure",
"min": 0,
"max": 1,
"step": 1,
"default": 0,
"tooltip": "Turn this on if you are playing a Choose your Adventure model."
}]
formatcontrols = [{
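
Each entry in `gensettingstf` and `gensettingsik` describes a slider or toggle through its `min`, `max`, `step`, and `default` keys. How the client enforces those bounds is not shown in this diff, so the following is only a sketch of one way to validate such an entry; `clamp_setting` and `default_in_range` are hypothetical helpers, not KoboldAI functions.

```python
def clamp_setting(setting, value):
    """Clamp value to [min, max] and snap it to the nearest step."""
    lo, hi, step = setting["min"], setting["max"], setting["step"]
    value = min(max(value, lo), hi)
    steps = round((value - lo) / step)
    snapped = min(lo + steps * step, hi)
    if setting["unit"] in ("int", "bool"):
        return int(round(snapped))
    # Round away floating-point noise introduced by the step arithmetic.
    return round(snapped, 10)


def default_in_range(setting):
    """Return True if the entry's default lies within its own bounds."""
    return setting["min"] <= setting["default"] <= setting["max"]


# Example entry shaped like the "Top p Sampling" slider in gensettingstf.
top_p = {
    "unit": "float",
    "id": "settopp",
    "min": 0.0,
    "max": 1.0,
    "step": 0.05,
    "default": 0.9,
}

if __name__ == "__main__":
    print(clamp_setting(top_p, 1.3))
    print(clamp_setting(top_p, 0.52))
    print(default_in_range(top_p))
```

Applied to the `gensettingsik` "Top p Sampling" entry in this diff, which lists a `default` of 1.1 next to a `max` of 1.0, `default_in_range` would return `False`.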


@ -12,6 +12,7 @@ echo.
SET /P B=Type the number of the desired option and then press ENTER:
Reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v "LongPathsEnabled" /t REG_DWORD /d "1" /f 2>nul
%~d0
cd %~dp0
if exist miniconda3\ (

20
notebook.bat Normal file

@ -0,0 +1,20 @@
@echo off
cd %~dp0
TITLE Jupyter for KoboldAI Runtime
SET /P M=<loader.settings
IF %M%==1 GOTO drivemap
IF %M%==2 GOTO subfolder
:subfolder
umamba.exe install --no-shortcuts -r miniconda3 -n base -c conda-forge jupyter
call miniconda3\condabin\activate
jupyter notebook
cmd /k
:drivemap
subst K: miniconda3 >nul
umamba.exe install --no-shortcuts -r K:\python\ -n base -c conda-forge jupyter
call K:\python\condabin\activate
jupyter notebook
subst K: /D
cmd /k


@ -1,18 +1,18 @@
@echo off
cd %~dp0
TITLE KoboldAI - Client
TITLE KoboldAI - Server
SET /P M=<loader.settings
IF %M%==1 GOTO drivemap
IF %M%==2 GOTO subfolder
:subfolder
call miniconda3\condabin\activate
python aiserver.py
python aiserver.py %*
cmd /k
:drivemap
subst K: miniconda3 >nul
call K:\python\condabin\activate
python aiserver.py
python aiserver.py %*
subst K: /D
cmd /k

156
readme.md Normal file

@ -0,0 +1,156 @@
# KoboldAI - Your gateway to GPT writing
This is a browser-based front-end for AI-assisted writing with multiple local & remote AI models. It offers the standard array of tools, including Memory, Author's Note, World Info, Save & Load, adjustable AI settings, formatting options, and the ability to import existing AI Dungeon adventures. You can also turn on Adventure mode and play the game like AI Dungeon Unleashed.
## Multiple ways to play
Stories can be played like a Novel or like a text adventure game, with an easy toggle to switch between the two gameplay styles. This makes KoboldAI both a writing assistant and a game. The way you play and how good the AI will be depend on the model or service you decide to use. Whether you want to use the free, fast power of Google Colab, your own high-end graphics card, an online service you have an API key for (like OpenAI or InferKit), or would rather run it more slowly on your CPU, you will be able to find a way to use KoboldAI that works for you.
### Adventure mode
By default KoboldAI runs in a generic mode optimized for writing, but with the right model you can play it like AI Dungeon without any issues. You can enable Adventure mode in the settings and then bring your own prompt, try generating a random prompt, or download one of the prompts available at [prompts.aidg.club](https://prompts.aidg.club).
The gameplay is slightly different from AI Dungeon because we adopted the style of the Unleashed fork: we do not automatically adapt your sentences behind the scenes, which gives you full control over all the characters. This means you can more reliably control characters that are not you.
As a result, what you need to type is slightly different. In AI Dungeon you would type ***take the sword***, while in KoboldAI you would type a full sentence such as ***You take the sword***. This works best with the word You instead of I.
To speak, simply type: *You say "We should probably gather some supplies first"*
Just typing the quote might work, but the AI is at its best when you specify who does what in your commands.
If you want to do this with your friends, we advise using You for the main character and referring to the other characters by name if you are playing on a model trained for Adventures. These models assume there is a You in the story. This mode usually does not perform well on Novel models because they do not know how to handle the input; those are best used for regular story writing where you take turns with the AI.
### Writing assistant
If you want to use KoboldAI as a writing assistant, this is best done in the regular mode with a model optimized for Novels. These models do not assume that there is a You character and focus on Novel-like writing. For writing, they will often give you better results than Adventure or Generic models. That said, if you give it a good introduction to the story, a large generic model like 6B can be used when a more specific model is not available for what you wish to write. You can also try models that are not specific to what you wish to do, for example an NSFW Novel model for an SFW story if an SFW model is unavailable. You will have to correct the model more often because of its bias, but it can still produce good enough results if it is familiar enough with your topic.
## Play KoboldAI online for free on Google Colab (The easiest way to play)
We provide multiple ready made versions to get you going, click on the name for a link to the specific version. These run entirely on Google's Servers and will automatically upload saves to your Google Drive if you choose to manually save a story. Each version has slightly different instructions on how to use them (Many need some space on your google drive to run, others may need some manual steps) that are listed on the page.
TPU editions work on any configuration of TPU Google gives out at the time of writing. GPU editions are subject to a GPU lottery and may crash on launch if you are unlucky (especially if a lot of users are using up the good GPUs or you have been using Colab often).
[Click here to open the Recommended version](https://henk.tech/colabkobold)
| Version | Model | Size | Style | Description |
| ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | --------------- | ------------------------------------------------------------ |
| [Adventure 6B](https://colab.research.google.com/drive/1vdAsD0xCc_YsAXqBUxb_QAwPOXkFJtxm?usp=sharing#sandboxMode=true) | [gpt-j-6b-adventure-jax](https://wandb.ai/ve-forbryderne/adventure/runs/carol-data/files/models) by ve_forbryderne (Download the -hf version if you plan to run this locally) | 6B TPU | Adventure | This is the Recommended version for AI Dungeon players; it is effectively a Free Griffin but with more control. This Colab edition provides better memory than Griffin would have given you, allowing for a more coherent experience. And while it will still generate characters like The Great Litch Lord that AI Dungeon players are familiar with, it was trained on stories beyond AI Dungeon and is more balanced in its approaches. This is a TPU edition so it can fit a lot in memory. |
| [Skein](https://colab.research.google.com/drive/1ZAKgkSyyfiZN87npKYaRM8vL4OF2Btfg?usp=sharing#sandboxMode=true) | gpt-j-6b-skein-jax by ve_forbryderne (Download the -hf version if you plan to run this locally) | 6B TPU | Novel/Adventure | Skein is a hybrid between a Novel model and the Adventure model. Because of this it needs a bit more context about the writing style (Needing a few retries in the random story generator if you use this). It was trained on both Light Novels and choose your own adventure stories alongside extra information to help it understand story themes better. It is recommended to play this with Adventure mode enabled to prevent it from doing "Actions" even if you wish to use it for Novel writing. If you wish to use it for Novel writing you can do this by toggling the input to Story. |
| [Generic 6B TPU](https://colab.research.google.com/drive/1pG9Gz9PrqklNBESPNaXvfctMVnvwf_Q8#forceEdit=true&sandboxMode=true&scrollTo=jcxnaOk5Th4x) | [Original GPT-6-JAX Slim](https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.gz) (Requires a TPU and does not work locally) | 6B TPU | Novel | The recommended model if you want a generic experience. This model is not optimized for anything in particular and works best when you give it a longer introduction. Make sure to include examples for the AI to learn from and write the first part of the story entirely yourself. Then it should be able to learn from your style and continue from there. Very sensitive to a high temp because it knows webpages and code, so when configured incorrectly it will easily end a story with 'Rate my blogpost, follow me on twitter' and the like. |
| [Horni](https://colab.research.google.com/drive/1QwjkK_JeK9aYEkyM_6nrJXQARFMnBDmG?usp=sharing#sandboxMode=true) (Formerly Novel/NSFW) | [GPT-Neo-2.7B-Horni](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-horni.tar) by finetune | 2.7B GPU | Novel | One of the oldest models in our collection, tuned on Literotica to produce a Novel style model optimized towards NSFW content. Can still be used for SFW stories but will have a bias towards NSFW content. Because this is an older 2.7B model it is only compatible as a GPU instance. Most GPUs in Colab are powerful enough to run this well but it will crash if you get something weak like an Nvidia P7. |
| [Picard](https://colab.research.google.com/drive/1VNVKtbPaTcmkQzy8bEQkd9SUiUJBdbEL?usp=sharing#sandboxMode=true) | [Picard](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-picard.7z) by Mr Seeker | 2.7B GPU | Novel | Picard is a model trained for SFW Novels based on GPT-Neo-2.7B. It is focused on Novel style writing without the NSFW bias. While the name suggests a sci-fi model this model is designed for Novels of a variety of genres. Most GPUs in Colab are powerful enough to run this well but it will crash if you get something weak like an Nvidia P7. |
| [Shinen](https://colab.research.google.com/drive/1-7Lkj-np2DaSnmq1OdPYkel6W2rh4E-0?usp=sharing#sandboxMode=true) | [Shinen](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-shinen.7z) by Mr Seeker | 2.7B GPU | Novel | Shinen is an alternative to the Horni model designed to be more explicit. If Horni is too tame for you Shinen might produce better results. While it is a Novel model it is unsuitable for SFW stories due to its heavy NSFW bias. Shinen will not hold back. Most GPUs in Colab are powerful enough to run this well but it will crash if you get something weak like an Nvidia P7. |
## Install KoboldAI on your own computer
KoboldAI has a large number of dependencies you will need to install on your computer. Unfortunately, Python does not make it easy for us to provide instructions that work for everyone. The instructions below will work on most computers, but conflicts can occur if you have multiple versions of Python installed.
### Downloading the latest version of KoboldAI
KoboldAI is a rolling release on our GitHub; the code you see is also the game. The easiest way to download the game is by clicking on the green Code button at the top of the page and clicking Download ZIP.
### Installing KoboldAI on Windows 10 or higher using the KoboldAI Runtime Installer
1. Extract the .zip to a location you wish to install KoboldAI, you will need roughly 20GB of free space for the installation (this does not include the models).
2. Open install_requirements.bat as administrator.
3. Choose either the Finetuneanon or the Regular version of transformers (Finetuneanon works better for GPU players but breaks CPU mode, only use this version if you have a modern Nvidia GPU with enough VRAM for the model you wish to run).
4. You will now be asked to choose the installation mode, we **strongly** recommend the Temporary K: drive option for anyone who does not already have a K: drive on their computer. This option eliminates most installation issues and also makes KoboldAI portable. The K: drive will be gone after a reboot and will automatically be recreated each time you play KoboldAI.
5. The installation will now automatically install its requirements. Some stages may appear to freeze; do not close the installer until it asks you to press a key. Before pressing a key to exit the installer, please check whether any errors occurred. Most problems with the game crashing are related to installation/download errors. Disabling your antivirus can help if you get errors.
6. Use play.bat to play the game.
### Manual installation / Linux / Mac
We can not provide a step by step guide for manual installation due to the vast differences between the existing software configuration and the systems of our users.
If you would like to manually install KoboldAI you will need some python/conda package management knowledge to manually do one of the following steps :
1. Use our bundled environments files to install your own conda environment, this should also automatically install CUDA.
2. If you do not want to use conda install the requirements listed in requirements.txt and make sure that CUDA is properly installed.
3. Adapt and use our bundled docker files to create your own KoboldAI docker instance.
### Using an AMD GPU on Linux
AMD GPUs have terrible compute support. This will currently not work on Windows and will only work for a select few GPUs on Linux. [You can find a list of the compatible GPUs here](https://github.com/RadeonOpenCompute/ROCm#Hardware-and-Software-Support). Any GPU that is not listed is guaranteed not to work with KoboldAI, and we will not be able to provide proper support on GPUs that are not compatible with the versions of ROCm we require. This guide requires that you have already followed the appropriate steps to configure both [ROCm](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html) and [Docker](https://docs.docker.com/engine/install/), and is for advanced users only.
1. Make sure you have installed both the latest version of [Docker](https://docs.docker.com/engine/install/), docker-compose and [ROCm](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html) on your system and have configured your user to have access to the Docker group (Sudo can interfere with the dialogues).
2. Assign our play-rocm.sh file execute permissions (chmod +x play-rocm.sh).
3. Run our play-rocm.sh file. It should now automatically install and create a suitable runtime for KoboldAI with AMD support and directly run the game afterwards. For X11 forwarding support you will need to run this as sudo at least once on the local machine. Otherwise, use the command line options to load KoboldAI if you are playing remotely.
4. Currently, models automatically downloaded by the game are discarded on exit in the Docker version. It is strongly recommended that you manually download a model and load it using the custom model features to prevent unnecessary downloads.
If you hit strange errors with the ROCm version where it fails during the installation, be sure you are running the latest version of Docker and docker-compose. Some versions will fail on the root elevation or lack the appropriate formats.
### Troubleshooting
There are multiple things that can go wrong with the way Python handles its dependencies. Unfortunately, we do not have direct step by step solutions for every scenario, but there are a few common solutions you can try.
#### ModuleNotFoundError
This is ALWAYS either a download/installation failure or a conflict with other versions of Python. It is very common if you chose the subfolder option during the installation while putting KoboldAI in a location that has spaces in the path. It can also happen when an antivirus sandboxes the installation or otherwise interferes with the downloads, on systems with low disk space, or when your operating system was not configured for Long File Paths (the installer will do this on Windows 10 and higher if you run it as administrator; anything other than Windows 10 is not supported by our installers).
Another reason the installation may have failed is conflicting installations of Python on your machine. If you press the Windows Key + R and enter %appdata% in the Run dialog, it will open the folder where Python installs dependencies on some systems. If you have a Python folder in this location, rename this folder and try to run the installer again. It should now no longer get stuck on existing dependencies. Try the game and see if it works well. If it does, you can try renaming the folder back to see if it remains functional.
The third reason the installation may have failed is that you have conda/mamba on your system for other reasons. In that case we recommend either removing your existing installations of python/conda if you do not need them and testing our installer again, or using conda itself with our bundled environment files to create the runtime manually. **Keep in mind that if you go the manual route you should NEVER use play.bat but should instead run aiserver.py directly**.
In general, the fewer versions of Python you have on your system, the higher your chances of it installing correctly. We are consistently trying to mitigate these installation conflicts in our installers, but for some users we cannot yet avoid all conflicts.
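Several of the causes listed above can be checked before running the installer. The following is a minimal sketch using only the Python standard library; it is not part of KoboldAI, the function names are made up for illustration, and the 20GB threshold is simply the rough figure quoted in the Windows installation steps.

```python
import os
import shutil

REQUIRED_FREE_GB = 20  # rough figure quoted in the Windows install steps


def path_has_spaces(path: str) -> bool:
    """Return True if the absolute install path contains a space."""
    return " " in os.path.abspath(path)


def free_space_gb(path: str) -> float:
    """Free disk space, in GB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def appdata_python_dir():
    """Return the Python folder inside %appdata% if it exists, else None."""
    appdata = os.environ.get("APPDATA")  # normally only set on Windows
    if not appdata:
        return None
    candidate = os.path.join(appdata, "Python")
    return candidate if os.path.isdir(candidate) else None


def preflight(path: str) -> list:
    """Collect human-readable warnings for the given install folder."""
    warnings = []
    if path_has_spaces(path):
        warnings.append("Install path contains spaces")
    if free_space_gb(path) < REQUIRED_FREE_GB:
        warnings.append("Less than 20GB of free disk space")
    if appdata_python_dir():
        warnings.append("Existing Python folder found in %appdata%")
    return warnings


if __name__ == "__main__":
    for warning in preflight(os.getcwd()):
        print("Warning:", warning)
```

A warning from this script does not prove the installation will fail; it only points at the conditions this section names as common culprits.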
#### GPU not found errors
GPU not found errors can have several causes: you do not have a suitable Nvidia GPU (it needs Compute Capability 5.0 or higher to be able to play KoboldAI), your Nvidia GPU is supported by KoboldAI but is not supported by the latest version of CUDA, your Nvidia GPU is not yet supported by the latest version of CUDA, or you have a dependency conflict like the ones mentioned above.
Like with Python version conflicts we recommend uninstalling CUDA from your system if you have manually installed it and do not need it for anything else and trying again. If your GPU needs CUDA10 to function open environments\finetuneanon.yml and add a line that says - cudatoolkit=10.2 underneath dependencies: . After this you can run the installer again (Pick the option to delete the existing files) and it will download a CUDA10 compatible version.
If you do not have a suitable Nvidia GPU that can run on CUDA10 or higher and that supports Compute Capability 5.0 or higher, we cannot help you get the game detected on the GPU unless you are following our ROCm guide with a compatible AMD GPU.
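To see what PyTorch itself detects, a short check like the one below can help narrow things down. It is a sketch that assumes you run it inside the same Python environment KoboldAI uses; `meets_minimum` and `describe_gpu` are illustrative names, while `torch.cuda.is_available`, `torch.cuda.get_device_capability` and `torch.cuda.get_device_name` are standard PyTorch calls.

```python
MIN_COMPUTE_CAPABILITY = (5, 0)  # minimum quoted in this section


def meets_minimum(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare a (major, minor) compute capability against the minimum."""
    return tuple(capability) >= tuple(minimum)


def describe_gpu():
    """Report what PyTorch can see; returns a short status string."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "PyTorch cannot see a CUDA GPU"
    capability = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    verdict = "supported" if meets_minimum(capability) else "too old"
    return f"{name}: compute capability {capability[0]}.{capability[1]} ({verdict})"


print(describe_gpu())
```

If this reports that no CUDA GPU is visible on a machine with a suitable card, the cause is more likely a CUDA or dependency conflict than the hardware.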
#### "LayerNormKernelImpl" not implemented for 'Half'
This error only occurs when you are trying to run a model on the CPU mode while Finetuneanon's version of Transformers is installed. If you want/need to use the CPU mode use the install_requirements.bat file with the Official Transformers option and choose to delete all existing files.
#### vocab.json / config.json is not found error
If you get these errors you either did not select the correct folder for your custom model or the model you have downloaded is not (yet) compatible with KoboldAI. There exist a few models out there that are compatible and provide a pytorch_model.bin file but do not ship all the required files. In this case try downloading a compatible model of the same kind (For example another GPT-Neo if you downloaded a GPT-Neo model) and replace the pytorch_model.bin file with the one you are trying to run. Chances are this will work fine.
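A quick way to tell which of the two cases applies is to list what is missing from the folder you selected. This sketch checks only the three file names mentioned in this section; that list is an assumption for illustration, and a given model may need additional files.

```python
import os

# File names taken from this section; real models may require more.
EXPECTED_FILES = ("config.json", "vocab.json", "pytorch_model.bin")


def missing_model_files(model_dir: str, expected=EXPECTED_FILES) -> list:
    """Return the expected files that are absent from `model_dir`."""
    return [
        name
        for name in expected
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

For example, `missing_model_files("path/to/your/model")` returns an empty list when all three files are present. If every file is reported missing, the wrong folder was probably selected.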
## KoboldAI Compatible Models
The models listed in the KoboldAI menu are generic models based on the Huggingface service, meant to easily get you going. For higher quality models and fully offline use you will need to manually download a suitable model for your style. Below are some of the models the community has made available; all have been tested to be compatible with KoboldAI. The model you pick will be the brain of the AI.
| **Model** | Type | **(V)RAM** | Repetition Penalty | Description |
| ------------------------------------------------------------ | --------------------------------- | ---------- | ------------------ | ------------------------------------------------------------ |
| [gpt-j-6b-adventure-jax-hf](https://api.wandb.ai/files/ve-forbryderne/adventure/carol-data/models/gpt-j-6b-adventure-hf.7z) | Adventure / 6B / Neo Custom | 16GB | 1.2 | This model has been trained on the AI Dungeon set with additional stories thrown in. It is the most well rounded AI Dungeon like model and can be seen as an improved Griffin. If you wish to play KoboldAI like AI Dungeon this is the one to pick. It works great with the random story generator if your temp is 0.5 . |
| [gpt-j-6b-skein-jax-hf](https://api.wandb.ai/files/ve-forbryderne/skein/files/gpt-j-6b-skein-hf.7z) | Adventure Novel / 6B / Neo Custom | 16GB | 1.1 | A hybrid of a few different datasets aimed to create a balanced story driven experience. If the adventure model is too focused on its own adventures and you want something a bit more generic this is the one for you. This model understands tags and adventure mode but can also be used as a writing assistant for your Novel. It's a good middle ground between a finetuned model and a generic model. It needs more guidance than some of the other models do making it less suitable for random story generation, but still focusses on writing rather than websites or code. If you want to use a model for existing story ideas this is a great choice. |
| [gpt-neo-2.7B-aid](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-aid.7z) | Adventure / 2.7B / Neo Custom | 8GB | 2.0 | This is one of the closest replications of the original AI Dungeon Classic model. Tuned on the same data that got uploaded alongside AI Dungeon. In KoboldAI we noticed this model performs better than the conversions of the original AI Dungeon model. It has all the traits you expect of AI Dungeon Classic while not having as many artifacts as this model was trained specifically for KoboldAI. Must be played with Adventure mode enabled to prevent it from doing actions on your behalf. |
| [gpt-neo-2.7B-horni](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-horni.tar) | Novel / 2.7B / Neo Custom | 8GB | 2.0 | One of the best novel models available for 2.7B focused on NSFW content. This model trains the AI to write in a story like fashion using a very large collection of Literotica stories. It is one of the original finetuned models for 2.7B. |
| [gpt-neo-2.7B-horni-ln](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-horni-ln.7z) | Novel / 2.7B / Neo Custom | 8GB | 2.0 | This model is much like the one above, but has been additionally trained on regular light novels. More likely to go SFW and is more focused towards themes found in these light novels over general cultural references. This is a good model for Novel writing especially if you want to add erotica to the mix. |
| [gpt-neo-2.7B-picard](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-picard.7z) | Novel / 2.7B / Neo Custom | 8GB | 2.0 | Picard is another Novel model, this time exclusively focused on SFW content of various genres. Unlike the name suggests this goes far beyond Star Trek stories and is not exclusively sci-fi. |
| [gpt-neo-2.7B-shinen](https://storage.henk.tech/KoboldAI/gpt-neo-2.7B-shinen.7z) | Novel / 2.7B / Neo Custom | 8GB | 2.0 | The most NSFW of them all, Shinen WILL make things sexual. This model will assume that whatever you are doing is meant to be a sex story and will sexualize constantly. It is designed for people who find Horni too tame. It was trained on SexStories instead of Literotica and was trained on tags making it easier to guide the AI to the right context. |
| [GPT-J-6B (Converted)](https://storage.henk.tech/KoboldAI/gpt-j-6b.7z) | Generic / 6B / Neo Custom | 16GB | 1.1 | This is the basis for all the other GPT-J-6B models, it has been trained on The Pile and is an open alternative for GPT Curie. Because it is a generic model it is not particularly good at anything and needs a long introduction to understand what you want to do. It is however the most flexible because it has no bias. If you want to do something that has no specific model available, such as writing a webpage article or coding this can be a good one to try. This specific version was converted by our community to be able to run as a GPT-Neo model on your GPU. |
| [AID-16Bit](https://storage.henk.tech/KoboldAI/aid-16bit.zip) | Adventure / 1.5B / GPT-2 Custom | 4GB | 2.0 | The original AI Dungeon Classic model converted to Pytorch and then converted to a 16-bit Model making it half the size. |
| [model_v5_pytorch](https://storage.henk.tech/KoboldAI/model_v5_pytorch.zip) (AI Dungeon's Original Model) | Adventure / 1.5B / GPT-2 Custom | 8GB | 2.0 | This is the original AI Dungeon Classic model converted to the Pytorch format compatible with AI Dungeon Clover and KoboldAI. We consider this model inferior to the GPT-Neo version because it has more artifacting due to its conversion. This is however the most authentic you can get to AI Dungeon Classic. |
| [Novel 774M](https://storage.henk.tech/KoboldAI/Novel%20model%20774M.rar) | Novel / 774M / GPT-2 Custom | 4GB | 2.0 | Novel 774M is made by the AI Dungeon Clover community, because of its small size and novel bias it is more suitable for CPU players that want to play with speed over substance or players who want to test a GPU with a low amount of VRAM. These performance savings are at the cost of story quality and you should not expect the kind of in depth story capabilities that the larger models offer. It was trained for SFW stories. |
| [Smut 774M](https://storage.henk.tech/KoboldAI/Smut%20model%20774M%2030K.rar) | Novel / 774M / GPT-2 Custom | 4GB | 2.0 | The NSFW version of the above, it's a smaller GPT-2 based model made by the AI Dungeon Clover community. Gives decent speed on a CPU at the cost of story quality like the other 774M models. |
| [Mia](https://storage.henk.tech/KoboldAI/Mia.7z) | Adventure / 125M / Neo Custom | 1GB | 2.0 | Mia is the smallest Adventure model, it runs at very fast speeds on the CPU which makes it a good testing model for developers who do not have GPU access. Because of its small size it will constantly attempt to do actions on behalf of the player and it will not produce high quality stories. If you just need a small model for a quick test, or if you want to take the challenge of trying to run KoboldAI entirely on your phone this would be an easy model to use due to its small RAM requirements and fast (loading) speeds. |
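
The (V)RAM column above can be used to shortlist models for your hardware. The sketch below copies the memory figures from the table into a dictionary and filters on them; the dictionary and `models_that_fit` are illustrative only and are not part of KoboldAI.

```python
# (V)RAM requirements in GB, copied from the table above.
MODEL_VRAM_GB = {
    "gpt-j-6b-adventure-jax-hf": 16,
    "gpt-j-6b-skein-jax-hf": 16,
    "gpt-neo-2.7B-aid": 8,
    "gpt-neo-2.7B-horni": 8,
    "gpt-neo-2.7B-horni-ln": 8,
    "gpt-neo-2.7B-picard": 8,
    "gpt-neo-2.7B-shinen": 8,
    "GPT-J-6B (Converted)": 16,
    "AID-16Bit": 4,
    "model_v5_pytorch": 8,
    "Novel 774M": 4,
    "Smut 774M": 4,
    "Mia": 1,
}


def models_that_fit(available_gb: float) -> list:
    """Return the models whose listed requirement fits in `available_gb`."""
    return [
        name
        for name, needed in MODEL_VRAM_GB.items()
        if needed <= available_gb
    ]


print(models_that_fit(4))
```

The figures are the table's own estimates, so treat the result as a starting point rather than a guarantee that a model will load.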
## Contributors
This project contains work from the following contributors :
- The Gantian - Creator of KoboldAI, has created most features such as the interface, the different AI model / API integrations and in general the largest part of the project.
- VE FORBRYDERNE - Contributed many features such as the Editing overhaul, Adventure Mode, expansions to the world info section, breakmodel integration and much more.
- Henk717 - Contributed the installation scripts, this readme, random story generator, the docker scripts, the foundation for the commandline interface and other smaller changes as well as integrating multiple parts of the code of different forks to unite it all. Not all code Github attributes to Henk717 is by Henk717 as some of it has been integrations of other people's work. We try to clarify this in the contributors list as much as we can.
- Frogging101 - top_k / tfs support
- UWUplus (Ralf) - Contributed storage systems for community colabs, as well as cleaning up and integrating the website dependencies/code better. He is also the maintainer of flask-cloudflared which we use to generate the cloudflare links.
- Javalar - Initial Performance increases on the story_refresh
- LexSong - Initial environment file adaptation for conda that served as a basis for the install_requirements.bat overhaul.
- Arrmansa - Breakmodel support for other projects that served as a basis for VE FORBRYDERNE's integration.
As well as various Model creators who will be listed near their models, and all the testers who helped make this possible!
Did we miss your contribution? Feel free to issue a commit adding your name to this list.
## License
KoboldAI is licensed under an AGPL license. In short, this means that it can be used by anyone for any purpose. However, if you decide to make a publicly available instance, your users are entitled to a copy of the source code including all modifications that you have made (which needs to be available through an interface such as a button on your website). You may also not distribute this project in a form that does not contain the source code (such as compiling / encrypting the code and distributing this version without also distributing the source code that includes the changes that you made). You are allowed to distribute this in a closed form if you also provide a separate archive with the source code.
umamba.exe is bundled for convenience because we observed that many of our users had trouble with command line download methods. It is not part of our project and does not fall under the AGPL license. It is licensed under the BSD-3-Clause license.

View File

@ -1,80 +0,0 @@
Thanks for checking out the KoboldAI Client! Get support and updates on the subreddit:
https://www.reddit.com/r/KoboldAI/
[ABOUT]
This is a browser-based front-end for AI-assisted writing with multiple local & remote AI models.
It offers the standard array of tools, including Memory, Author's Note, World Info, Save & Load,
adjustable AI settings, formatting options, and the ability to import existing AI Dungeon adventures.
Current UI Snapshot: https://imgur.com/mjk5Yre
For local generation, KoboldAI uses Transformers (https://huggingface.co/transformers/) to interact
with the AI models. This can be done either on CPU, or GPU with sufficient hardware. If you have a
high-end GPU with sufficient VRAM to run your model of choice, see
(https://www.tensorflow.org/install/gpu) for instructions on enabling GPU support.
Transformers/Tensorflow can still be used on CPU if you do not have high-end hardware, but generation
times will be much longer. Alternatively, KoboldAI also supports utilizing remotely-hosted models.
The currently supported remote APIs are InferKit and Google Colab, see the dedicated sections below
for more info on these.
[SETUP]
1. Install a 64-bit version of Python.
(Development was done on 3.7, I have not tested newer versions)
Windows download link: https://www.python.org/ftp/python/3.7.9/python-3.7.9-amd64.exe
2. When installing Python make sure "Add Python to PATH" is selected.
(If pip isn't working, run the installer again and choose Modify to choose Optional features.)
3. Run install_requirements.bat.
(This will install the necessary python packages via pip)
4. Run play.bat
5. Select a model from the list. Flask will start and give you a message that it's ready to connect.
6. Open a web browser and enter http://127.0.0.1:5000/
[ENABLE COLORS IN WINDOWS 10 COMMAND LINE]
If you see strange numeric tags in the console output, then your console of choice does not have
color support enabled. On Windows 10, you can enable color support by launching the registry editor
and adding the REG_DWORD key VirtualTerminalLevel to Computer\HKEY_CURRENT_USER\Console and setting
its value to 1.
[ENABLE GPU FOR SUPPORTED VIDEO CARDS]
1. Install NVidia CUDA toolkit from https://developer.nvidia.com/cuda-10.2-download-archive
2. Visit PyTorch's website(https://pytorch.org/get-started/locally/) and select Pip under "Package"
and your version of CUDA under "Compute Platform" (I linked 10.2) to get the pip3 command.
3. Copy and paste pip3 command into command prompt to install torch with GPU support
Be aware that when using GPU mode, inference will be MUCH faster but if your GPU doesn't have enough
VRAM to load the model it will crash the application.
[IMPORT AI DUNGEON GAMES]
To import your games from AI Dungeon, first grab CuriousNekomimi's AI Dungeon Content Archive Toolkit:
https://github.com/CuriousNekomimi/AIDCAT
Follow the video instructions for getting your access_token, and run aidcat.py in command prompt.
Choose option [1] Download your saved content.
Choose option [2] Download your adventures.
Save the JSON file to your computer using the prompt.
Run KoboldAI, and after connecting to the web GUI, press the Import button at the top.
Navigate to the JSON file exported from AIDCAT and select it. A prompt will appear in the GUI
presenting you with all Adventures scraped from your AI Dungeon account.
Select an Adventure and click the Accept button.
[HOST GPT-NEO ON GOOGLE COLAB]
If your computer does not have an 8GB GPU to run GPT-Neo locally, you can now run a Google Colab
notebook hosting a GPT-Neo-2.7B model remotely and connect to it using the KoboldAI client.
See the instructions on the Colab at the link below:
https://colab.research.google.com/drive/1uGe9f4ruIQog3RLxfUsoThakvLpHjIkX?usp=sharing
[FOR INFERKIT INTEGRATION]
If you would like to use InferKit's Megatron-11b model, sign up for a free account on their website.
https://inferkit.com/
After verifying your email address, sign in and click on your profile picture in the top right.
In the drop down menu, click "API Key".
On the API Key page, click "Reveal API Key" and copy it. When starting KoboldAI and selecting the
InferKit API model, you will be asked to paste your API key into the terminal. After entering,
the API key will be stored in the client.settings file for future use.
You can see your remaining budget for generated characters on their website under "Billing & Usage".

1
remote-play.bat Normal file
View File

@ -0,0 +1 @@
play --remote %*

View File

@ -1,6 +1,7 @@
transformers == 4.5.1
git+https://github.com/finetuneanon/transformers@gpt-neo-localattention3-rp-b
tensorflow-gpu
Flask == 1.1.2
Flask-SocketIO == 5.0.1
requests == 2.25.1
torch == 1.8.1
torch == 1.8.1
flask-cloudflared

File diff suppressed because it is too large Load Diff

View File

@ -6,6 +6,20 @@ chunk {
color: #ffffff;
}
#gametext.adventure action {
color: #9ff7fa;
font-weight: bold;
}
chunk[contenteditable="true"]:focus, chunk[contenteditable="true"]:focus * {
color: #cdf !important;
font-weight: normal !important;
}
chunk, chunk * {
outline: 0px solid transparent;
}
#topmenu {
background-color: #337ab7;
padding: 10px;
@ -53,7 +67,7 @@ chunk {
}
#gamescreen {
height: 500px;
height: 490px;
margin-top: 10px;
padding: 10px;
display: flex;
@ -72,6 +86,7 @@ chunk {
#gametext {
max-height: 100%;
width: 100%;
word-wrap: break-word;
}
#seqselmenu {
@ -97,12 +112,21 @@ chunk {
margin-left: 20px;
}
#inputrow.show_mode {
grid-template-columns: 7% 83% 10%;
}
#inputrow {
margin-top: 10px;
padding: 0px;
width: 100%;
display: grid;
grid-template-columns: 90% 10%;
grid-template-columns: 0% 90% 10%;
}
#inputrowmode {
position: relative;
padding-right: 0px;
}
#inputrowleft {
@ -121,6 +145,13 @@ chunk {
color: #ffffff;
}
#btnmode {
width: 100%;
height: 100%;
overflow: auto;
overflow-x: hidden;
}
#btnsend {
width: 100%;
height: 100%;
@ -163,7 +194,7 @@ chunk {
position: absolute;
top: 0px;
left: 0px;
z-index: 1;
z-index: 3;
width: 100%;
height: 100%;
background-color: rgba(0,0,0,0.5);
@ -240,12 +271,6 @@ chunk {
margin-top: 200px;
}
#saveasoverwrite {
color: #ff9900;
font-weight: bold;
text-align: center;
}
#loadpopup {
width: 500px;
background-color: #262626;
@ -260,6 +285,18 @@ chunk {
}
}
#loadpopupdelete {
width: 350px;
background-color: #262626;
margin-top: 200px;
}
#loadpopuprename {
width: 350px;
background-color: #262626;
margin-top: 200px;
}
#loadlistcontent {
height: 325px;
overflow-y: scroll;
@ -271,6 +308,12 @@ chunk {
margin-top: 200px;
}
#rspopup {
width: 800px;
background-color: #262626;
margin-top: 200px;
}
/*================= Classes =================*/
.aidgpopupcontent {
@ -282,6 +325,12 @@ chunk {
text-align: center;
}
.dialogheader {
padding: 10px 40px 10px 40px;
color: #737373;
text-align: center;
}
.anotelabel {
font-size: 10pt;
color: #ffffff;
@ -291,15 +340,36 @@ chunk {
width: 100px;
}
.box {
border-radius: 5px;
border: 1px solid #646464;
padding: 4px;
background: #373737;
}
.box-label {
color: #ffffff;
padding-left: 10px;
padding-right: 10px;
padding-bottom: 5px;
padding-top: 5px;
display: inline-block;
}
.chunkhov:hover {
color: #c0fc51;
cursor: pointer;
}
.colorfade {
.chunkhov:hover > action {
color: #00fa00;
}
.colorfade, .colorfade * {
-moz-transition:color 1s ease-in;
-o-transition:color 1s ease-in;
-webkit-transition:color 1s ease-in;
-o-transition:color 1s ease-in;
-webkit-transition:color 1s ease-in;
transition:color 1s ease-in;
}
.color_orange {
@ -339,8 +409,17 @@ chunk {
text-decoration: none;
}
.edit-flash, .edit-flash * {
color: #3bf723 !important;
}
.flex {
display: flex;
align-items: center;
}
.flex-push-right {
margin-left: auto;
}
.formatcolumn {
@ -376,21 +455,21 @@ chunk {
}
.helpicon {
display: inline-block;
font-family: sans-serif;
font-weight: bold;
text-align: center;
width: 2.2ex;
height: 2.4ex;
font-size: 1.4ex;
line-height: 1.8ex;
border-radius: 1.2ex;
margin-right: 4px;
padding: 1px;
color: #295071;
background: #ffffff;
border: 1px solid white;
text-decoration: none;
display: inline-block;
font-family: sans-serif;
font-weight: bold;
text-align: center;
width: 2.2ex;
height: 2.4ex;
font-size: 1.4ex;
line-height: 1.8ex;
border-radius: 1.2ex;
margin-right: 4px;
padding: 1px;
color: #295071;
background: #ffffff;
border: 1px solid white;
text-decoration: none;
}
.helpicon:hover {
@ -426,22 +505,85 @@ chunk {
text-align: right;
}
.loadlistheader {
padding-left: 10px;
.layer-container {
display: grid;
grid-template-columns: 80% 20%;
}
.layer-bottom {
grid-area: 1/1;
z-index: 0;
}
.layer-top {
grid-area: 1/1;
z-index: 2;
}
.icon-container {
position: relative;
}
.constant-key-icon {
position: absolute !important;
top: 5px !important;
right: 5px !important;
z-index: 1;
transform: rotate(20deg);
-moz-transform: rotate(20deg);
-webkit-transform: rotate(20deg);
-ms-transform: rotate(20deg);
-o-transform: rotate(20deg);
opacity: 20%;
}
*:hover > .constant-key-icon {
opacity: 40%;
}
.constant-key-icon:hover {
opacity: 65%;
cursor: pointer;
}
.constant-key-icon-enabled {
color: #3bf723;
opacity: 65%
}
*:hover > .constant-key-icon-enabled {
opacity: 65%;
}
.constant-key-icon-enabled:hover {
opacity: 100%
}
.constant-key-icon-clickthrough {
opacity: 0% !important;
pointer-events: none;
}
.constant-key-icon-clickthrough.constant-key-icon-enabled {
opacity: 35% !important;
}
.loadlistheader {
padding-left: 68px;
padding-right: 20px;
display: flex;
color: #737373;
}
.loadlistitem {
padding: 5px 10px 5px 10px;
display: grid;
grid-template-columns: 80% 20%;
display: flex;
flex-grow: 1;
color: #ffffff;
-moz-transition: background-color 0.25s ease-in;
-o-transition: background-color 0.25s ease-in;
-webkit-transition: background-color 0.25s ease-in;
-o-transition: background-color 0.25s ease-in;
-webkit-transition: background-color 0.25s ease-in;
transition: background-color 0.25s ease-in;
}
.loadlistitem:hover {
@ -449,13 +591,37 @@ chunk {
background-color: #688f1f;
}
.loadlistpadding {
padding-right: 10px;
}
.loadlisticon {
color: #333
}
.loadlisticon.allowed {
color: #ddd
}
.loadlisticon.allowed:hover {
cursor: pointer;
}
.loadlisticon-delete.allowed:hover {
color: #ef2929
}
.loadlisticon-rename.allowed:hover {
color: #fce94f
}
.navbar .navbar-nav .nav-link:hover {
border-radius: 5px;
border-radius: 5px;
background-color: #98bcdb;
}
.navbar .navbar-nav .nav-link:focus {
border-radius: 5px;
border-radius: 5px;
background-color: #98bcdb;
}
@ -498,11 +664,15 @@ chunk {
}
.nowrap {
white-space: nowrap;
}
.popupcontainer {
position: absolute;
top: 0px;
left: 0px;
z-index: 1;
z-index: 3;
width: 100%;
height: 100%;
background-color: rgba(0,0,0,0.5);
@ -517,8 +687,9 @@ chunk {
color: #ffffff;
-moz-transition: background-color 0.25s ease-in;
-o-transition: background-color 0.25s ease-in;
-webkit-transition: background-color 0.25s ease-in;
-o-transition: background-color 0.25s ease-in;
-webkit-transition: background-color 0.25s ease-in;
transition: background-color 0.25s ease-in;
}
.popuplistitem:hover {
@ -543,6 +714,11 @@ chunk {
font-size: 12pt;
}
.popuperror {
color: #ef2929;
text-align: center;
}
.popupfooter {
width: 100%;
padding: 10px;
@ -557,6 +733,12 @@ chunk {
margin-right: 10px;
}
.saveasoverwrite {
color: #ff9900;
font-weight: bold;
text-align: center;
}
.seqselheader {
color: #737373;
}
@ -567,8 +749,9 @@ chunk {
padding: 5px;
color: #ffffff;
-moz-transition: all 0.15s ease-in;
-o-transition: all 0.15s ease-in;
-webkit-transition: all 0.15s ease-in;
-o-transition: all 0.15s ease-in;
-webkit-transition: all 0.15s ease-in;
transition: all 0.15s ease-in;
}
.seqselitem:hover {
@ -617,15 +800,20 @@ chunk {
width: 50px;
}
.width-auto {
width: auto;
}
.wilistitem {
height: 80px;
display: grid;
grid-template-columns: 4% 30% 66%;
grid-template-columns: 4% 30% 58% 8%;
margin-bottom: 10px;
}
.wientry {
padding-left: 10px;
padding-right: 10px;
background-color: #212122;
}
@ -642,7 +830,6 @@ chunk {
}
.wikey > input {
height: 100%;
background-color: #404040;
color: #ffffff;
}
@ -651,4 +838,8 @@ chunk {
width: 80%;
overflow: hidden;
font-size: 12pt;
}
}
.wiselective > button {
white-space: normal;
}

118
static/open-iconic-bootstrap.min.css vendored Normal file

File diff suppressed because one or more lines are too long

BIN
static/open-iconic.woff Normal file

Binary file not shown.

40
structures.py Normal file

@ -0,0 +1,40 @@
import collections
from typing import Iterable, Tuple


class KoboldStoryRegister(collections.OrderedDict):
    '''
    Complexity-optimized class for keeping track of story chunks
    '''

    def __init__(self, sequence: Iterable[Tuple[int, str]] = ()):
        super().__init__(sequence)
        self.__next_id: int = len(sequence)

    def append(self, v: str) -> None:
        self[self.__next_id] = v
        self.increment_id()

    def pop(self) -> str:
        return self.popitem()[1]

    def get_first_key(self) -> int:
        return next(iter(self))

    def get_last_key(self) -> int:
        return next(reversed(self))

    def __getitem__(self, k: int) -> str:
        return super().__getitem__(k)

    def __setitem__(self, k: int, v: str) -> None:
        return super().__setitem__(k, v)

    def increment_id(self) -> None:
        self.__next_id += 1

    def get_next_id(self) -> int:
        return self.__next_id

    def set_next_id(self, x: int) -> None:
        self.__next_id = x
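
A minimal usage sketch of the register added in structures.py, assuming the class behaves exactly as written in this diff. The class body is repeated in condensed form only so the snippet runs on its own, and the chunk strings are made up for illustration. It shows that keys come from a counter that is not rolled back by `pop()`, so a removed chunk's id is not reused.

```python
import collections
from typing import Iterable, Tuple


class KoboldStoryRegister(collections.OrderedDict):
    '''Condensed copy of the class from the diff, for illustration only.'''

    def __init__(self, sequence: Iterable[Tuple[int, str]] = ()):
        super().__init__(sequence)
        # len() means the argument must be a sized sequence, not a generator.
        self.__next_id: int = len(sequence)

    def append(self, v: str) -> None:
        self[self.__next_id] = v
        self.__next_id += 1

    def pop(self) -> str:
        # Removes and returns the most recently inserted chunk.
        return self.popitem()[1]

    def get_first_key(self) -> int:
        return next(iter(self))

    def get_last_key(self) -> int:
        return next(reversed(self))

    def get_next_id(self) -> int:
        return self.__next_id


reg = KoboldStoryRegister()
reg.append("You enter the cave.")   # stored under key 0
reg.append("A bat flies past.")     # stored under key 1

last_chunk = reg[reg.get_last_key()]  # replaces the old list-style actions[-1]
removed = reg.pop()                   # drops key 1; the counter stays at 2
reg.append("You light a torch.")      # stored under key 2, not 1

print(list(reg.items()))
# [(0, 'You enter the cave.'), (2, 'You light a torch.')]
```

Because ids are never reused, a key that some other component still holds for a deleted chunk cannot silently point at a newer chunk.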


@ -13,6 +13,7 @@
<link rel="stylesheet" href="static/bootstrap.min.css">
<link rel="stylesheet" href="static/bootstrap-toggle.min.css">
<link rel="stylesheet" href="static/custom.css?ver=0.15.0g">
<link rel="stylesheet" href="static/open-iconic-bootstrap.min.css">
</head>
<body>
<div class="container">
@ -27,8 +28,12 @@
</button>
<div class="collapse navbar-collapse" id="navbarNavDropdown">
<ul class="nav navbar-nav">
<li class="nav-item">
<a class="nav-link" href="#" id="btn_newgame">New Story</a>
<li class="nav-item dropdown">
<a class="nav-link dropdown-toggle" href="#" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">New Game</a>
<div class="dropdown-menu">
<a class="dropdown-item" href="#" id="btn_newgame">Blank Story</a>
<a class="dropdown-item" href="#" id="btn_rndgame">Random Story</a>
</div>
</li>
<li class="nav-item dropdown">
<a class="nav-link dropdown-toggle" href="#" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Save</a>
@ -36,6 +41,8 @@
<a class="dropdown-item" href="#" id="btn_save">Save</a>
<a class="dropdown-item" href="#" id="btn_saveas">Save As</a>
<a class="dropdown-item" href="#" id="btn_savetofile">Save To File...</a>
<a class="dropdown-item" href="#" id="btn_download">Download Story as JSON</a>
<a class="dropdown-item" href="#" id="btn_downloadtxt">Download Story as Plaintext</a>
</div>
</li>
<li class="nav-item dropdown">
@ -72,31 +79,37 @@
</div>
<div class="row" id="formatmenu">
</div>
<div class="row" id="gamescreen">
<span id="gametext">...</span>
<div class="hidden" id="wimenu">
<div class="layer-container">
<div class="layer-bottom row" id="gamescreen">
<span id="gametext"><p>...</p></span>
<div class="hidden" id="wimenu">
</div>
</div>
<div id="curtain" class="layer-top row hidden"></div>
</div>
<div class="row" id="seqselmenu">
<div class="seqselheader">Select sequence to keep:</div>
<div id="seqselcontents">
</div>
</div>
<div class="row" id="actionmenu">
<div class="row flex" id="actionmenu">
<div id="actionmenuitems">
<div>
<button type="button" class="btn btn-primary" id="btn_actedit">Edit</button>
<button type="button" class="btn btn-primary" id="btn_actmem">Memory</button>
<button type="button" class="btn btn-primary" id="btn_actwi">W Info</button>
<button type="button" class="btn btn-primary" id="btn_actundo">Back</button>
<button type="button" class="btn btn-primary" id="btn_actretry">Retry</button>
<button type="button" class="btn btn-primary hidden" id="btn_delete">Delete</button>
<span id="messagefield"></span>
</div>
<button type="button" class="btn btn-primary" id="btn_actmem">Memory</button>
<button type="button" class="btn btn-primary" id="btn_actwi">W Info</button>
<button type="button" class="btn btn-primary" id="btn_actundo">Back</button>
<button type="button" class="btn btn-primary" id="btn_actretry">Retry</button>
</div>
<div id="messagefield"></div>
<div class="box flex-push-right">
<input type="checkbox" data-toggle="toggle" data-onstyle="success" id="allowediting" disabled>
<div class="box-label">Allow Editing</div>
</div>
</div>
<div class="row">
<div id="inputrow">
<div id="inputrowmode">
<button type="button" class="btn btn-secondary hidden" id="btnmode">Mode:<br/><b id="btnmode_label">Story</b></button>
</div>
<div id="inputrowleft">
<textarea class="form-control" id="input_text" placeholder="Enter text here"></textarea>
</div>
@ -182,7 +195,10 @@
<div class="aidgpopupcontent">
<input class="form-control" type="text" placeholder="Save Name" id="savename">
</div>
<div class="hidden" id="saveasoverwrite">
<div class="popuperror hidden">
<span></span>
</div>
<div class="saveasoverwrite hidden">
<span>File already exists. Really overwrite?</span>
</div>
<div class="popupfooter">
@ -198,7 +214,7 @@
</div>
<div class="loadlistheader">
<div>Save Name</div>
<div># Actions</div>
<div class="flex-push-right"># Actions</div>
</div>
<div id="loadlistcontent">
</div>
@ -208,6 +224,44 @@
</div>
</div>
</div>
<div class="popupcontainer hidden" id="loadcontainerdelete">
<div id="loadpopupdelete">
<div class="popuptitlebar">
<div class="popuptitletext">Really Delete Story?</div>
</div>
<div class="dialogheader">
"<span id="loadcontainerdelete-storyname"></span>" will be PERMANENTLY deleted! You will not be able to recover this story later.
</div>
<div class="popuperror hidden">
<span></span>
</div>
<div class="popupfooter">
<button type="button" class="btn btn-danger" id="btn_dsaccept">Delete</button>
<button type="button" class="btn btn-primary" id="btn_dsclose">Cancel</button>
</div>
</div>
</div>
<div class="popupcontainer hidden" id="loadcontainerrename">
<div id="loadpopuprename">
<div class="popuptitlebar">
<div class="popuptitletext">Enter New Name For Story</div>
</div>
<div class="dialogheader">
What should the story "<span id="loadcontainerrename-storyname"></span>" be renamed to?
<input class="form-control" type="text" placeholder="New Save Name" id="newsavename">
</div>
<div class="popuperror hidden">
<span></span>
</div>
<div class="saveasoverwrite hidden">
<span>File already exists. Really overwrite?</span>
</div>
<div class="popupfooter">
<button type="button" class="btn btn-primary" id="btn_rensaccept">Accept</button>
<button type="button" class="btn btn-primary" id="btn_rensclose">Cancel</button>
</div>
</div>
</div>
<div class="popupcontainer hidden" id="newgamecontainer">
<div id="nspopup">
<div class="popuptitlebar">
@ -222,5 +276,28 @@
</div>
</div>
</div>
<div class="popupcontainer hidden" id="rndgamecontainer">
<div id="rspopup">
<div class="popuptitlebar">
<div class="popuptitletext">Really Start A Random Story?</div>
</div>
<div class="aidgpopuplistheader">
<br>
Story quality and topic depends on the model and your settings/suggestion (Around 0.5 temp is recommended).<br>
This feature works best with finetuned models like GPT-Neo-AID or GPT-Neo-Horni but is limited to what the AI knows.<br>
If you get random spam then your model is not capable of using this feature and if you get unrelated stories it does not understand the topic.<br>
Generated results are unfiltered and can be offensive or unsuitable for children.<br><br>
Unsaved data will be lost.<br><br>
Below you can input a genre suggestion for the AI to loosely base the story on (For example Horror or Cowboy).<br>
</div>
<div class="aidgpopupcontent">
<input class="form-control" type="text" placeholder="Story Genre Suggestion (Leave blank for fully random)" id="topic">
</div>
<div class="popupfooter">
<button type="button" class="btn btn-primary" id="btn_rsaccept">Accept</button>
<button type="button" class="btn btn-primary" id="btn_rsclose">Cancel</button>
</div>
</div>
</div>
</body>
</html>
</html>


@ -59,8 +59,11 @@ def replaceblanklines(txt):
#==================================================================#
#
#==================================================================#
def removespecialchars(txt):
    txt = re.sub(r"[#/@%<>{}+=~|\^]", "", txt)
def removespecialchars(txt, vars=None):
    if vars is None or vars.actionmode == 0:
        txt = re.sub(r"[#/@%<>{}+=~|\^]", "", txt)
    else:
        txt = re.sub(r"[#/@%{}+=~|\^]", "", txt)
    return txt
#==================================================================#
@ -69,8 +72,8 @@ def removespecialchars(txt):
def addsentencespacing(txt, vars):
    # Get last character of last action
    if(len(vars.actions) > 0):
        if(len(vars.actions[-1]) > 0):
            lastchar = vars.actions[-1][-1]
        if(len(vars.actions[vars.actions.get_last_key()]) > 0):
            lastchar = vars.actions[vars.actions.get_last_key()][-1]
        else:
            # Last action is blank, this should never happen, but
            # since it did let's bail out.
@ -85,8 +88,8 @@ def addsentencespacing(txt, vars):
# Cleans string for use in file name
#==================================================================#
def cleanfilename(filename):
    keepcharacters = (' ','.','_')
    filename = "".join(c for c in filename if c.isalnum() or c in keepcharacters).rstrip()
    filteredcharacters = ('/','\\')
    filename = "".join(c for c in filename if c not in filteredcharacters).rstrip()
    return filename
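
A short sketch of how the new versions of the two sanitizing helpers in this file behave, assuming the added lines of the hunks are the final code. The `types.SimpleNamespace` object is a hypothetical stand-in for the application's `vars` object and carries only the `actionmode` attribute that the helper reads; the sample strings are made up.

```python
import re
from types import SimpleNamespace


def removespecialchars(txt, vars=None):
    # Mode 0 (or no vars object at all) strips the full set, including < and >.
    if vars is None or vars.actionmode == 0:
        txt = re.sub(r"[#/@%<>{}+=~|\^]", "", txt)
    else:
        # Any other mode keeps < and > in the text.
        txt = re.sub(r"[#/@%{}+=~|\^]", "", txt)
    return txt


def cleanfilename(filename):
    # Only path separators are dropped; other punctuation is now kept.
    filteredcharacters = ('/', '\\')
    filename = "".join(c for c in filename if c not in filteredcharacters).rstrip()
    return filename


story_mode = SimpleNamespace(actionmode=0)   # hypothetical stand-in for vars
other_mode = SimpleNamespace(actionmode=1)   # hypothetical stand-in for vars

stripped = removespecialchars("a<b>#c", story_mode)   # 'abc'
kept = removespecialchars("a<b>#c", other_mode)       # 'a<b>c'
no_vars = removespecialchars("a<b>#c")                # 'abc'

name_a = cleanfilename("my/story\\v2 ")    # 'mystoryv2'
name_b = cleanfilename("Story: Part 1?")   # 'Story: Part 1?'
```

The old `cleanfilename` kept only alphanumerics, spaces, dots and underscores, so `"Story: Part 1?"` would have lost its colon and question mark. The new one passes such characters through, and whether the underlying filesystem accepts them is outside what this diff shows.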