50 Commits

Author SHA1 Message Date
Cohee
cebd6e9e0f Add API token ids from KoboldCpp 2023-12-14 01:28:18 +02:00
Cohee
9acef0fae6 Horde doesn't support API tokenizers 2023-12-10 16:21:06 +02:00
Cohee
f54bf99006 Fix token ids not displaying in "API_CURRENT" mode for TextGen 2023-12-10 16:09:00 +02:00
Cohee
6957d9e7cf Fix display names of Best match tokenizers 2023-12-10 16:03:25 +02:00
Cohee
6e5eea5dba Unbreak previously selected API tokenizer in dropdown 2023-12-10 15:56:38 +02:00
valadaptive
55976e61a3 Fix tokenizer override
I searched for all users of tokenizers.API, but missed that the menu
converts the numerical select values directly to enum values. I've used
the special tokenizer value 98 to represent "the tokenizer API for
whichever backend we're currently using".
2023-12-09 23:57:21 -05:00
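The fix above can be pictured with a small sketch. Only the value 98 and its meaning come from the commit message; the other enum members, their numbers, the `main_api` strings, and the function name `resolveTokenizer` are assumptions made for illustration and may not match the project's code.

```typescript
// Hypothetical tokenizer enum. Only API_CURRENT = 98 is taken from the commit message.
const tokenizers = {
    NONE: 0,          // assumed value
    API_CURRENT: 98,  // "the tokenizer API for whichever backend we're currently using"
    API_TEXTGEN: 100, // assumed value
    API_KOBOLD: 101,  // assumed value
} as const;

// Assumed set of backend identifiers.
type MainApi = 'kobold' | 'textgenerationwebui' | 'other';

// Turn the raw numeric value of a <select> option into a concrete tokenizer.
// Every value passes through unchanged except the special value 98,
// which is resolved against the backend currently in use.
function resolveTokenizer(selected: number, mainApi: MainApi): number {
    if (selected !== tokenizers.API_CURRENT) {
        return selected;
    }
    switch (mainApi) {
        case 'kobold':
            return tokenizers.API_KOBOLD;
        case 'textgenerationwebui':
            return tokenizers.API_TEXTGEN;
        default:
            return tokenizers.NONE;
    }
}
```

Keeping the special value in the enum means the menu can keep converting select values directly, and the backend lookup happens in one function.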
valadaptive
014416546c Add padding once in getTokenCount
This means we don't have to pass the "padding" parameter into every
function so they can add the padding themselves--we can do it in just
one place instead.
2023-12-09 20:53:16 -05:00
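A minimal sketch of the idea in the commit above, assuming a simplified signature for `getTokenCount`. The `Counter` type and the word-splitting `countWords` stand-in are hypothetical; the real function dispatches to actual tokenizers.

```typescript
// A counter knows nothing about padding; it only counts tokens in the text.
type Counter = (text: string) => number;

// Hypothetical stand-in for a real tokenizer: counts whitespace-separated words.
const countWords: Counter = (text) => text.split(/\s+/).filter(Boolean).length;

// Padding is added exactly once here, so no counter needs a "padding" parameter.
function getTokenCount(text: string, padding: number, counter: Counter = countWords): number {
    return counter(text) + padding;
}
```

With this shape, adding a new counter does not require touching any padding logic.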
valadaptive
2f2cd197cc Clean up tokenizer API code
Store the URLs for each tokenizer's action in one place at the top of
the file, instead of in a bunch of switch-cases. The URLs for the
textgen and Kobold APIs don't change and hence don't need to be
function arguments.
2023-12-09 20:48:41 -05:00
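The cleanup described above amounts to replacing switch-cases with a lookup table. The sketch below assumes hypothetical tokenizer names and URL paths; none of them are taken from the repository.

```typescript
type TokenizerAction = 'count' | 'encode' | 'decode';

// One table near the top of the file. A tokenizer lists only the actions it supports.
// All names and paths below are illustrative assumptions.
const TOKENIZER_URLS: Record<string, Partial<Record<TokenizerAction, string>>> = {
    llama: {
        count: '/tokenizers/llama/count',
        encode: '/tokenizers/llama/encode',
        decode: '/tokenizers/llama/decode',
    },
    remote_kobold: {
        // In this sketch the remote tokenizer can only count tokens.
        count: '/tokenizers/remote/kobold/count',
    },
};

// Returns undefined when the tokenizer is unknown or lacks the requested action.
function getTokenizerUrl(name: string, action: TokenizerAction): string | undefined {
    return TOKENIZER_URLS[name]?.[action];
}
```

A caller can then treat `undefined` as "fall back to another tokenizer" rather than scattering that decision across several switch statements.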
valadaptive
09465fbb97 Inline most get(...)TokenizerParams calls
For everything except textgenerationwebui, these params are now simple
enough that it doesn't make sense for them to be in a separate function.
2023-12-09 20:35:11 -05:00
valadaptive
30502ac949 Split up Kobold and textgenerationwebui endpoints
The endpoint was one big if/else statement that did two entirely
different things depending on the value of main_api. It makes more sense
for those to be two separate endpoints.
2023-12-09 20:26:24 -05:00
valadaptive
7486ab3886 Separate textgen and Kobold tokenization APIs
They function differently and have different logic and API parameters,
so it makes sense to count them as two different APIs. Kobold's API
doesn't return tokens, so it can only be used to count them.

There's still a lot of duplicate code which I will clean up in the
following commits.
2023-12-09 20:24:56 -05:00
valadaptive
18177c147d Separate remote and server tokenization code paths
This lets us remove extraneous API params from paths where they aren't
needed.
2023-12-09 20:08:48 -05:00
valadaptive
ddd73a204a Remove "remote" language from tokenizer functions
We'll be making a distinction between tokenizing *on* the server itself,
and tokenizing via the server having the AI service do it. It makes more
sense to use the term "remote" for the latter.
2023-12-09 19:49:22 -05:00
valadaptive
8bad059a62 Rename /tokenize_via_api endpoint
No redirect for this since I don't expect any extensions to be calling this directly.
2023-12-09 19:29:24 -05:00
valadaptive
57bc95133e Rename tokenizer routes
They're all under tokenizers/ now, and there are "count", "encode", and
"decode" endpoints. This forms a clearer hierarchy.
2023-12-04 10:17:43 -05:00
valadaptive
9c33ddbafc Make textgen settings type checks more concise 2023-12-03 14:56:01 -05:00
valadaptive
047c897ead Remove is[API] functions
Just use an equality comparison. It's a bit longer, but only because
"textgenerationwebui_settings" is a long identifier.
2023-12-03 14:56:01 -05:00
valadaptive
ba54e3dea0 Replaces is_[api] params with api_type param
These were 5 mutually-exclusive booleans, which can be replaced with one
param that takes on 5 values, one for each API type.
2023-12-03 14:56:01 -05:00
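To illustrate the change above: five mutually exclusive booleans allow 32 combinations, of which only five are valid, whereas a single five-valued parameter cannot express an invalid state. The flag names and the `ApiType` values below are hypothetical placeholders, not the project's identifiers.

```typescript
// After: one parameter with five possible values (names are assumptions).
type ApiType = 'ooba' | 'mancer' | 'tabby' | 'koboldcpp' | 'placeholder';

// Before: five booleans, exactly one of which was expected to be true.
interface LegacyFlags {
    is_ooba: boolean;
    is_mancer: boolean;
    is_tabby: boolean;
    is_koboldcpp: boolean;
    is_placeholder: boolean;
}

// Convert the legacy flags to the single value, rejecting invalid combinations.
function toApiType(flags: LegacyFlags): ApiType {
    const candidates: [boolean, ApiType][] = [
        [flags.is_ooba, 'ooba'],
        [flags.is_mancer, 'mancer'],
        [flags.is_tabby, 'tabby'],
        [flags.is_koboldcpp, 'koboldcpp'],
        [flags.is_placeholder, 'placeholder'],
    ];
    const active = candidates.filter(([on]) => on).map(([, type]) => type);
    const [only, ...rest] = active;
    if (only === undefined || rest.length > 0) {
        throw new Error(`Expected exactly one API flag, got ${active.length}`);
    }
    return only;
}
```

The validation in `toApiType` is exactly the check that becomes unnecessary once callers pass the single value directly.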
Cohee
64a3564892 lint: Comma dangle 2023-12-02 22:06:57 +02:00
Cohee
c63cd87cc0 lint: Require semicolons 2023-12-02 21:11:06 +02:00
valadaptive
a37f874e38 Require single quotes 2023-12-02 13:04:51 -05:00
Cohee
a367285ac2 Merge pull request #1430 from valadaptive/eslint-fixes-2
ESLint fixes, part 2 - bulky changes
2023-12-02 19:43:11 +02:00
Cohee
0477f6a553 Use best match API tokenizers for Text Completion sources 2023-12-02 19:42:15 +02:00
valadaptive
27e63a7a77 Enable no-case-declarations lint 2023-12-02 10:32:26 -05:00
Cohee
e6c96553d0 Add text trimming commands 2023-11-26 13:55:22 +02:00
Cohee
1ebfddf07e Use mistral and yi tokenizers for custom token bans 2023-11-21 01:04:27 +02:00
Cohee
9b75e49b54 Add support for Yi tokenizer 2023-11-21 00:21:58 +02:00
Cohee
96caddfd71 Add koboldcpp as Text Completion source 2023-11-19 17:14:53 +02:00
kingbri
4cfa267b1b API Tokenizer: Add support for TabbyAPI
Use Tabby's /v1/token endpoints.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-17 01:48:03 -05:00
Cohee
81fe9aa699 Fix updated tokenization via ooba API 2023-11-09 19:39:08 +02:00
Cohee
480099ee97 Mancer will work in legacy API mode. Remove Soft Prompt mentions. 2023-11-08 18:16:47 +02:00
Cohee
e76c18c104 Legacy ooba API compatibility shim 2023-11-08 10:13:28 +02:00
Cohee
865256f5c0 Fix ooba tokenization via API. Fix requiring streaming URL to generate 2023-11-08 03:38:04 +02:00
Cohee
57e845d0d7 Resolve best match tokenizer for itemization. Adjust styles of token counter 2023-11-06 20:25:59 +02:00
Cohee
e8ba328a14 Add text chunks display to token counter 2023-11-06 02:42:51 +02:00
Cohee
f248367ca3 Add Mistral tokenizer 2023-11-06 01:26:13 +02:00
Cohee
f0c0949aa0 Add token ids viewer to tokenizer plugin 2023-11-05 22:45:37 +02:00
Cohee
fedc3b887f Add llama2 tokenizer for OpenRouter models 2023-11-05 21:54:19 +02:00
Cohee
c2ba3a773a Delayed tokenizers initialization 2023-10-25 00:32:49 +03:00
Cohee
b167eb9e22 Add raw token ids support to OAI logit bias. Fix token counting for turbo models 2023-10-19 13:37:08 +03:00
Cohee
bfdd071001 Move tokenizer endpoint and functions to separate file 2023-09-16 18:48:06 +03:00
Cohee
853736fa93 Remove legacy NovelAI models 2023-09-06 14:32:06 +03:00
Cohee
267d0eb16f Fix API tokenizers usage with kcpp 2023-09-01 02:57:35 +03:00
Cohee
3b4e6f0b78 Add debug functions menu 2023-08-27 23:20:43 +03:00
Cohee
0844374de5 Remove old GPT-2 tokenizer. Redirect to tiktoken's tokenizer 2023-08-27 22:14:39 +03:00
Cohee
9660aaa2c2 Add NovelAI hypebot plugin 2023-08-27 18:27:34 +03:00
Cohee
c91ab3b5e0 Add Kobold tokenization to best match logic. Fix not being able to stop group chat regeneration 2023-08-24 21:23:35 +03:00
Cohee
ab52af4fb5 Add support for Koboldcpp tokenization endpoint 2023-08-24 20:19:57 +03:00
Cohee
e77da62b85 Add padding to cache key. Fix Safari display issues. Fix 400 on empty translate. Reset bias cache on changing model. 2023-08-23 10:32:48 +03:00
Cohee
bc5fc67906 Put tokenizer functions to a separate file. Cache local models token counts 2023-08-23 02:38:43 +03:00