Multigen
TavernAI tries to create longer responses by chaining generation across several smaller batches.
Algorithm:
1. If the requested amount of generation is more than 50 tokens, generate the first 50 tokens.
2. Generate further batches of 30 tokens each until one of the stopping conditions is reached.
3. Append the generated batch to the next cycle's prompt.
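
The batch sizes described in the steps above can be sketched in Python as follows. This is only an illustration, not TavernAI's actual code: the names batch_sizes, FIRST_BATCH and NEXT_BATCH are hypothetical, and the sketch assumes that a request of 50 tokens or fewer is made as a single batch and that the last batch is trimmed to the remaining amount. Neither detail is stated above. It also ignores the stopping conditions, which may end the chain earlier.

```python
from typing import Iterator

FIRST_BATCH = 50  # size of the first batch (step 1)
NEXT_BATCH = 30   # size of each later batch (step 2)


def batch_sizes(total_tokens: int) -> Iterator[int]:
    """Yield the number of tokens to request on each cycle.

    Assumption: requests of FIRST_BATCH tokens or fewer use one batch.
    Assumption: the final batch is trimmed so the sum equals total_tokens.
    """
    if total_tokens <= FIRST_BATCH:
        if total_tokens > 0:
            yield total_tokens
        return
    yield FIRST_BATCH
    remaining = total_tokens - FIRST_BATCH
    while remaining > 0:
        size = min(NEXT_BATCH, remaining)
        yield size
        remaining -= size


# A 120-token request is split into four batches.
print(list(batch_sizes(120)))  # [50, 30, 30, 10]
```
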
Stopping conditions:
1. Enough text has been generated.
2. The character starts speaking for You.
3. The <|endoftext|> token is reached.
4. No text is generated.
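
As a rough sketch, the four stopping conditions could be checked after each batch like this. The function stop_reason and all of its parameters are hypothetical names chosen for illustration. In particular, detecting "speaking for You" by looking for a new line that starts with the user's name followed by a colon, and measuring "enough text" as a token count supplied by the caller, are assumptions rather than TavernAI's documented behaviour.

```python
from typing import Optional

END_OF_TEXT = "<|endoftext|>"


def stop_reason(
    batch: str,
    reply: str,
    tokens_so_far: int,
    target_tokens: int,
    user_name: str = "You",
) -> Optional[str]:
    """Return the name of the first stopping condition met, or None.

    batch is the newest generated text; reply is everything generated
    so far, including batch. Token counts are supplied by the caller.
    """
    if not batch.strip():
        return "no_text"            # condition 4
    if END_OF_TEXT in batch:
        return "end_of_text"        # condition 3
    # Assumption: the model speaks for the user by opening a new line
    # with the user's name and a colon.
    if f"\n{user_name}:" in reply:
        return "speaking_for_user"  # condition 2
    if tokens_so_far >= target_tokens:
        return "enough_text"        # condition 1
    return None


# The chain continues while stop_reason(...) returns None.
print(stop_reason("Hello there.", "Hello there.", 50, 200))  # None
print(stop_reason("\nYou: Hi", "Hello.\nYou: Hi", 80, 200))  # speaking_for_user
```
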