Yahooo, decoupling from koboldai_vars. This makes the generation test
pass in `test_generation.py`, and makes full determinism outside of
core_generate work.
We now do some basic tests for:
- hf torch loading (normal, lazy, lowmem)
- hf torch generation (shape batches, shape tokencount, faulty
determinism)
Currently full determinism is failing; yahoo, the tests work!
All of the tests initally failed (note the test environment functions
different from the aiserver environment due to aiserver doing a lot of
initalizing stuff, working on phasing that out) but now only one fails.
Very useful for finding bugs!