The goal of this series of commits is to have an implementation-agnostic
interface for models, thus being less reliant on HF Transformers for model
support. A model object will have a method for generation, a list of callbacks
to run on every generated token, a list of samplers that modify
probabilities, etc. Basically anything HF can do should be easily
implementable with the new interface :^)
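
A rough sketch of what such an interface could look like. Every name here
(`InferenceModel`, `next_token_probs`, `ban_token`, `ToyModel`) is made up for
illustration and is not taken from the actual commits; token selection is
plain greedy argmax to keep the example small.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

# Hypothetical type aliases, not part of the real codebase.
Probs = Dict[str, float]                 # token -> probability
Sampler = Callable[[Probs], Probs]       # rewrites the distribution
TokenCallback = Callable[[str], None]    # runs once per generated token


class InferenceModel(ABC):
    """Backend-agnostic model: subclasses only provide next-token probabilities."""

    def __init__(self) -> None:
        self.samplers: List[Sampler] = []
        self.token_callbacks: List[TokenCallback] = []

    @abstractmethod
    def next_token_probs(self, context: List[str]) -> Probs:
        """Return a probability for each candidate next token."""

    def generate(self, prompt: List[str], max_new: int) -> List[str]:
        out = list(prompt)
        for _ in range(max_new):
            probs = self.next_token_probs(out)
            for sampler in self.samplers:      # samplers apply in list order
                probs = sampler(probs)
            token = max(probs, key=probs.get)  # greedy pick (assumption)
            out.append(token)
            for callback in self.token_callbacks:
                callback(token)
        return out


def ban_token(banned: str) -> Sampler:
    """Example sampler: drop one token and renormalise the rest."""
    def sampler(probs: Probs) -> Probs:
        kept = {t: p for t, p in probs.items() if t != banned}
        total = sum(kept.values())
        return {t: p / total for t, p in kept.items()}
    return sampler


class ToyModel(InferenceModel):
    """Stand-in backend with a fixed distribution; a real one would wrap HF or similar."""

    def next_token_probs(self, context: List[str]) -> Probs:
        return {"a": 0.5, "b": 0.3, "c": 0.2}
```

A backend then only has to implement `next_token_probs`; samplers and
callbacks are attached by appending to the two lists.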
So far I've tested loading pre-downloaded models with breakmodel split
between GPUs, and that works, but essentially nothing else has been
tested. For now this is about the only supported configuration, and
generation isn't very functional.
Feature update for AMD GPUs:
Bumps the installed PyTorch version from 1.12.1+rocm5.1.1 to the newer stable version 1.13.1+rocm5.2 so that AMD GPUs can use VRAM splitting across multiple GPUs.
The `saveas` method was modified to take a data dict, but one of the
else blocks still referred to the previous `name` parameter. Assign
to `name` in that block to fix the `NameError: name 'name' is not defined` exception.
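
The general shape of this kind of bug, as a minimal sketch. The real `saveas`
body is not shown here, so the `overwrite` key, the `"name"` key, and the
return strings are all invented for illustration.

```python
def saveas_broken(data: dict) -> str:
    # `name` used to be a parameter; after the switch to a data dict
    # nothing in this function binds it any more.
    if data.get("overwrite"):
        return "overwrote " + data["name"]
    else:
        return "saved " + name  # leftover reference -> NameError


def saveas_fixed(data: dict) -> str:
    if data.get("overwrite"):
        return "overwrote " + data["name"]
    else:
        name = data["name"]  # assign before use (hypothetical key)
        return "saved " + name
```

The broken variant only fails when the else branch is actually taken, which is
why this sort of leftover reference can go unnoticed for a while.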
On distros with SELinux enabled, containers that access KoboldAI's main directory as content, as planned here, will likely be denied access (at least with podman).
Option 1 would be to mark it with the right label, such as :z, but that has other implications for the content directory.
The other fix, if uglier, is to run the container without label enforcement, which allows file access as the same user with no further side effects on the project's file labelling.
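
To make the two options concrete, here is a small sketch that builds the
corresponding `podman run` argument lists. The helper name, the image name,
and the `/content` mount point are placeholders, not what the project actually
uses; only the `:z` volume suffix and `--security-opt label=disable` are real
podman options.

```python
from typing import List


def podman_run_args(host_dir: str, image: str, relabel: bool) -> List[str]:
    """Build a `podman run` argv for one of the two SELinux workarounds."""
    mount = f"{host_dir}:/content"  # container path is a placeholder
    if relabel:
        # Option 1: :z relabels the host directory as shared container content.
        return ["podman", "run", "-v", mount + ":z", image]
    # Option 2: keep the host labels and disable label separation for this container.
    return ["podman", "run", "--security-opt", "label=disable", "-v", mount, image]
```

The resulting list can be passed to `subprocess.run` as is; option 2 leaves
the labels on the project directory untouched.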