Commit Graph

25 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Nick Perez | d8877b642d | [gptq_hf_torch] Fix typo in model type check (`model_tseype` -> `model_type`) | 2023-09-23 00:53:48 -04:00 |
| Henk | 5917737676 | Don't disable exllama | 2023-08-21 13:17:30 +02:00 |
| Henk | 3dd0e91fbb | Preliminary HF GPTQ changes | 2023-08-21 01:58:52 +02:00 |
| Henk | 6f557befa9 | GPTQ --revision support | 2023-08-19 15:17:29 +02:00 |
| Henk | d93631c889 | GPTQ improvements | 2023-08-19 14:45:45 +02:00 |
| Henk | 13b68c67d1 | Basic GPTQ Downloader | 2023-08-19 13:02:50 +02:00 |
| Henk | e90903946d | AutoGPTQ updates | 2023-08-13 17:36:17 +02:00 |
| Henk | 2628726e1c | Dont use exllama on fail | 2023-08-10 19:34:08 +02:00 |
| Henk | 9c7ebe3b04 | Better AutoGPTQ fallback | 2023-08-10 18:10:48 +02:00 |
| Henk | f2d7ef3aca | AutoGPTQ breakmodel | 2023-08-10 17:41:31 +02:00 |
| Henk | 54addfc234 | AutoGPTQ fallback | 2023-08-10 17:18:53 +02:00 |
| somebody | c80de5120c | Cleanup | 2023-07-24 19:45:33 -05:00 |
| somebody | ad4528b5a6 | critical change | 2023-07-24 17:17:57 -05:00 |
| somebody | a73420c49c | really really really sketchy breakmodel implementation (im gonna go lie down for an extended period of time) | 2023-07-24 17:15:59 -05:00 |
| somebody | 929917efe9 | Remove shrieking | 2023-07-24 13:09:43 -05:00 |
| somebody | 4a6cccb002 | Import fix | 2023-07-24 13:09:15 -05:00 |
| somebody | a6aafb2525 | GPTQ: Patch QuantLinear to not use CPU RAM | 2023-07-24 13:07:30 -05:00 |
| somebody | 1df03d9a27 | Basic | 2023-07-23 20:54:04 -05:00 |
| 0cc4m | 973aea12ea | Only import big python modules for GPTQ once they get used | 2023-07-23 22:07:34 +02:00 |
| 0cc4m | 09bb1021dd | Fallback to transformers if hf_bleeding_edge not available | 2023-07-23 07:16:52 +02:00 |
| 0cc4m | 748e5ef318 | Add sliders for exllama context size and related methods | 2023-07-23 07:11:28 +02:00 |
| 0cc4m | 9aa6c5fbbf | Merge upstream changes, fix conflict, adapt backends to changes | 2023-07-19 06:56:09 +02:00 |
| 0cc4m | 0001ae00ab | Add v2 with bias support (e.g. for Tulu-30b) | 2023-06-12 07:18:22 +02:00 |
| 0cc4m | 12df8220fb | Add gpt_bigcode support, fix 8-bit GPTQ incoherence | 2023-06-12 07:14:36 +02:00 |
| 0cc4m | c82625490a | Rename gptq backend folder | 2023-06-04 12:31:24 +02:00 |
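
Two of the messages above name standard Python loading patterns that are easy to sketch. Commit 09bb1021dd describes falling back to stock transformers when the patched hf_bleeding_edge package is not installed; a minimal sketch of that pattern follows, where the choice of `AutoModelForCausalLM` as the imported symbol is an assumption, not taken from the repository:

```python
# Sketch of the fallback described in commit 09bb1021dd (symbol name assumed):
# prefer the patched hf_bleeding_edge module, fall back to stock transformers
# if it is not installed.
try:
    from hf_bleeding_edge import AutoModelForCausalLM
except ImportError:
    from transformers import AutoModelForCausalLM
```

Commit 973aea12ea describes deferring big GPTQ imports until they are actually used. A sketch of that pattern, again illustrative rather than the repository's code (the `auto_gptq` call and function name are assumptions):

```python
# Sketch of the lazy-import pattern from commit 973aea12ea: the heavy GPTQ
# dependency is imported on first use instead of at application startup.
def load_quantized_model(model_dir: str):
    from auto_gptq import AutoGPTQForCausalLM  # deferred: expensive import
    return AutoGPTQForCausalLM.from_quantized(model_dir)
```

Deferring the import keeps startup fast for users who never load a GPTQ model, at the cost of paying the import latency on the first quantized load.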