Readme
Model used: Airoboros L2 70B 2.1 - GPTQ
LoRA used: Airoboros 2.1 Jannie 70B QLoRA
Inference using: ExLlama
Cog GitHub project: cog-exllama
Runs inference on Airoboros L2 70B 2.1 - GPTQ using ExLlama.
This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 33 seconds, though the prediction time varies significantly with the inputs.