This is an attempt at creating a quantized Cog for WizardCoder-15B to run on Replicate. Note: it uses Hugging Face's load_in_4bit=True.
lucataco/wizardcoder-15b-v1
WizardLM/WizardCoder-15B-V1.0 in 4-bit
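
For readers unfamiliar with the load_in_4bit option, here is a minimal sketch of how a model like this might be loaded with the Hugging Face transformers library. It is not the code from this repository: the helper names build_load_kwargs and load_model are hypothetical, and device_map="auto" is an assumption. It also assumes transformers, accelerate, and bitsandbytes are installed and a CUDA GPU is available. Recent transformers releases prefer passing quantization_config=BitsAndBytesConfig(load_in_4bit=True) over the bare flag.

```python
MODEL_ID = "WizardLM/WizardCoder-15B-V1.0"  # model named on this page


def build_load_kwargs(load_in_4bit: bool = True) -> dict:
    """Build keyword arguments for AutoModelForCausalLM.from_pretrained.

    Hypothetical helper, used here only to keep the settings in one place.
    """
    return {
        "load_in_4bit": load_in_4bit,  # quantize weights to 4-bit via bitsandbytes
        "device_map": "auto",          # assumption: let accelerate place the layers
    }


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and the 4-bit quantized model (hypothetical helper)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, **build_load_kwargs())
    return tokenizer, model
```

Calling load_model() downloads the full checkpoint before quantizing it, so the first run needs enough disk space for the unquantized weights.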