Run time and cost

Predictions run on Nvidia A40 (Large) GPU hardware and typically complete within 5 seconds.

CodeLlama is a family of fine-tuned Llama 2 models for coding. This is CodeLlama-13b, a 13-billion-parameter Llama model tuned for code completion.