andreasjansson / llama-2-13b-gguf

Llama-2 13B with support for grammars and jsonschema
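
As a rough illustration of what constraining output with a JSON schema means, the sketch below defines a small schema and checks whether a piece of generated text conforms to it. This page does not document the model's input fields, so the schema, the `conforms` helper, and the sample strings are all hypothetical. The check covers only flat objects (required keys and basic types) and is not a full JSON Schema validator.

```python
import json

# Hypothetical schema for illustration; not taken from the model's documentation.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Mapping from JSON Schema type names to Python types (basic types only).
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def conforms(text: str, schema: dict) -> bool:
    """Return True if `text` parses as a JSON object that has every required
    key and whose listed properties have the expected basic types."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(value, dict):
        return False
    for key in schema.get("required", []):
        if key not in value:
            return False
    for key, spec in schema.get("properties", {}).items():
        if key not in value:
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly
        # unless the schema asks for a boolean.
        if isinstance(value[key], bool) and expected is not bool:
            return False
        if not isinstance(value[key], expected):
            return False
    return True


print(conforms('{"name": "Ada", "age": 36}', schema))
print(conforms('{"name": "Ada"}', schema))
```

With grammar or schema support, the goal is for the model's output to pass a check like this by construction, rather than being validated and retried afterwards.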

  • Public
  • 653 runs

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 6 seconds, but prediction time varies significantly with the inputs.

Readme

This model doesn't have a readme.