andreasjansson / llama-2-70b-chat-gguf

Llama-2 70B chat with support for grammars and jsonschema
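
As a rough illustration of what schema-constrained output means in practice, the sketch below defines a small JSON schema, packs it into a request payload, and checks whether a returned string conforms. The payload field names (`prompt`, `jsonschema`), the `PERSON_SCHEMA` example, and both helper functions are assumptions made for illustration; they are not taken from this model's documented inputs, and the checker covers only required keys and primitive types rather than the full JSON Schema specification.

```python
import json

# Hypothetical schema describing the JSON object we want the model to emit.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Maps JSON Schema primitive type names to Python types (small subset only).
_PY_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def build_input(prompt: str, schema: dict) -> dict:
    """Assemble a request payload. The field names are assumptions."""
    return {"prompt": prompt, "jsonschema": json.dumps(schema)}


def conforms(raw: str, schema: dict) -> bool:
    """Check a flat JSON object against required keys and primitive types."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict):
        return False
    if any(key not in obj for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key not in obj:
            continue
        type_name = spec.get("type")
        expected = _PY_TYPES.get(type_name)
        if expected is None:
            continue  # Types outside the supported subset are not checked.
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(obj[key], bool) and type_name != "boolean":
            return False
        if not isinstance(obj[key], expected):
            return False
    return True
```

A caller would pass the dictionary from `build_input` to whatever client it uses to run the model, then run `conforms` on the returned text before parsing it further.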

  • Public
  • 1.9K runs

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 45 seconds.

Readme

This model doesn't have a readme.