andreasjansson / llama-2-13b-chat-gguf
Llama-2 13B chat with support for grammars and jsonschema
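As a rough illustration of what constraining output with a JSON schema means, the sketch below builds a small schema and checks whether a sample reply conforms to it. The schema, its field names, the sample reply, and the `conforms` helper are all hypothetical and exist only for illustration. The model's actual input parameters are not described here, so no API call is made.

```python
import json

# Hypothetical schema: the reply must be an object with a string and an integer.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# JSON Schema type names mapped to Python types (only the subset used here).
_TYPES = {"string": str, "integer": int, "object": dict}


def conforms(reply_text, schema):
    """Return True if reply_text parses as JSON and matches the flat schema."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # Every required key must be present.
    if any(key not in data for key in schema["required"]):
        return False
    # Every present key must have the declared type.
    for key, spec in schema["properties"].items():
        if key in data:
            expected = _TYPES[spec["type"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(data[key], bool) or not isinstance(data[key], expected):
                return False
    return True


# A schema is typically passed around as a JSON string.
schema_text = json.dumps(schema)

# Hypothetical reply of the kind a schema-constrained model is meant to produce.
sample_reply = '{"name": "Ada", "age": 36}'
```

Calling `conforms(sample_reply, schema)` checks the sample reply against the schema. A schema-constrained model is meant to make this kind of after-the-fact check unnecessary, since the constraint is applied while the text is generated.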
Run time and cost
This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 3 seconds.