Llama-2 70B chat with support for grammars and jsonschema
This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 45 seconds.
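As a rough illustration of what the jsonschema support mentioned above might be used for, the sketch below builds a request payload that carries a JSON Schema and then checks that a returned string matches it. This is only a sketch under stated assumptions: the input field names `prompt` and `jsonschema`, the helper names `build_input` and `conforms`, and the example schema are all hypothetical and are not taken from this page. No request is sent to the model.

```python
import json

# Hypothetical schema describing the structure we want the model to emit.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Map JSON Schema primitive type names to Python types.
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def build_input(prompt, schema):
    """Assemble a request payload. The field names are assumptions."""
    return {"prompt": prompt, "jsonschema": json.dumps(schema)}


def conforms(raw_output, schema):
    """Check required keys and primitive types for a flat object schema.

    This is a deliberately small check, not a full JSON Schema validator.
    """
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # Every required key must be present.
    for key in schema.get("required", []):
        if key not in data:
            return False
    # Present keys must match their declared primitive type.
    for key, spec in schema.get("properties", {}).items():
        if key not in data:
            continue
        declared = spec.get("type")
        expected = _TYPES.get(declared)
        if expected is None:
            continue  # unknown or nested type: not checked in this sketch
        value = data[key]
        # bool is a subclass of int in Python, so reject it for numeric types.
        if declared in ("integer", "number") and isinstance(value, bool):
            return False
        if not isinstance(value, expected):
            return False
    return True


payload = build_input("Describe a person as JSON.", SCHEMA)
```

Checking the output on the client side is a cautious design choice: even when generation is constrained by a schema, a prediction can be cut off by a token limit, so the returned text may not always be complete JSON.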