lucataco / llama-2-70b-chat

Meta's Llama 2 70b Chat - GPTQ

Version: 77a44e00

Run time and cost

This model runs on NVIDIA A100 (80GB) GPU hardware. Predictions typically complete within 5 minutes, but the predict time varies significantly with the inputs.
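As a rough illustration of how run time relates to cost, the sketch below multiplies a prediction's duration by a per-second hardware rate. It assumes billing is proportional to predict time. The function name `estimate_cost` and the rate `PLACEHOLDER_PRICE_PER_SECOND` are hypothetical and not taken from this page; substitute the actual A100 (80GB) price before using the result.

```python
def estimate_cost(predict_seconds: float, price_per_second: float) -> float:
    """Estimate the cost of one prediction, assuming per-second billing."""
    if predict_seconds < 0 or price_per_second < 0:
        raise ValueError("predict_seconds and price_per_second must be non-negative")
    return predict_seconds * price_per_second


# Placeholder rate for illustration only; this is NOT a real price.
PLACEHOLDER_PRICE_PER_SECOND = 0.001

# A prediction that takes the full 5 minutes mentioned above.
five_minute_run = estimate_cost(5 * 60, PLACEHOLDER_PRICE_PER_SECOND)
print(f"Estimated cost for a 5-minute prediction: {five_minute_run:.2f}")
```

Because predict time varies significantly with the inputs, an estimate like this is best treated as a ceiling for a typical run, not as a fixed per-prediction price.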