meta / llama-2-7b

Base version of Llama 2 7B, a 7 billion parameter language model


Run time and cost

Predictions run on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 3 seconds. The predict time for this model varies significantly based on the inputs.

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7 billion parameter base model, which has not been fine-tuned.
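Because the base model has not been fine-tuned, it continues text rather than following instructions, so a prompt usually works best when written as the start of a document to be completed. Below is a minimal sketch in Python of one common way to do that, a few-shot prompt. The function name `build_few_shot_prompt` and the `Q:`/`A:` labels are illustrative assumptions, not something this page specifies.

```python
def build_few_shot_prompt(examples, query):
    """Format (question, answer) pairs plus a final question as text to be continued.

    The prompt ends with an open "A:" so that a base model's most natural
    continuation is the answer to the final question.
    """
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


# Hypothetical worked examples that show the model the pattern to continue.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

prompt = build_few_shot_prompt(examples, "What is the capital of Italy?")
print(prompt)
```

The resulting string would be passed as the prompt input when running the model. Limiting the number of generated tokens, or stopping at the next blank line, keeps the model from going on to invent further question-and-answer pairs.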

Learn more about running Llama 2 with an API and the different models.

Please see Meta's official Llama 2 resources for more information about the model, licensing, and acceptable use.