Official

meta / llama-2-13b-chat

A 13 billion parameter language model from Meta, fine-tuned for chat completions

  • Public
  • 4.8M runs
  • Priced per token
  • GitHub
  • Paper
  • License

Input

prompt
*string

Prompt to send to the model.

system_prompt
string

System prompt to send to the model. This is prepended to the prompt and helps guide system behavior.

Default: "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."

max_tokens
integer
(minimum: 1)

Maximum number of tokens to generate. A word is generally 2-3 tokens.

Default: 512

min_tokens
integer
(minimum: -1)

Minimum number of tokens to generate. To disable, set to -1. A word is generally 2-3 tokens.

temperature
number
(minimum: 0, maximum: 5)

Adjusts randomness of outputs: greater than 1 is random, 0 is deterministic, and 0.75 is a good starting value.

Default: 0.7

top_p
number
(minimum: 0, maximum: 1)

When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens.

Default: 0.95

top_k
integer
(minimum: -1)

When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens.

Default: 0

stop_sequences
string

A comma-separated list of sequences at which to stop generation. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.

length_penalty
number
(minimum: 0, maximum: 5)

A parameter that controls how long the outputs are. If < 1, the model will tend to generate shorter outputs, and > 1 will tend to generate longer outputs.

Default: 1

presence_penalty
number

A parameter that penalizes repeated tokens regardless of the number of appearances. As the value increases, the model will be less likely to repeat tokens in the output.

Default: 0

seed
integer

Random seed. Leave blank to randomize the seed.

prompt_template
string

Template for formatting the prompt. Can be an arbitrary string, but must contain the substring `{prompt}`.

Default: "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"

debug
boolean

Default: false

max_new_tokens
integer
(minimum: 1)

This parameter has been renamed to max_tokens; max_new_tokens exists only for backwards compatibility. We recommend you use max_tokens instead. The two cannot both be specified.

min_new_tokens
integer
(minimum: -1)

This parameter has been renamed to min_tokens; min_new_tokens exists only for backwards compatibility. We recommend you use min_tokens instead. The two cannot both be specified.
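
Putting the schema together, an input payload might look like the sketch below. The field names are the parameters documented above; the values are purely illustrative, not recommendations.

model_input = {
    "prompt": "Write a haiku about the sea.",          # required
    "system_prompt": "You are a helpful assistant.",   # optional; overrides the default
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.95,
    "top_k": 0,
    "stop_sequences": "<end>,<stop>",
    "seed": 42,
}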

Output

Sure, I'd be happy to help! Here's a story in the style of James Joyce, set in the Irish countryside in 2083:

As I stepped off the pod-train and onto the lush green grass of the countryside, the crisp air filled my lungs and invigorated my senses. The year was 2083, and yet the rolling hills and sparkling lakes of Ireland seemed unchanged by the passage of time. The only difference was the presence of robots, their sleek metallic bodies and glowing blue eyes a testament to the advancements of technology.

I had come to this place seeking solace and inspiration, to lose myself in the beauty of nature and the wonder of machines. As I wandered through the hills, I came across a group of robots tending to a field of crops, their delicate movements and precise calculations ensuring a bountiful harvest.

One of the robots, a sleek and agile model with wings like a dragonfly, fluttered over to me and offered a friendly greeting. "Good day, traveler," it said in a melodic voice. "What brings you to our humble abode?"

I explained my desire to experience the beauty of the Irish countryside, and the robot nodded sympathetically. "Ah, yes," it said. "There is much to see and explore here. Would you like a guided tour?"

I eagerly accepted the offer, and the robot led me on a journey through the rolling hills and sparkling lakes. We saw towering waterfalls and ancient ruins, and the robot shared stories of the history and culture of the land. As we walked, the sun began to set, casting a golden glow over the landscape.

As the stars began to twinkle in the night sky, the robot and I sat down on a hill overlooking the countryside. "This is a special place," the robot said, its voice filled with a sense of wonder. "A place where nature and technology coexist in harmony."

I nodded in agreement, feeling a sense of awe and gratitude for this wondrous place. And as I looked out at the stars, I knew that this trip to the

Pricing

Official model
Pricing for official models works differently from other models. Instead of being billed by time, you’re billed by input and output, making pricing more predictable.

This language model is priced by how many input tokens are sent as inputs and how many output tokens are generated.

Type     Per unit              Per $1
Input    $0.10 / 1M tokens     10M tokens / $1
Output   $0.50 / 1M tokens     2M tokens / $1

For example, for $10 you can run around 28,571 predictions where the input is a sentence or two (15 tokens) and the output is a few paragraphs (700 tokens).
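
To make the arithmetic explicit: the quoted figure comes from the output cost alone ($10 ÷ $0.00035 ≈ 28,571); counting the 15 input tokens as well gives roughly 28,450. A quick sketch:

# Prices from the table above.
input_price = 0.10 / 1_000_000   # dollars per input token
output_price = 0.50 / 1_000_000  # dollars per output token

cost = 15 * input_price + 700 * output_price  # one prediction: ~$0.000352
print(f"{10 / cost:,.0f} predictions for $10")  # ~28,450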

Check out our docs for more information about how per-token pricing works on Replicate.

Readme

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 13 billion parameter chat model, which has been fine-tuned on instructions to make it better at being a chat bot.

Learn more about running Llama 2 with an API and the different models.

Please see ai.meta.com/llama for more information about the model, licensing, and acceptable use.

How to prompt Llama 2 chat

To use this model, you can simply pass a prompt or instruction to the prompt argument. We handle prompt formatting on the backend so that you don’t need to worry about it.
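
For example, a minimal call with Replicate's Python client might look like the following (this assumes the replicate package is installed and REPLICATE_API_TOKEN is set in your environment):

import replicate

# Prompt formatting is handled server-side, so a bare instruction is enough.
output = replicate.run(
    "meta/llama-2-13b-chat",
    input={"prompt": "Tell me a story about robots in the Irish countryside."},
)
# Output is streamed as chunks of text; join them into a single string.
print("".join(output))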

Formatting prompts for chat interfaces

However, if you’re managing dialogue state with multiple exchanges between a user and the model, you need to mark the dialogue turns with instruction tags that indicate the beginning ("[INST]") and end ("[/INST]") of user input. For example, a properly formatted dialogue looks like:

prompt = """\
[INST] Hi! [/INST]
Hello! How are you?
[INST] I'm great, thanks for asking. Could you help me with a task? [/INST]"""

In this example, the hypothetical user has first prompted "Hi!" and received the response "Hello! How are you?". Then, the user responded "I'm great, thanks for asking. Could you help me with a task?".
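
If you’re assembling such a prompt programmatically, a small helper like the hypothetical format_dialogue below (a sketch, not part of this model's API) keeps the tags consistent:

def format_dialogue(turns, next_user_message):
    """turns is a list of (user_message, assistant_reply) pairs."""
    parts = []
    for user, assistant in turns:
        parts.append(f"[INST] {user} [/INST]")
        parts.append(assistant)
    parts.append(f"[INST] {next_user_message} [/INST]")
    return "\n".join(parts)

# Reproduces the dialogue string shown above.
prompt = format_dialogue(
    [("Hi!", "Hello! How are you?")],
    "I'm great, thanks for asking. Could you help me with a task?",
)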

Modifying the system prompt

In addition to supporting dialogue exchanges, this deployment also allows you to modify the system prompt that is used to guide model responses. By altering the input to the system_prompt argument, you can inject custom context or information that will be used to guide model output.
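
For instance, you might steer the model toward terse answers like this (the prompt strings are illustrative):

import replicate

output = replicate.run(
    "meta/llama-2-13b-chat",
    input={
        "prompt": "What should I name my pet falcon?",
        "system_prompt": "You are a laconic assistant. Answer in one sentence.",
    },
)
print("".join(output))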

To learn more, see this guide to prompting Llama 2.