joehoover / zephyr-7b-alpha

A high-performing language model trained to act as a helpful assistant

Run time and cost

This model costs approximately $0.0045 per run on Replicate (about 222 runs per $1), though this varies depending on your inputs. It is also open source, so you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 5 seconds.

Readme

Zephyr 7B Alpha is the first in a series of language models developed by the Hugging Face H4 RLHF team. It is a fine-tuned version of mistralai/Mistral-7B-v0.1 that has been optimized with both supervised fine-tuning and RLHF (reinforcement learning from human feedback).

Please see the Hugging Face model card for more information about the model, licensing, and acceptable use.

How to prompt Zephyr 7B Alpha

To use this model, pass a prompt or instruction to the prompt argument. Prompt formatting is handled on the backend, so you don’t need to apply it yourself. For reference, the prompt format for this model is:

 """<|system|>
{system_prompt}</s>
<|user|>
{instruction}</s>
<|assistant|>
"""

Where {system_prompt} is an optionally user-specified system prompt and {instruction} is the user input.
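
As an illustration of the template above, here is a minimal Python sketch that renders a single-turn prompt. The function name format_zephyr_prompt is hypothetical, and the exact whitespace the backend uses (for example, whether a trailing newline follows <|assistant|>) is an assumption. You do not need this code when calling the deployment, since formatting is applied for you.

```python
def format_zephyr_prompt(instruction: str, system_prompt: str = "") -> str:
    """Render a single-turn prompt in the template shown above.

    Hypothetical helper for illustration only; the deployment applies
    this formatting on the backend.
    """
    return (
        "<|system|>\n"
        f"{system_prompt}</s>\n"   # the system prompt may be empty
        "<|user|>\n"
        f"{instruction}</s>\n"     # </s> closes the user turn
        "<|assistant|>\n"          # the model continues from here
    )


if __name__ == "__main__":
    print(format_zephyr_prompt(
        "Can you help me answer a question?",
        system_prompt="You are a helpful assistant.",
    ))
```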

Formatting prompts for chat interfaces

If you’re managing dialogue state across multiple exchanges between a user and the model, you need to mark the dialogue turns with tags that indicate where each user input and model response begins and ends. For example, dialogue formatting might proceed like:

  • system_prompt is set to "You are a helpful assistant."
  • The user inputs "Can you help me answer a question?" and this input is passed to the Replicate API.
  • Internally, the user input will be injected into the prompt template, like:
 """<|system|>
You are a helpful assistant.</s>
<|user|>
Can you help me answer a question?</s>
<|assistant|>
"""
  • The model might respond with:
"I'd be happy to help you answer any question you have. Please provide me with the question you'd like assistance with, and I'll do my best to provide you with an answer."
  • Then, the user might respond with "Please help understand this riddle: \"I’m tall when I’m young, and I’m short when I’m old. What am I?\"".

  • In this case, the next input to the model should include the earlier exchange and be formatted like:

 """
Can you help me answer a question?</s>
<|assistant|>
I'd be happy to help you answer any question you have. Please provide me with the question you'd like assistance with, and I'll do my best to provide you with an answer.</s>
<|user|>
Please help understand this riddle: "I’m tall when I’m young, and I’m short when I’m old. What am I?"
"""
  • Then, the model might respond with something like:
"Certainly! The answer to this riddle is a \"candle\". When a candle is young, it's tall, but as it burns and gets shorter, it becomes shorter and shorter until it eventually extinguishes, at which point it's no longer \"tall\" or \"short\"."
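
The dialogue formatting walked through above can be sketched in Python as follows. This is a minimal sketch, not the deployment's own code: the name format_dialogue and the (user, assistant) tuple representation of the history are assumptions made for illustration. It produces the string to pass as the next input, with completed turns closed by </s> and the new user message left unterminated, as in the example.

```python
from typing import List, Tuple


def format_dialogue(history: List[Tuple[str, str]], new_user_message: str) -> str:
    """Build the next multi-turn input from earlier exchanges.

    `history` holds completed (user_message, assistant_response) pairs,
    oldest first. Hypothetical helper for illustration only.
    """
    turns = []
    for user_msg, assistant_msg in history:
        # Close each completed turn with </s>, then open the next user turn.
        turns.append(
            f"{user_msg}</s>\n<|assistant|>\n{assistant_msg}</s>\n<|user|>\n"
        )
    # The newest user message goes last, without a closing tag.
    return "".join(turns) + new_user_message


if __name__ == "__main__":
    history = [(
        "Can you help me answer a question?",
        "I'd be happy to help you answer any question you have.",
    )]
    print(format_dialogue(history, "Please help understand this riddle: ..."))
```

With an empty history the function returns the new message unchanged, which matches the single-turn case where only the raw instruction is passed.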

Modifying the system prompt

In addition to supporting dialogue exchanges, this deployment also allows you to modify the system prompt that is used to guide model responses. By altering the input to the system_prompt argument, you can inject custom context or information that will be used to guide model output.
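
As a rough sketch of setting a custom system prompt from Python, the snippet below builds an input payload with the prompt and system_prompt arguments named on this page and passes it to the Replicate Python client's replicate.run. The helper names build_input and run_zephyr are hypothetical, the model reference (including its version) must be supplied by you, and joining the output assumes the model returns an iterable of text chunks.

```python
from typing import Any, Dict, Optional


def build_input(prompt: str, system_prompt: Optional[str] = None) -> Dict[str, Any]:
    """Assemble the input payload (hypothetical helper)."""
    payload: Dict[str, Any] = {"prompt": prompt}
    if system_prompt is not None:
        # Only override the system prompt when one is given.
        payload["system_prompt"] = system_prompt
    return payload


def run_zephyr(model_ref: str, prompt: str, system_prompt: Optional[str] = None) -> str:
    """Call the model through the Replicate client.

    Requires `pip install replicate` and the REPLICATE_API_TOKEN
    environment variable. `model_ref` is the model reference you copy
    from the model page; it is not filled in here.
    """
    import replicate  # imported lazily so build_input works without it

    output = replicate.run(model_ref, input=build_input(prompt, system_prompt))
    # Assumption: the output is an iterable of text chunks.
    return "".join(output)
```

Keeping build_input separate from the network call makes it easy to inspect or log the payload before sending it.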