Readme
This model doesn't have a readme.
This model is the Open-Assistant fine-tuning of Meta's Llama2 70B LLM.
Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
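Because the client reads the token from the environment, it can help to fail early with a clear message when the variable is missing. The following is a minimal sketch; `require_token` is a hypothetical helper name, not part of the `replicate` package, and it only checks that the variable is set, not that the token is valid.

```python
import os


def require_token(env=os.environ):
    """Return the Replicate API token, or raise if it is not set.

    `require_token` is an illustrative helper, not a Replicate API.
    `env` defaults to the process environment; any mapping works.
    """
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before calling the API."
        )
    return token
```

Calling `require_token()` once at the top of a script surfaces a missing token before any prediction is started.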
import replicate
Run nwhitehead/llama2-70b-oasst-sft-v10 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "nwhitehead/llama2-70b-oasst-sft-v10:06b67465edf0e1c2e98524102f7edb0ecb0f2d2223fd44494af8b5cc615241d9",
    input={
        "seed": -1,
        "top_k": 20,
        "top_p": 1,
        "prompt": "USER: Hello, who are you?\nASSISTANT:",
        "max_tokens": 50,
        "min_tokens": 1,
        "temperature": 0.5,
        "repetition_penalty": 1
    }
)
# The nwhitehead/llama2-70b-oasst-sft-v10 model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/nwhitehead/llama2-70b-oasst-sft-v10/api#output-schema
    print(item, end="")
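The example prompt wraps the user's message as `USER: ...\nASSISTANT:`, and the output arrives as an iterator of text chunks. The sketch below shows one way to build that single-turn prompt and to gather the streamed chunks into one string. Both `build_prompt` and `collect_output` are hypothetical helper names. The prompt template is inferred only from the example above, so multi-turn formatting is left out.

```python
def build_prompt(message):
    """Wrap one user message in the single-turn format used in the example.

    Assumption: the template is taken from the example prompt only.
    """
    return f"USER: {message}\nASSISTANT:"


def collect_output(chunks):
    """Join streamed output chunks into a single string.

    `chunks` can be any iterable of text pieces, such as the iterator
    returned by replicate.run(...) for this model.
    """
    return "".join(str(chunk) for chunk in chunks)
```

For instance, `collect_output(output)` can replace the printing loop when the full reply is needed as one value, and `build_prompt("Hello, who are you?")` reproduces the prompt string used in the example.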
To learn more, take a look at the guide on getting started with Python.
This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.