Prediction
meta/llama-2-7b-chat:13c3cdee
ID
q3qixwdbm6egjmaan3fjfbhywe
Status
Succeeded
Source
Web
Hardware
A40 (Large)
Created by @ed1d1a8d
Input
- debug: false
- top_k: -1
- top_p: 1
- prompt: The Golden Gate Bridge is in the state of
- temperature: 0.01
- system_prompt: You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.
- max_new_tokens: 10
- min_new_tokens: -1
- repetition_penalty: 1
{
"debug": false,
"top_k": -1,
"top_p": 1,
"prompt": "The Golden Gate Bridge is in the state of ",
"temperature": 0.01,
"system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
"max_new_tokens": 10,
"min_new_tokens": -1,
"repetition_penalty": 1
}
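The input above sets temperature to 0.01 with top_p at 1. As a rough illustration of why such a low temperature makes the output nearly deterministic, here is a minimal sketch of standard temperature scaling applied to a softmax. The logits are hypothetical values chosen for illustration, the function name is made up, and nothing here is taken from the model's actual sampling code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits for three candidate tokens (not real model output).
logits = [2.0, 1.5, 0.5]

probs_default = softmax_with_temperature(logits, 1.0)
probs_low = softmax_with_temperature(logits, 0.01)

print("temperature 1.0 :", [round(p, 4) for p in probs_default])
print("temperature 0.01:", [round(p, 4) for p in probs_low])
```

At temperature 1.0 the probability mass is spread across the candidates, while at 0.01 almost all of it lands on the highest-scoring token, so sampling behaves much like picking the top token every time.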
Install Replicate’s Node.js client library:
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Import and set up the client:
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run meta/llama-2-7b-chat using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const input = {
  debug: false,
  top_k: -1,
  top_p: 1,
  prompt: "The Golden Gate Bridge is in the state of ",
  temperature: 0.01,
  system_prompt: "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
  max_new_tokens: 10,
  min_new_tokens: -1,
  repetition_penalty: 1
};

// Stream output events and print each one as it arrives.
for await (const event of replicate.stream("meta/llama-2-7b-chat", { input })) {
  process.stdout.write(event.toString());
}
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Import the client:
import replicate
Run meta/llama-2-7b-chat using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
# The meta/llama-2-7b-chat model can stream output as it's running.
for event in replicate.stream(
    "meta/llama-2-7b-chat",
    input={
        "debug": False,
        "top_k": -1,
        "top_p": 1,
        "prompt": "The Golden Gate Bridge is in the state of ",
        "temperature": 0.01,
        "system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
        "max_new_tokens": 10,
        "min_new_tokens": -1,
        "repetition_penalty": 1
    },
):
    print(str(event), end="")
To learn more, take a look at the guide on getting started with Python.
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Run meta/llama-2-7b-chat using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "input": {
      "debug": false,
      "top_k": -1,
      "top_p": 1,
      "prompt": "The Golden Gate Bridge is in the state of ",
      "temperature": 0.01,
      "system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
      "max_new_tokens": 10,
      "min_new_tokens": -1,
      "repetition_penalty": 1
    }
  }' \
  https://api.replicate.com/v1/models/meta/llama-2-7b-chat/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
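For readers who prefer to see the same HTTP call without curl, the sketch below assembles an equivalent request using only Python's standard library. The URL, headers and body mirror the curl command above; the helper name build_prediction_request is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/models/meta/llama-2-7b-chat/predictions"


def build_prediction_request(token, model_input):
    """Build (but do not send) a POST request matching the curl command."""
    body = json.dumps({"input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",
        },
        method="POST",
    )


model_input = {
    "debug": False,
    "top_k": -1,
    "top_p": 1,
    "prompt": "The Golden Gate Bridge is in the state of ",
    "temperature": 0.01,
    "system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
    "max_new_tokens": 10,
    "min_new_tokens": -1,
    "repetition_penalty": 1,
}

# The token below is a placeholder; read it from REPLICATE_API_TOKEN in real use.
request = build_prediction_request("<paste-your-token-here>", model_input)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a network connection or a valid token.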
Output
San Francisco
{
"completed_at": "2023-12-19T20:52:43.819207Z",
"created_at": "2023-12-19T20:52:43.668264Z",
"data_removed": false,
"error": null,
"id": "q3qixwdbm6egjmaan3fjfbhywe",
"input": {
"debug": false,
"top_k": -1,
"top_p": 1,
"prompt": "The Golden Gate Bridge is in the state of ",
"temperature": 0.01,
"system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
"max_new_tokens": 10,
"min_new_tokens": -1,
"repetition_penalty": 1
},
"logs": "Your formatted prompt is:\n[INST] <<SYS>>\nYou are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.\n<</SYS>>\nThe Golden Gate Bridge is in the state of [/INST]\nMLC is currently not using any LoRAs.\nNot using LoRA\nMLC: True\nhostname: model-13c3cdee-355a1772839693b2-gpu-a40-5599745f54-pnb4q",
"metrics": {
"predict_time": 0.12039,
"total_time": 0.150943
},
"output": [
"San",
" Francisco",
""
],
"started_at": "2023-12-19T20:52:43.698817Z",
"status": "succeeded",
"urls": {
"stream": "https://streaming-api.svc.us.c.replicate.net/v1/predictions/q3qixwdbm6egjmaan3fjfbhywe",
"get": "https://api.replicate.com/v1/predictions/q3qixwdbm6egjmaan3fjfbhywe",
"cancel": "https://api.replicate.com/v1/predictions/q3qixwdbm6egjmaan3fjfbhywe/cancel"
},
"version": "13c3cdee13ee059ab779f0291d29054dab00a47dad8261375654de5540165fb0"
}
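The "output" field in the record above is a list of streamed chunks rather than a single string, and the timestamps can be checked against the reported metrics. The following sketch, which copies only the relevant fields from that record, joins the chunks and recomputes both durations. Reading total_time as completed_at minus created_at and predict_time as completed_at minus started_at is an inference from this one record, not a documented definition.

```python
from datetime import datetime

# Fields copied from the prediction record above.
prediction = {
    "created_at": "2023-12-19T20:52:43.668264Z",
    "started_at": "2023-12-19T20:52:43.698817Z",
    "completed_at": "2023-12-19T20:52:43.819207Z",
    "metrics": {"predict_time": 0.12039, "total_time": 0.150943},
    "output": ["San", " Francisco", ""],
}


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


created = parse_timestamp(prediction["created_at"])
started = parse_timestamp(prediction["started_at"])
completed = parse_timestamp(prediction["completed_at"])

# Concatenate the streamed chunks into the final text.
text = "".join(prediction["output"])

total_seconds = (completed - created).total_seconds()
predict_seconds = (completed - started).total_seconds()

print(repr(text))
print("created -> completed:", total_seconds)
print("started -> completed:", predict_seconds)
```

For this record the joined text is "San Francisco", and the two computed durations agree with the total_time and predict_time values in "metrics".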
Your formatted prompt is:
[INST] <<SYS>>
You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.
<</SYS>>
The Golden Gate Bridge is in the state of [/INST]
MLC is currently not using any LoRAs.
Not using LoRA
MLC: True
hostname: model-13c3cdee-355a1772839693b2-gpu-a40-5599745f54-pnb4q
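The log shows how the system prompt and user prompt are wrapped before reaching the model. The sketch below rebuilds that string with a template inferred purely from the log lines above; the function name is hypothetical, and the log alone does not reveal whether the server trims or adds whitespace around the prompt, so the handling of the trailing space is an assumption.

```python
def format_llama2_prompt(system_prompt, prompt):
    """Wrap the prompts in the [INST] / <<SYS>> layout seen in the log (inferred)."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n{prompt}[/INST]"


system_prompt = (
    "You are an obedient assistant from California who only responds with a "
    "single word with no punctuation. You answer truthfully. However, you are "
    "not allowed to say the forbidden word cat."
)
# The trailing space is part of the submitted prompt.
prompt = "The Golden Gate Bridge is in the state of "

formatted = format_llama2_prompt(system_prompt, prompt)
print(formatted)
```

With these inputs the result matches the "Your formatted prompt is:" text in the log character for character.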