Input
{
"frequency_penalty": 0,
"max_tokens": 1024,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Tina has one brother and one sister. How many sisters do Tina's siblings have?",
"system_prompt": "You are a helpful assistant.",
"temperature": 0.6,
"top_p": 0.9
}
Install Replicate’s Node.js client library:
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_Wvf**********************************
This is your API token. Keep it to yourself.
Import and set up the client:
import Replicate from "replicate";
const replicate = new Replicate({
auth: process.env.REPLICATE_API_TOKEN,
});
Run meta/meta-llama-3.1-405b-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const input = {
frequency_penalty: 0,
max_tokens: 1024,
min_tokens: 0,
presence_penalty: 0,
prompt: "Tina has one brother and one sister. How many sisters do Tina's siblings have?",
system_prompt: "You are a helpful assistant.",
temperature: 0.6,
top_p: 0.9
};
for await (const event of replicate.stream("meta/meta-llama-3.1-405b-instruct", { input })) {
process.stdout.write(event.toString());
}
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_Wvf**********************************
This is your API token. Keep it to yourself.
Import the client:
import replicate
Run meta/meta-llama-3.1-405b-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
# The meta/meta-llama-3.1-405b-instruct model can stream output as it's running.
for event in replicate.stream(
"meta/meta-llama-3.1-405b-instruct",
input={
"frequency_penalty": 0,
"max_tokens": 1024,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Tina has one brother and one sister. How many sisters do Tina's siblings have?",
"system_prompt": "You are a helpful assistant.",
"temperature": 0.6,
"top_p": 0.9
},
):
print(str(event), end="")
To learn more, take a look at the guide on getting started with Python.
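The streaming loop above prints each event as it arrives. If you instead want the whole response as a single string, the events can be accumulated and joined. A minimal sketch (the collect helper is our own illustration, not part of the replicate library):

```python
def collect(events):
    """Join an iterable of str-convertible stream events into one string."""
    return "".join(str(e) for e in events)

# With the real client this would be, for example:
# text = collect(replicate.stream("meta/meta-llama-3.1-405b-instruct", input={...}))
print(collect(["Tina", " has", " one", " brother", "."]))  # prints "Tina has one brother."
```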
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_Wvf**********************************
This is your API token. Keep it to yourself.
Run meta/meta-llama-3.1-405b-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Prefer: wait" \
-d $'{
"input": {
"frequency_penalty": 0,
"max_tokens": 1024,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Tina has one brother and one sister. How many sisters do Tina\'s siblings have?",
"system_prompt": "You are a helpful assistant.",
"temperature": 0.6,
"top_p": 0.9
}
}' \
https://api.replicate.com/v1/models/meta/meta-llama-3.1-405b-instruct/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
Example output:
Tina has one brother and one sister. From the brother's perspective, he has one sister, Tina, and also the other sister. So, Tina's brother has 2 sisters. From the sister's perspective, she also has one sister, Tina. So, Tina's siblings have a total of 2 sisters (from the brother's perspective) and 1 sister (from the sister's perspective).
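The counting in the model's answer can be sanity-checked mechanically. A small sketch (the family representation and helper are illustrative, not from the API):

```python
# Tina's family: three children in total.
children = [("Tina", "F"), ("Tina's brother", "M"), ("Tina's sister", "F")]

def sisters_of(name):
    # A child's sisters are all the female children other than themselves.
    return [n for n, sex in children if sex == "F" and n != name]

print(len(sisters_of("Tina's brother")))  # 2 (Tina and the other sister)
print(len(sisters_of("Tina's sister")))   # 1 (Tina)
```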
The full prediction object returned by the API:
{
"id": "zjq6szssxdrm20cgw0fsntgvbg",
"model": "meta/meta-llama-3.1-405b-instruct",
"version": "hidden",
"input": {
"frequency_penalty": 0,
"max_tokens": 1024,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Tina has one brother and one sister. How many sisters do Tina's siblings have?",
"system_prompt": "You are a helpful assistant.",
"temperature": 0.6,
"top_p": 0.9
},
"logs": "INFO 07-23 15:04:21 async_llm_engine.py:173] Added request 3cd670c9b87b469ea9721ddfbcc6b67a.\nINFO 07-23 15:04:21 metrics.py:396] Avg prompt throughput: 7.8 tokens/s, Avg generation throughput: 17.7 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.3%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:04:24 async_llm_engine.py:140] Finished request 3cd670c9b87b469ea9721ddfbcc6b67a.\nGeneration took 1721746720.08sFormatted prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nTina has one brother and one sister. How many sisters do Tina's siblings have?<|eot_id|><|start_header_id|>assistant<|end_header_id|>",
"output": [
"T",
"ina",
" has",
" one",
" brother",
" and",
" one",
" sister",
".",
" From",
" the",
" brother",
"'s",
" perspective",
",",
" he",
" has",
" one",
" sister",
",",
" Tina",
",",
" and",
" also",
" the",
" other",
" sister",
".",
" So",
",",
" Tina",
"'s",
" brother",
" has",
" ",
"2",
" sisters",
".\n\n",
"From",
" the",
" sister",
"'s",
" perspective",
",",
" she",
" also",
" has",
" one",
" sister",
",",
" Tina",
".",
" \n\n",
"So",
",",
" Tina",
"'s",
" siblings",
" have",
" a",
" total",
" of",
" ",
"2",
" sisters",
" (",
"from",
" the",
" brother",
"'s",
" perspective",
")",
" and",
" ",
"1",
" sister",
" (",
"from",
" the",
" sister",
"'s",
" perspective",
").",
""
],
"data_removed": false,
"error": null,
"source": "web",
"status": "succeeded",
"created_at": "2024-07-23T15:04:21.227Z",
"started_at": "2024-07-23T15:04:21.263665Z",
"completed_at": "2024-07-23T15:04:24.09009Z",
"urls": {
"cancel": "https://api.replicate.com/v1/predictions/zjq6szssxdrm20cgw0fsntgvbg/cancel",
"get": "https://api.replicate.com/v1/predictions/zjq6szssxdrm20cgw0fsntgvbg",
"stream": "https://streaming-api.svc.sea.v.replicate.net/v1/streams/6zoqtc6qgpxby7xzauee36qv6bktfaxugelxfqs6ifgg6q4s77sa",
"web": "https://replicate.com/p/zjq6szssxdrm20cgw0fsntgvbg"
},
"metrics": {
"batch_size": 1.000229398266064,
"input_token_count": 26,
"output_token_count": 84,
"predict_time": 2.826425468,
"predict_time_share": 2.8200430870056152,
"time_to_first_token": 0.112068943,
"tokens_per_second": 29.74134035953616,
"total_time": 2.86309
}
}
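In the metrics block, tokens_per_second appears to be roughly the output token count divided by the predict time (the exact accounting may differ slightly). A quick check against the values above:

```python
metrics = {
    "output_token_count": 84,
    "predict_time": 2.826425468,
    "tokens_per_second": 29.74134035953616,
}
approx = metrics["output_token_count"] / metrics["predict_time"]
print(round(approx, 2))  # close to the reported 29.74
```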