technillogue/llama-2-7b-chat-hf-mlc:147a3c83
Input
Run this model in Node.js. First, install Replicate's Node.js client library:
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
import Replicate from "replicate";
const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run technillogue/llama-2-7b-chat-hf-mlc using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "technillogue/llama-2-7b-chat-hf-mlc:147a3c83e93b12e16eda7f7d9b4ed84f71257b7c0e9b25323e33c754c8834e38",
  {
    input: {
      debug: false,
      top_p: 0.95,
      temperature: 0.7,
      system_prompt: "You are a helpful, respectful and honest assistant.",
      max_new_tokens: 128,
      min_new_tokens: -1,
      prompt_template: "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]",
      repetition_penalty: 1.15,
      disable_mlc_formatting: false
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Run this model in Python. First, install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
import replicate
Run technillogue/llama-2-7b-chat-hf-mlc using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "technillogue/llama-2-7b-chat-hf-mlc:147a3c83e93b12e16eda7f7d9b4ed84f71257b7c0e9b25323e33c754c8834e38",
    input={
        "debug": False,
        "top_p": 0.95,
        "temperature": 0.7,
        "system_prompt": "You are a helpful, respectful and honest assistant.",
        "max_new_tokens": 128,
        "min_new_tokens": -1,
        "prompt_template": "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]",
        "repetition_penalty": 1.15,
        "disable_mlc_formatting": False
    }
)
# The technillogue/llama-2-7b-chat-hf-mlc model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/technillogue/llama-2-7b-chat-hf-mlc/api#output-schema
    print(item, end="")
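When the complete response is needed as one string rather than printed chunk by chunk, the streamed items can be joined. The sketch below is illustrative only: `collect_output` and `fake_stream` are hypothetical names, and the generator is a stand-in for the iterator returned by the API, assuming each streamed item is a string.

```python
from typing import Iterable, Iterator


def collect_output(chunks: Iterable[str]) -> str:
    """Join streamed text chunks into a single string."""
    return "".join(chunks)


def fake_stream() -> Iterator[str]:
    """Stand-in for the model's output iterator (assumption: items are strings)."""
    yield from ["Hello", ", ", "world"]


text = collect_output(fake_stream())
print(text)
```

Note that an iterator can only be consumed once, so either print the items as they arrive or collect them, not both on the same iterator.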
To learn more, take a look at the guide on getting started with Python.
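The `prompt_template` input above contains `{system_prompt}` and `{prompt}` placeholders. The following sketch shows what the final prompt might look like once they are filled in. It assumes plain placeholder substitution, which the page does not confirm; `render_prompt` is a hypothetical helper, and the user prompt is an invented example (the sample inputs above do not include a `prompt` value).

```python
# Template copied from the example inputs above.
PROMPT_TEMPLATE = "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"


def render_prompt(template: str, system_prompt: str, prompt: str) -> str:
    """Fill the two placeholders (assumption: the model substitutes them like str.format)."""
    return template.format(system_prompt=system_prompt, prompt=prompt)


rendered = render_prompt(
    PROMPT_TEMPLATE,
    system_prompt="You are a helpful, respectful and honest assistant.",
    prompt="What is MLC?",  # hypothetical user prompt
)
print(rendered)
```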
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Run technillogue/llama-2-7b-chat-hf-mlc using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "technillogue/llama-2-7b-chat-hf-mlc:147a3c83e93b12e16eda7f7d9b4ed84f71257b7c0e9b25323e33c754c8834e38",
    "input": {
      "debug": false,
      "top_p": 0.95,
      "temperature": 0.7,
      "system_prompt": "You are a helpful, respectful and honest assistant.",
      "max_new_tokens": 128,
      "min_new_tokens": -1,
      "prompt_template": "[INST] <<SYS>>\\n{system_prompt}\\n<</SYS>>\\n\\n{prompt} [/INST]",
      "repetition_penalty": 1.15,
      "disable_mlc_formatting": false
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
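In the cURL request, the body is wrapped in `$'...'` (ANSI-C quoting), so each `\\n` in the shell text reaches the server as the two-character JSON escape `\n`, which decodes to a newline. Hand-escaping like this is easy to get wrong, so one option is to generate the body with a JSON serializer. The sketch below only builds and prints the body; it sends nothing, and the variable names are illustrative.

```python
import json

# Same request body as the cURL example, written as a Python dict.
payload = {
    "version": "technillogue/llama-2-7b-chat-hf-mlc:147a3c83e93b12e16eda7f7d9b4ed84f71257b7c0e9b25323e33c754c8834e38",
    "input": {
        "debug": False,
        "top_p": 0.95,
        "temperature": 0.7,
        "system_prompt": "You are a helpful, respectful and honest assistant.",
        "max_new_tokens": 128,
        "min_new_tokens": -1,
        # Real newlines here; json.dumps escapes them as \n in the output.
        "prompt_template": "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]",
        "repetition_penalty": 1.15,
        "disable_mlc_formatting": False,
    },
}

body = json.dumps(payload)
print(body)
```

The printed string can be saved to a file and passed to cURL with `-d @file`, which avoids shell quoting altogether.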