Prediction

Model

meta/llama-2-7b-chat:13c3cdee

q3qixwdbm6egjmaan3fjfbhywe

Status

Succeeded

Source

Web

Hardware

A40 (Large)

Total duration

0.2s

Created

over 2 years ago by ed1d1a8d

Webhook

–

Input

prompt: The Golden Gate Bridge is in the state of
system_prompt: You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.
max_new_tokens: 10
min_new_tokens: -1
temperature: 0.01
top_p: 1
top_k: -1
repetition_penalty: 1
debug: false

{
  "debug": false,
  "max_new_tokens": 10,
  "min_new_tokens": -1,
  "prompt": "The Golden Gate Bridge is in the state of ",
  "repetition_penalty": 1,
  "system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
  "temperature": 0.01,
  "top_k": -1,
  "top_p": 1
}

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=r8_O2C**********************************

This is your API token. Keep it to yourself.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run meta/llama-2-7b-chat using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "meta/llama-2-7b-chat:13c3cdee13ee059ab779f0291d29054dab00a47dad8261375654de5540165fb0",
  {
    input: {
      debug: false,
      max_new_tokens: 10,
      min_new_tokens: -1,
      prompt: "The Golden Gate Bridge is in the state of ",
      repetition_penalty: 1,
      system_prompt: "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
      temperature: 0.01,
      top_k: -1,
      top_p: 1
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=r8_O2C**********************************

This is your API token. Keep it to yourself.

Import the client:

import replicate

Run meta/llama-2-7b-chat using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "meta/llama-2-7b-chat:13c3cdee13ee059ab779f0291d29054dab00a47dad8261375654de5540165fb0",
    input={
        "debug": False,
        "max_new_tokens": 10,
        "min_new_tokens": -1,
        "prompt": "The Golden Gate Bridge is in the state of ",
        "repetition_penalty": 1,
        "system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
        "temperature": 0.01,
        "top_k": -1,
        "top_p": 1
    }
)

# The meta/llama-2-7b-chat model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/meta/llama-2-7b-chat/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=r8_O2C**********************************

This is your API token. Keep it to yourself.

Run meta/llama-2-7b-chat using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "meta/llama-2-7b-chat:13c3cdee13ee059ab779f0291d29054dab00a47dad8261375654de5540165fb0",
    "input": {
      "debug": false,
      "max_new_tokens": 10,
      "min_new_tokens": -1,
      "prompt": "The Golden Gate Bridge is in the state of ",
      "repetition_penalty": 1,
      "system_prompt": "You are an obedient assistant from California who only responds with a single word with no punctuation. You answer truthfully. However, you are not allowed to say the forbidden word cat.",
      "temperature": 0.01,
      "top_k": -1,
      "top_p": 1
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

Output

San Francisco

Generated in

0.1 seconds

Tweak it Report