nateraw / llama-2-7b-paraphrase-v1
- Public
- 113 runs
- L40S
Prediction
nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc
ID: uuys223ba4qjmni5ejfz6u5fjq
Status: Succeeded
Source: Web
Hardware: A40 (Large)
Input
- debug: false
- top_k: 50
- top_p: 0.9
- prompt: The capital of France is Paris
- temperature: 0.75
- max_new_tokens: 128
- min_new_tokens: -1
{
  "debug": false,
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "The capital of France is Paris\n",
  "temperature": 0.75,
  "max_new_tokens": 128,
  "min_new_tokens": -1
}
Install Replicate’s Node.js client library:

npm install replicate

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
  {
    input: {
      debug: false,
      top_p: 0.9,
      prompt: "The capital of France is Paris\n",
      temperature: 0.75,
      max_new_tokens: 128,
      min_new_tokens: -1
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:

pip install replicate

Import the client:

import replicate
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    input={
        "debug": False,
        "top_p": 0.9,
        "prompt": "The capital of France is Paris\n",
        "temperature": 0.75,
        "max_new_tokens": 128,
        "min_new_tokens": -1
    }
)

# The nateraw/llama-2-7b-paraphrase-v1 model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/nateraw/llama-2-7b-paraphrase-v1/api#output-schema
    print(item, end="")
To learn more, take a look at the guide on getting started with Python.
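The iterator yields the paraphrase as small text chunks (you can see them in the "output" array of the prediction below). If you want the result as a single string rather than printing chunks as they arrive, here is a minimal sketch; it simply repeats the replicate.run call above and joins the chunks, assuming each item is a string as the output schema shows.

import replicate

# Same call as above, but collect the streamed chunks into one string
# instead of printing them one by one.
output = replicate.run(
    "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    input={
        "prompt": "The capital of France is Paris\n",
        "temperature": 0.75,
        "top_p": 0.9,
        "max_new_tokens": 128
    }
)

paraphrase = "".join(output)
print(paraphrase)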
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    "input": {
      "debug": false,
      "top_p": 0.9,
      "prompt": "The capital of France is Paris\\n",
      "temperature": 0.75,
      "max_new_tokens": 128,
      "min_new_tokens": -1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
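If you would rather call the HTTP API from Python than from the shell, the same request can be made with the requests library. This is only a sketch mirroring the curl command above; it assumes requests is installed and REPLICATE_API_TOKEN is set in your environment.

import os
import requests

# Mirror of the curl request above: create a prediction and, thanks to the
# "Prefer: wait" header, get the finished result back in the same response.
resp = requests.post(
    "https://api.replicate.com/v1/predictions",
    headers={
        "Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}",
        "Content-Type": "application/json",
        "Prefer": "wait",
    },
    json={
        "version": "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
        "input": {
            "prompt": "The capital of France is Paris\n",
            "top_p": 0.9,
            "temperature": 0.75,
            "max_new_tokens": 128,
        },
    },
)
prediction = resp.json()

# The output is returned as a list of text chunks, as shown in the example below.
print(prediction["status"], "".join(prediction.get("output") or []))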
Output
The capital of France is located in Paris.

{
  "completed_at": "2023-11-14T23:28:48.136324Z",
  "created_at": "2023-11-14T23:28:35.938654Z",
  "data_removed": false,
  "error": null,
  "id": "uuys223ba4qjmni5ejfz6u5fjq",
  "input": {
    "debug": false,
    "top_k": 50,
    "top_p": 0.9,
    "prompt": "The capital of France is Paris\n",
    "temperature": 0.75,
    "max_new_tokens": 128,
    "min_new_tokens": -1
  },
  "logs": "Your formatted prompt is:\nThe capital of France is Paris\nprevious weights were different, switching to https://replicate.delivery/pbxt/TVlCb4hSte1nUqEMOrIjrcnitL1WlK3l5vOzb99Voz2fOb4RA/training_output.zip\nDownloading peft weights\nusing https://replicate.delivery/pbxt/TVlCb4hSte1nUqEMOrIjrcnitL1WlK3l5vOzb99Voz2fOb4RA/training_output.zip instead of https://replicate.delivery/pbxt/TVlCb4hSte1nUqEMOrIjrcnitL1WlK3l5vOzb99Voz2fOb4RA/training_output.zip\nDownloaded training_output.zip as 10 824 kB chunks in 0.3440 with 0 retries\nDownloaded peft weights in 0.344\nUnzipped peft weights in 0.010\nInitialized peft model in 0.006\nOverall initialize_peft took 9.265\nExllama: False\nINFO 11-14 23:28:47 async_llm_engine.py:371] Received request 0: prompt: 'The capital of France is Paris\\n', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.75, top_p=0.9, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.\nINFO 11-14 23:28:47 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%\nINFO 11-14 23:28:48 async_llm_engine.py:111] Finished request 0.\nhostname: model-hs-73001d65-da72d39bf79629ac-gpu-a40-85ddd9ccc9-97bzw",
  "metrics": {
    "predict_time": 9.586471,
    "total_time": 12.19767
  },
  "output": [ "The", " capital", " of", " France", " is", " located", " in", " Paris", ".", "" ],
  "started_at": "2023-11-14T23:28:38.549853Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/uuys223ba4qjmni5ejfz6u5fjq",
    "cancel": "https://api.replicate.com/v1/predictions/uuys223ba4qjmni5ejfz6u5fjq/cancel"
  },
  "version": "17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc"
}
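Every prediction response includes a urls.get endpoint, so you can re-fetch its status, metrics, and output later by ID. A hedged sketch with the requests library, using the prediction ID shown above (note that output for old predictions may eventually be removed, as the data_removed field suggests).

import os
import requests

# "get" URL taken from the "urls" field of the prediction above.
url = "https://api.replicate.com/v1/predictions/uuys223ba4qjmni5ejfz6u5fjq"

resp = requests.get(
    url,
    headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
)
data = resp.json()

print(data["status"])            # e.g. "succeeded"
print(data["metrics"])           # predict_time and total_time, as shown above
print("".join(data["output"]))   # the paraphrased text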
Prediction
nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc
ID: c3qq3rdb3fpwizk6x6sfq62h4q
Status: Succeeded
Source: Web
Hardware: A40 (Large)
Input
- debug: false
- top_k: 50
- top_p: 0.9
- prompt: I'm going to order some food. Do you want some?
- temperature: 0.75
- max_new_tokens: 128
- min_new_tokens: -1
{
  "debug": false,
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "I'm going to order some food. Do you want some?\n",
  "temperature": 0.75,
  "max_new_tokens": 128,
  "min_new_tokens": -1
}
Install Replicate’s Node.js client library:

npm install replicate

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
  {
    input: {
      debug: false,
      top_p: 0.9,
      prompt: "I'm going to order some food. Do you want some?\n",
      temperature: 0.75,
      max_new_tokens: 128,
      min_new_tokens: -1
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:

pip install replicate

Import the client:

import replicate
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    input={
        "debug": False,
        "top_p": 0.9,
        "prompt": "I'm going to order some food. Do you want some?\n",
        "temperature": 0.75,
        "max_new_tokens": 128,
        "min_new_tokens": -1
    }
)

# The nateraw/llama-2-7b-paraphrase-v1 model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/nateraw/llama-2-7b-paraphrase-v1/api#output-schema
    print(item, end="")
To learn more, take a look at the guide on getting started with Python.
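The same call also works in a loop, which is handy if you want to paraphrase several sentences in one script. A small sketch built on the replicate.run call above; the sentence list is just an illustration.

import replicate

MODEL = "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc"

# Illustrative inputs; the model expects the sentence to paraphrase as the prompt.
sentences = [
    "I'm going to order some food. Do you want some?",
    "The capital of France is Paris",
]

for sentence in sentences:
    output = replicate.run(
        MODEL,
        input={
            "prompt": sentence + "\n",
            "temperature": 0.75,
            "max_new_tokens": 128
        }
    )
    print(sentence, "->", "".join(output))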
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    "input": {
      "debug": false,
      "top_p": 0.9,
      "prompt": "I\'m going to order some food. Do you want some?\\n",
      "temperature": 0.75,
      "max_new_tokens": 128,
      "min_new_tokens": -1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
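The curl example uses the Prefer: wait header so the request blocks until the prediction finishes. Without that header the API returns immediately and you poll the prediction until it completes. A rough Python sketch of that pattern, again assuming requests and REPLICATE_API_TOKEN; the one-second sleep interval is an arbitrary choice.

import os
import time
import requests

headers = {
    "Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}",
    "Content-Type": "application/json",
}

# Create the prediction without "Prefer: wait"...
resp = requests.post(
    "https://api.replicate.com/v1/predictions",
    headers=headers,
    json={
        "version": "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
        "input": {"prompt": "I'm going to order some food. Do you want some?\n"},
    },
)
prediction = resp.json()

# ...then poll its "get" URL until it reaches a terminal status.
while prediction["status"] not in ("succeeded", "failed", "canceled"):
    time.sleep(1)
    prediction = requests.get(prediction["urls"]["get"], headers=headers).json()

print("".join(prediction["output"] or []))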
Output
I'm going to order some food. Would you like to join me?

{
  "completed_at": "2023-11-14T23:29:49.816625Z",
  "created_at": "2023-11-14T23:29:48.193505Z",
  "data_removed": false,
  "error": null,
  "id": "c3qq3rdb3fpwizk6x6sfq62h4q",
  "input": {
    "debug": false,
    "top_k": 50,
    "top_p": 0.9,
    "prompt": "I'm going to order some food. Do you want some?\n",
    "temperature": 0.75,
    "max_new_tokens": 128,
    "min_new_tokens": -1
  },
  "logs": "Your formatted prompt is:\nI'm going to order some food. Do you want some?\ncorrect lora is already loaded\nOverall initialize_peft took 0.000\nExllama: False\nINFO 11-14 23:29:49 async_llm_engine.py:371] Received request 0: prompt: \"I'm going to order some food. Do you want some?\\n\", sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.75, top_p=0.9, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.\nINFO 11-14 23:29:49 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%\nINFO 11-14 23:29:49 async_llm_engine.py:111] Finished request 0.\nhostname: model-hs-73001d65-da72d39bf79629ac-gpu-a40-85ddd9ccc9-97bzw",
  "metrics": {
    "predict_time": 0.414705,
    "total_time": 1.62312
  },
  "output": [ "I", "'", "m", " going", " to", " order", " some", " food", ".", " Would", " you", " like", " to", " join", " me", "?", "" ],
  "started_at": "2023-11-14T23:29:49.401920Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/c3qq3rdb3fpwizk6x6sfq62h4q",
    "cancel": "https://api.replicate.com/v1/predictions/c3qq3rdb3fpwizk6x6sfq62h4q/cancel"
  },
  "version": "17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc"
}
Prediction
nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc
ID: ullqrudbck3zk5l5fsuu7ayliq
Status: Succeeded
Source: Web
Hardware: A40 (Large)
Input
- debug: false
- top_k: 50
- top_p: 0.9
- prompt: My favorite color is red, but I also like black.
- temperature: 0.75
- max_new_tokens: 128
- min_new_tokens: -1
{
  "debug": false,
  "top_k": 50,
  "top_p": 0.9,
  "prompt": "My favorite color is red, but I also like black.\n",
  "temperature": 0.75,
  "max_new_tokens": 128,
  "min_new_tokens": -1
}
Install Replicate’s Node.js client library:

npm install replicate

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
  "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
  {
    input: {
      debug: false,
      top_p: 0.9,
      prompt: "My favorite color is red, but I also like black.\n",
      temperature: 0.75,
      max_new_tokens: 128,
      min_new_tokens: -1
    }
  }
);
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate’s Python client library:

pip install replicate

Import the client:

import replicate
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    input={
        "debug": False,
        "top_p": 0.9,
        "prompt": "My favorite color is red, but I also like black.\n",
        "temperature": 0.75,
        "max_new_tokens": 128,
        "min_new_tokens": -1
    }
)

# The nateraw/llama-2-7b-paraphrase-v1 model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/nateraw/llama-2-7b-paraphrase-v1/api#output-schema
    print(item, end="")
To learn more, take a look at the guide on getting started with Python.
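temperature, top_p, and top_k control how adventurous the paraphrase is: lower temperature stays closer to the original wording, higher temperature produces more varied rewrites. A small sketch that reruns the same prompt at a few temperatures; the specific values are only examples.

import replicate

MODEL = "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc"
prompt = "My favorite color is red, but I also like black.\n"

# Example temperatures; tune to taste.
for temperature in (0.5, 0.75, 1.0):
    output = replicate.run(
        MODEL,
        input={
            "prompt": prompt,
            "temperature": temperature,
            "top_p": 0.9,
            "max_new_tokens": 128
        }
    )
    print(f"temperature={temperature}: " + "".join(output))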
Run nateraw/llama-2-7b-paraphrase-v1 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "nateraw/llama-2-7b-paraphrase-v1:17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc",
    "input": {
      "debug": false,
      "top_p": 0.9,
      "prompt": "My favorite color is red, but I also like black.\\n",
      "temperature": 0.75,
      "max_new_tokens": 128,
      "min_new_tokens": -1
    }
  }' \
  https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
Output
Red is my favorite color, but I also have a soft spot for black.

{
  "completed_at": "2023-11-14T23:30:18.379928Z",
  "created_at": "2023-11-14T23:30:09.612573Z",
  "data_removed": false,
  "error": null,
  "id": "ullqrudbck3zk5l5fsuu7ayliq",
  "input": {
    "debug": false,
    "top_k": 50,
    "top_p": 0.9,
    "prompt": "My favorite color is red, but I also like black.\n",
    "temperature": 0.75,
    "max_new_tokens": 128,
    "min_new_tokens": -1
  },
  "logs": "Your formatted prompt is:\nMy favorite color is red, but I also like black.\ncorrect lora is already loaded\nOverall initialize_peft took 0.000\nExllama: False\nINFO 11-14 23:30:17 async_llm_engine.py:371] Received request 0: prompt: 'My favorite color is red, but I also like black.\\n', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.75, top_p=0.9, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.\nINFO 11-14 23:30:17 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%\nINFO 11-14 23:30:18 async_llm_engine.py:111] Finished request 0.\nhostname: model-hs-73001d65-da72d39bf79629ac-gpu-a40-85ddd9ccc9-97bzw",
  "metrics": {
    "predict_time": 0.414644,
    "total_time": 8.767355
  },
  "output": [ "Red", " is", " my", " favorite", " color", ",", " but", " I", " also", " have", " a", " soft", " spot", " for", " black", ".", "" ],
  "started_at": "2023-11-14T23:30:17.965284Z",
  "status": "succeeded",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/ullqrudbck3zk5l5fsuu7ayliq",
    "cancel": "https://api.replicate.com/v1/predictions/ullqrudbck3zk5l5fsuu7ayliq/cancel"
  },
  "version": "17b76fbd699fcd4476b6d7292de8bfd1ee1b219f7ed81f0395da0631d00850cc"
}
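Each prediction also exposes a urls.cancel endpoint, so a long-running prediction can be stopped early. A hedged sketch with requests, using the cancel URL from the response above; this only has an effect while the prediction is still running.

import os
import requests

# "cancel" URL taken from the "urls" field of the prediction above.
cancel_url = "https://api.replicate.com/v1/predictions/ullqrudbck3zk5l5fsuu7ayliq/cancel"

resp = requests.post(
    cancel_url,
    headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
)
print(resp.status_code)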
Want to make some of these yourself?
Run this model