Input (type: text):
{
"frequency_penalty": 0,
"max_tokens": 512,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Who are you?",
"system_prompt": "The assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.",
"temperature": 0.6,
"top_k": 50,
"top_p": 0.9
}
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_7uw**********************************
This is your API token. Keep it to yourself.
// Import the Replicate client (ES module) and create a client instance.
import Replicate from "replicate";
const replicate = new Replicate({
// Reads the API token from the REPLICATE_API_TOKEN environment variable set above.
auth: process.env.REPLICATE_API_TOKEN,
});
Run lucataco/dolphin-2.9-llama3-8b using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
// Run the model and wait for it to finish (top-level await requires an ES module).
const output = await replicate.run(
// Model identifier; the hash after the colon pins an exact model version.
"lucataco/dolphin-2.9-llama3-8b:ee173688d3b8d9e05a5b910f10fb9bab1e9348963ab224579bb90d9fce3fb00b",
{
input: {
frequency_penalty: 0,
max_tokens: 512,
min_tokens: 0,
presence_penalty: 0,
prompt: "Who are you?",
system_prompt: "The assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.",
temperature: 0.6,
top_k: 50,
top_p: 0.9
}
}
);
// For this model, output is an array of generated text tokens
// (see the "output" field in the example response below).
console.log(output);
To learn more, take a look at the guide on getting started with Node.js.
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_7uw**********************************
This is your API token. Keep it to yourself.
import replicate
Run lucataco/dolphin-2.9-llama3-8b using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
# Run the model via Replicate's Python client; the hash after the colon
# pins an exact model version.
output = replicate.run(
    "lucataco/dolphin-2.9-llama3-8b:ee173688d3b8d9e05a5b910f10fb9bab1e9348963ab224579bb90d9fce3fb00b",
    input={
        "frequency_penalty": 0,
        "max_tokens": 512,
        "min_tokens": 0,
        "presence_penalty": 0,
        "prompt": "Who are you?",
        "system_prompt": "The assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.",
        "temperature": 0.6,
        "top_k": 50,
        "top_p": 0.9
    }
)

# The lucataco/dolphin-2.9-llama3-8b model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/lucataco/dolphin-2.9-llama3-8b/api#output-schema
    print(item, end="")
To learn more, take a look at the guide on getting started with Python.
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_7uw**********************************
This is your API token. Keep it to yourself.
Run lucataco/dolphin-2.9-llama3-8b using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
# Create a prediction via Replicate's HTTP API. The "Prefer: wait" header
# asks the server to hold the connection open and respond when the
# prediction finishes, instead of returning immediately.
curl -s -X POST \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Prefer: wait" \
-d $'{
"version": "lucataco/dolphin-2.9-llama3-8b:ee173688d3b8d9e05a5b910f10fb9bab1e9348963ab224579bb90d9fce3fb00b",
"input": {
"frequency_penalty": 0,
"max_tokens": 512,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Who are you?",
"system_prompt": "The assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.",
"temperature": 0.6,
"top_k": 50,
"top_p": 0.9
}
}' \
https://api.replicate.com/v1/predictions
To learn more, take a look at Replicate’s HTTP API reference docs.
I am Dolphin, a helpful and friendly AI assistant. How can I assist you today?
{
"id": "e0x4vq6bs1rgj0cgdw1byd0ywm",
"model": "lucataco/dolphin-2.9-llama3-8b",
"version": "ee173688d3b8d9e05a5b910f10fb9bab1e9348963ab224579bb90d9fce3fb00b",
"input": {
"frequency_penalty": 0,
"max_tokens": 512,
"min_tokens": 0,
"presence_penalty": 0,
"prompt": "Who are you?",
"system_prompt": "The assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.",
"temperature": 0.6,
"top_k": 50,
"top_p": 0.9
},
"logs": "INFO 07-01 15:57:05 async_llm_engine.py:529] Received request 373796f1566c422ea37a6134ddbdb853: prompt: '<|im_start|>system\\nThe assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.<|im_end|>\\n<|im_start|>user\\nWho are you?<|im_end|>\\n<|im_start|>assistant\\n', sampling_params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.6, top_p=0.9, top_k=50, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[128256], include_stop_str_in_output=False, ignore_eos=False, max_tokens=512, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: None, lora_request: None.\n stdoutGeneration took 1719849348.36sFormatted prompt: <|im_start|>system\nThe assistant is named Dolphin. A helpful and friendly AI assistant, Dolphin avoids discussing the system message unless directly asked about it.<|im_end|>\n<|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\nINFO 07-01 15:57:06 async_llm_engine.py:120] Finished request 373796f1566c422ea37a6134ddbdb853.\n stdout",
"output": [
"I",
" am",
" Dolphin",
",",
" a",
" helpful",
" and",
" friendly",
" AI",
" assistant",
".",
" How",
" can",
" I",
" assist",
" you",
" today",
"?",
""
],
"data_removed": false,
"error": null,
"source": "web",
"status": "succeeded",
"created_at": "2024-07-01T15:56:12.616Z",
"started_at": "2024-07-01T15:57:05.806557Z",
"completed_at": "2024-07-01T15:57:06.365551Z",
"urls": {
"cancel": "https://api.replicate.com/v1/predictions/e0x4vq6bs1rgj0cgdw1byd0ywm/cancel",
"get": "https://api.replicate.com/v1/predictions/e0x4vq6bs1rgj0cgdw1byd0ywm",
"stream": "https://streaming-api.svc.us.c.replicate.net/v1/streams/mj7ysyy3xr3austgjgjmtzzdf2cg4wyt3swt53cvjjoshi54tdra",
"web": "https://replicate.com/p/e0x4vq6bs1rgj0cgdw1byd0ywm"
},
"metrics": {
"predict_time": 0.558993931,
"total_time": 53.749551
}
}