Prediction

Model

moinnadeem/codellama-34b-instruct-vllm:c97256d5

ID

dlmoz6lb6vjarv2ydqiecszm2e

Status

Succeeded

Source

Web

Hardware

A100 (80GB)

Total duration

2m 14s

Created

about 2 years ago by moinnadeem

Webhook

–

Input

prompt: Write a Python function downloads a html page from the internet and extracts the text of the h1 elements
system_prompt: You are a helpful assistant.
max_new_tokens: 509
min_new_tokens: -1
temperature: 0.75
top_p: 0.9
top_k: 50
debug: false

{
  "debug": false,
  "max_new_tokens": 509,
  "min_new_tokens": -1,
  "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
  "system_prompt": "You are a helpful assistant.",
  "temperature": 0.75,
  "top_k": 50,
  "top_p": 0.9
}

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=r8_Dda**********************************

This is your API token. Keep it to yourself.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run moinnadeem/codellama-34b-instruct-vllm using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "moinnadeem/codellama-34b-instruct-vllm:c97256d55db03f10b73fd058087165a47821eb2f776c851901da46293e952153",
  {
    input: {
      debug: false,
      max_new_tokens: 509,
      min_new_tokens: -1,
      prompt: "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
      system_prompt: "You are a helpful assistant.",
      temperature: 0.75,
      top_k: 50,
      top_p: 0.9
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=r8_Dda**********************************

This is your API token. Keep it to yourself.

Import the client:

import replicate

Run moinnadeem/codellama-34b-instruct-vllm using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "moinnadeem/codellama-34b-instruct-vllm:c97256d55db03f10b73fd058087165a47821eb2f776c851901da46293e952153",
    input={
        "debug": False,
        "max_new_tokens": 509,
        "min_new_tokens": -1,
        "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
        "system_prompt": "You are a helpful assistant.",
        "temperature": 0.75,
        "top_k": 50,
        "top_p": 0.9
    }
)

# The moinnadeem/codellama-34b-instruct-vllm model can stream output as it's running.
# For this model, replicate.run returns an iterator, so you can loop over the output as it arrives.
for item in output:
    # https://replicate.com/moinnadeem/codellama-34b-instruct-vllm/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.
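The loop above prints each streamed chunk as it arrives. If the complete text is needed as a single string, the chunks can simply be concatenated. The following is a minimal sketch: `collect_stream` is a hypothetical helper, and `sample_chunks` is a shortened stand-in for the iterator returned by `replicate.run`, modeled on the chunks in the "output" array of this prediction.

```python
from typing import Iterable


def collect_stream(chunks: Iterable[str]) -> str:
    """Join streamed text chunks into one string (hypothetical helper)."""
    # str() guards against non-string chunk objects; plain strings pass through.
    return "".join(str(chunk) for chunk in chunks)


# Shortened stand-in for the iterator returned by replicate.run(...).
sample_chunks = ["", " ```", "\n", "import", " requests", "\n", "```", ""]

full_text = collect_stream(sample_chunks)
print(full_text)
```

Collecting the chunks this way gives up incremental display, so it fits best when the result is post-processed rather than shown live.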

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=r8_Dda**********************************

This is your API token. Keep it to yourself.

Run moinnadeem/codellama-34b-instruct-vllm using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "moinnadeem/codellama-34b-instruct-vllm:c97256d55db03f10b73fd058087165a47821eb2f776c851901da46293e952153",
    "input": {
      "debug": false,
      "max_new_tokens": 509,
      "min_new_tokens": -1,
      "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
      "system_prompt": "You are a helpful assistant.",
      "temperature": 0.75,
      "top_k": 50,
      "top_p": 0.9
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
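For readers who want to issue the same HTTP call from Python without the client library, the curl command above can be mirrored with the standard library. This is a sketch under assumptions: `build_prediction_request` is a hypothetical helper, the token shown is a placeholder, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input):
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Prefer": "wait",  # same header the curl example sends
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_prediction_request(
    token="r8_placeholder",  # placeholder; read the real token from REPLICATE_API_TOKEN
    version="moinnadeem/codellama-34b-instruct-vllm:c97256d55db03f10b73fd058087165a47821eb2f776c851901da46293e952153",
    model_input={
        "debug": False,
        "max_new_tokens": 509,
        "min_new_tokens": -1,
        "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
        "system_prompt": "You are a helpful assistant.",
        "temperature": 0.75,
        "top_k": 50,
        "top_p": 0.9,
    },
)

# To send it: urllib.request.urlopen(request), then json.load() the response body.
print(request.get_method(), request.full_url)
```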

Output

```python
import requests
from bs4 import BeautifulSoup

def download_h1_text(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    h1_elements = soup.find_all('h1')
    text = [element.text for element in h1_elements]
    return text
```
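The generated function depends on the third-party packages `requests` and `beautifulsoup4`, and it sets no timeout on the download. As a point of comparison, the same task can be sketched with only the Python standard library. Everything below is an illustrative assumption rather than model output: the names `H1TextParser`, `extract_h1_text`, and `download_h1_text_stdlib` are hypothetical, and unlike the generated code this version strips surrounding whitespace from each heading.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class H1TextParser(HTMLParser):
    """Collects the text content of every <h1> element."""

    def __init__(self):
        super().__init__()
        self._depth = 0     # greater than 0 while inside an <h1>
        self._buffer = []   # text fragments of the current <h1>
        self.headings = []  # finished heading strings

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "h1" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.headings.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags such as <em> is kept as well.
        if self._depth > 0:
            self._buffer.append(data)


def extract_h1_text(html):
    """Return the text of all <h1> elements in an HTML string."""
    parser = H1TextParser()
    parser.feed(html)
    parser.close()
    return parser.headings


def download_h1_text_stdlib(url, timeout=10.0):
    """Download a page and return its <h1> texts; urlopen raises on HTTP errors."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_h1_text(html)
```

Separating `extract_h1_text` from the download step makes the parsing logic easy to check against a literal HTML string without network access.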

{
  "id": "dlmoz6lb6vjarv2ydqiecszm2e",
  "model": "moinnadeem/codellama-34b-instruct-vllm",
  "version": "c97256d55db03f10b73fd058087165a47821eb2f776c851901da46293e952153",
  "input": {
    "debug": false,
    "max_new_tokens": 509,
    "min_new_tokens": -1,
    "prompt": "Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
    "system_prompt": "You are a helpful assistant.",
    "temperature": 0.75,
    "top_k": 50,
    "top_p": 0.9
  },
  "logs": "Your formatted prompt is:\n[INST] <<SYS>>\nYou are a helpful assistant.\n<</SYS>>\nWrite a Python function downloads a html page from the internet and extracts the text of the h1 elements [/INST]\nINFO 09-28 06:31:59 async_llm_engine.py:328] Received request 0: prompt: '[INST] <<SYS>>\\nYou are a helpful assistant.\\n<</SYS>>\\n\\nWrite a Python function downloads a html page from the internet and extracts the text of the h1 elements [/INST]', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.75, top_p=0.9, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=509, logprobs=None), prompt token ids: None.\nINFO 09-28 06:31:59 llm_engine.py:613] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.3%, CPU KV cache usage: 0.0%\nINFO 09-28 06:32:03 async_llm_engine.py:100] Finished request 0.\nGenerated text:  ```\nimport requests\nfrom bs4 import BeautifulSoup\ndef download_h1_text(url):\nresponse = requests.get(url)\nsoup = BeautifulSoup(response.content, 'html.parser')\nh1_elements = soup.find_all('h1')\ntext = [element.text for element in h1_elements]\nreturn text\n```\nhostname: model-c97256d5-c3f3ad7eed055505-gpu-a100-7b6bbcfb6c-hkfd7",
  "output": [
    "",
    " ```",
    "\n",
    "import",
    " requests",
    "\n",
    "from",
    " bs",
    "4",
    " import",
    " Beautiful",
    "S",
    "oup",
    "\n",
    "\n",
    "def",
    " download",
    "_",
    "h",
    "1",
    "_",
    "text",
    "(",
    "url",
    "):",
    "\n",
    "   ",
    " response",
    " =",
    " requests",
    ".",
    "get",
    "(",
    "url",
    ")",
    "\n",
    "   ",
    " soup",
    " =",
    " Beautiful",
    "S",
    "oup",
    "(",
    "response",
    ".",
    "content",
    ",",
    " '",
    "html",
    ".",
    "parser",
    "')",
    "\n",
    "   ",
    " h",
    "1",
    "_",
    "elements",
    " =",
    " soup",
    ".",
    "find",
    "_",
    "all",
    "('",
    "h",
    "1",
    "')",
    "\n",
    "   ",
    " text",
    " =",
    " [",
    "element",
    ".",
    "text",
    " for",
    " element",
    " in",
    " h",
    "1",
    "_",
    "elements",
    "]",
    "\n",
    "   ",
    " return",
    " text",
    "\n",
    "```",
    ""
  ],
  "data_removed": false,
  "error": null,
  "source": "web",
  "status": "succeeded",
  "created_at": "2023-09-28T06:29:49.48363Z",
  "started_at": "2023-09-28T06:31:59.385606Z",
  "completed_at": "2023-09-28T06:32:03.676481Z",
  "urls": {
    "cancel": "https://api.replicate.com/v1/predictions/dlmoz6lb6vjarv2ydqiecszm2e/cancel",
    "get": "https://api.replicate.com/v1/predictions/dlmoz6lb6vjarv2ydqiecszm2e"
  },
  "metrics": {
    "predict_time": 4.290875,
    "total_time": 134.192851
  }
}
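The "logs" field shows how the model wraps the inputs before generation: the system prompt sits inside `<<SYS>>` markers and the whole turn inside `[INST] ... [/INST]`. The sketch below reproduces that string from the two input fields. It is inferred from the log line only, so the function name `format_prompt` and the exact whitespace are assumptions, not the model's actual source.

```python
def format_prompt(prompt, system_prompt):
    """Rebuild the formatted prompt as it appears in the engine log (assumed template)."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"


formatted = format_prompt(
    prompt="Write a Python function downloads a html page from the internet and extracts the text of the h1 elements",
    system_prompt="You are a helpful assistant.",
)
print(formatted)
```

The blank line after `<</SYS>>` follows the `Received request 0` log entry, which shows two newlines at that position.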

Generated in

4.3 seconds
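The 4.3 seconds reported here matches "predict_time", while the total duration of 2m 14s matches "total_time"; the difference is the time between creation and start of the prediction. The sketch below derives all three values from the timestamps in the JSON above. `summarize_timing` is a hypothetical helper, and it assumes every timestamp has a fractional-seconds part and a trailing `Z`, as the ones in this prediction do.

```python
from datetime import datetime

# Matches timestamps such as "2023-09-28T06:29:49.48363Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value):
    """Parse one of the prediction's UTC timestamps into a naive datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def summarize_timing(created_at, started_at, completed_at):
    """Split a prediction's lifetime into wait, predict, and total seconds."""
    created = parse_timestamp(created_at)
    started = parse_timestamp(started_at)
    completed = parse_timestamp(completed_at)
    return {
        "wait_seconds": (started - created).total_seconds(),
        "predict_seconds": (completed - started).total_seconds(),
        "total_seconds": (completed - created).total_seconds(),
    }


timing = summarize_timing(
    created_at="2023-09-28T06:29:49.48363Z",
    started_at="2023-09-28T06:31:59.385606Z",
    completed_at="2023-09-28T06:32:03.676481Z",
)
print(timing)
```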
