Prediction

johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a

Model: johnnyoshika/llama2-combine-numbers:3d318c90
ID: 27dqpzg7pnrgm0cfhsh9rxr30w
Status: Succeeded
Source: Web
Hardware: A40 (Large)
Total duration: 2.5s
Created: about 1 year ago

Input

debug: false
top_p: 0.95
prompt: What is 10+4?
temperature: 0.7
return_logits: false
max_new_tokens: 128
min_new_tokens: -1
repetition_penalty: 1.15

{
  "debug": false,
  "top_p": 0.95,
  "prompt": "What is 10+4?",
  "temperature": 0.7,
  "return_logits": false,
  "max_new_tokens": 128,
  "min_new_tokens": -1,
  "repetition_penalty": 1.15
}
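
The three predictions on this page use the same sampling settings and differ only in the prompt. A minimal Python sketch of that shared input follows; `DEFAULT_PARAMS` and `build_input` are hypothetical names introduced here for illustration and are not part of Replicate's client library.

```python
import json

# Sampling settings copied from the input JSON above; only the prompt varies.
DEFAULT_PARAMS = {
    "debug": False,
    "top_p": 0.95,
    "temperature": 0.7,
    "return_logits": False,
    "max_new_tokens": 128,
    "min_new_tokens": -1,
    "repetition_penalty": 1.15,
}


def build_input(prompt: str, **overrides) -> dict:
    """Return an input payload for the given prompt (hypothetical helper)."""
    payload = {**DEFAULT_PARAMS, "prompt": prompt}
    payload.update(overrides)  # let a caller change individual settings
    return payload


if __name__ == "__main__":
    print(json.dumps(build_input("What is 10+4?"), indent=2))
```

The resulting dictionary could be passed as the `input` argument of `replicate.run`, as in the Python snippet on this page.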

Install Replicate’s Node.js client library:

npm install replicate

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run johnnyoshika/llama2-combine-numbers using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
  {
    input: {
      debug: false,
      top_p: 0.95,
      prompt: "What is 10+4?",
      temperature: 0.7,
      return_logits: false,
      max_new_tokens: 128,
      min_new_tokens: -1,
      repetition_penalty: 1.15
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Import the client:

import replicate

Run johnnyoshika/llama2-combine-numbers using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    input={
        "debug": False,
        "top_p": 0.95,
        "prompt": "What is 10+4?",
        "temperature": 0.7,
        "return_logits": False,
        "max_new_tokens": 128,
        "min_new_tokens": -1,
        "repetition_penalty": 1.15
    }
)

# The johnnyoshika/llama2-combine-numbers model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/johnnyoshika/llama2-combine-numbers/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.

Run johnnyoshika/llama2-combine-numbers using Replicate’s HTTP API with cURL. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    "input": {
      "debug": false,
      "top_p": 0.95,
      "prompt": "What is 10+4?",
      "temperature": 0.7,
      "return_logits": false,
      "max_new_tokens": 128,
      "min_new_tokens": -1,
      "repetition_penalty": 1.15
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
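
For readers who prefer Python without the client library, the cURL command above can be mirrored with the standard library alone. This is a sketch under assumptions: `build_request` is a hypothetical helper name, and the request is only constructed here, not sent, so that it runs without a token or network access.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = (
    "johnnyoshika/llama2-combine-numbers:"
    "3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a"
)


def build_request(prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the cURL command."""
    body = {
        "version": VERSION,
        "input": {
            "debug": False,
            "top_p": 0.95,
            "prompt": prompt,
            "temperature": 0.7,
            "return_logits": False,
            "max_new_tokens": 128,
            "min_new_tokens": -1,
            "repetition_penalty": 1.15,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",  # same header as the cURL example
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("What is 10+4?", os.environ.get("REPLICATE_API_TOKEN", ""))
    print(req.get_method(), req.full_url)
    # To actually send it (needs a valid token and network access):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["output"])
```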

Output

10 + 4 = 14

Generated in 0.3 seconds

Prediction

johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a

Model: johnnyoshika/llama2-combine-numbers:3d318c90
ID: 8vvv4hchj1rgg0cfhsh9bjcf1r
Status: Succeeded
Source: Web
Hardware: A40 (Large)
Total duration: 0.2s
Created: about 1 year ago

Input

debug: false
top_p: 0.95
prompt: What is 398 + 34?
temperature: 0.7
return_logits: false
max_new_tokens: 128
min_new_tokens: -1
repetition_penalty: 1.15

{
  "debug": false,
  "top_p": 0.95,
  "prompt": "What is 398 + 34?",
  "temperature": 0.7,
  "return_logits": false,
  "max_new_tokens": 128,
  "min_new_tokens": -1,
  "repetition_penalty": 1.15
}

Run johnnyoshika/llama2-combine-numbers using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
  {
    input: {
      debug: false,
      top_p: 0.95,
      prompt: "What is 398 + 34?",
      temperature: 0.7,
      return_logits: false,
      max_new_tokens: 128,
      min_new_tokens: -1,
      repetition_penalty: 1.15
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Run johnnyoshika/llama2-combine-numbers using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    input={
        "debug": False,
        "top_p": 0.95,
        "prompt": "What is 398 + 34?",
        "temperature": 0.7,
        "return_logits": False,
        "max_new_tokens": 128,
        "min_new_tokens": -1,
        "repetition_penalty": 1.15
    }
)

# The johnnyoshika/llama2-combine-numbers model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/johnnyoshika/llama2-combine-numbers/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.

Run johnnyoshika/llama2-combine-numbers using Replicate’s HTTP API with cURL. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    "input": {
      "debug": false,
      "top_p": 0.95,
      "prompt": "What is 398 + 34?",
      "temperature": 0.7,
      "return_logits": false,
      "max_new_tokens": 128,
      "min_new_tokens": -1,
      "repetition_penalty": 1.15
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

Output

39834

Generated in 0.2 seconds
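
The output of this prediction is the two operands written side by side (398 followed by 34) rather than their sum, which would be consistent with the model's name. The Python sketch below illustrates that reading; `combine_numbers` and `sum_numbers` are hypothetical reference functions written for comparison and say nothing about how the model works internally.

```python
import re


def combine_numbers(prompt: str) -> str:
    """Concatenate the integers found in the prompt, in order of appearance."""
    return "".join(re.findall(r"\d+", prompt))


def sum_numbers(prompt: str) -> int:
    """Add the integers found in the prompt, for comparison."""
    return sum(int(n) for n in re.findall(r"\d+", prompt))


if __name__ == "__main__":
    prompt = "What is 398 + 34?"
    print(combine_numbers(prompt))  # 39834, the output shown for this prediction
    print(sum_numbers(prompt))      # 432, the arithmetic sum
```

The model does not always follow this pattern: the prompt "What is 10+4?" on this page produced "10 + 4 = 14".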

Prediction

johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a

Model: johnnyoshika/llama2-combine-numbers:3d318c90
ID: pafsd7rxq9rgp0cfhshrgqct10
Status: Succeeded
Source: Web
Hardware: A40 (Large)
Total duration: 0.4s
Created: about 1 year ago

Input

debug: false
top_p: 0.95
prompt: What is 12+98?
temperature: 0.7
return_logits: false
max_new_tokens: 128
min_new_tokens: -1
repetition_penalty: 1.15

{
  "debug": false,
  "top_p": 0.95,
  "prompt": "What is 12+98?",
  "temperature": 0.7,
  "return_logits": false,
  "max_new_tokens": 128,
  "min_new_tokens": -1,
  "repetition_penalty": 1.15
}

Run johnnyoshika/llama2-combine-numbers using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
  {
    input: {
      debug: false,
      top_p: 0.95,
      prompt: "What is 12+98?",
      temperature: 0.7,
      return_logits: false,
      max_new_tokens: 128,
      min_new_tokens: -1,
      repetition_penalty: 1.15
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Run johnnyoshika/llama2-combine-numbers using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    input={
        "debug": False,
        "top_p": 0.95,
        "prompt": "What is 12+98?",
        "temperature": 0.7,
        "return_logits": False,
        "max_new_tokens": 128,
        "min_new_tokens": -1,
        "repetition_penalty": 1.15
    }
)

# The johnnyoshika/llama2-combine-numbers model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/johnnyoshika/llama2-combine-numbers/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.

Run johnnyoshika/llama2-combine-numbers using Replicate’s HTTP API with cURL. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "johnnyoshika/llama2-combine-numbers:3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a",
    "input": {
      "debug": false,
      "top_p": 0.95,
      "prompt": "What is 12+98?",
      "temperature": 0.7,
      "return_logits": false,
      "max_new_tokens": 128,
      "min_new_tokens": -1,
      "repetition_penalty": 1.15
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

Output

1298 = 12 + (98 * 1)

{
  "completed_at": "2024-05-19T01:06:51.885516Z",
  "created_at": "2024-05-19T01:06:51.450000Z",
  "data_removed": false,
  "error": null,
  "id": "pafsd7rxq9rgp0cfhshrgqct10",
  "input": {
    "debug": false,
    "top_p": 0.95,
    "prompt": "What is 12+98?",
    "temperature": 0.7,
    "return_logits": false,
    "max_new_tokens": 128,
    "min_new_tokens": -1,
    "repetition_penalty": 1.15
  },
  "logs": "Your formatted prompt is:\nWhat is 12+98?\ncorrect lora is already loaded\nOverall initialize_peft took 0.000\nExllama: False\nINFO 05-19 01:06:51 async_llm_engine.py:371] Received request 0: prompt: 'What is 12+98?', sampling params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=1.0, temperature=0.7, top_p=0.95, top_k=50, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['</s>'], ignore_eos=False, max_tokens=128, logprobs=None, skip_special_tokens=True), prompt token ids: None.\nINFO 05-19 01:06:51 llm_engine.py:631] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%\nINFO 05-19 01:06:51 async_llm_engine.py:111] Finished request 0.\nhostname: model-hp-77dde5d6c56598691b9008f7d123a18d-74856449d8-4m969",
  "metrics": {
    "predict_time": 0.429176,
    "total_time": 0.435516
  },
  "output": [
    "\n",
    "1",
    "2",
    "9",
    "8",
    " =",
    " ",
    "1",
    "2",
    " +",
    " (",
    "9",
    "8",
    " *",
    " ",
    "1",
    ")",
    ""
  ],
  "started_at": "2024-05-19T01:06:51.456340Z",
  "status": "succeeded",
  "urls": {
    "stream": "https://streaming-api.svc.us.c.replicate.net/v1/streams/t74g32xoqbenh7nqulppfyslblazguycdg4aek4g23axn642fvfq",
    "get": "https://api.replicate.com/v1/predictions/pafsd7rxq9rgp0cfhshrgqct10",
    "cancel": "https://api.replicate.com/v1/predictions/pafsd7rxq9rgp0cfhshrgqct10/cancel"
  },
  "version": "3d318c904899fa396a3255078da6a56c0d4f0b7837550159f196eb05932aae0a"
}
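
In the JSON above, `output` is a list of token strings rather than a single string. A minimal Python sketch of reassembling the displayed text from that list; `join_tokens` is a hypothetical helper name, and the token list is copied from the response above.

```python
def join_tokens(tokens: list[str]) -> str:
    """Concatenate streamed token strings into the full completion."""
    return "".join(tokens)


# Token list copied from the "output" field of the response above.
tokens = [
    "\n", "1", "2", "9", "8", " =", " ", "1", "2",
    " +", " (", "9", "8", " *", " ", "1", ")", "",
]

text = join_tokens(tokens)
print(text.strip())  # 1298 = 12 + (98 * 1)
```

The leading newline token is why the call to `strip()` is needed to match the output as displayed on the page.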

Generated in 0.4 seconds
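
The `metrics` values in this response appear to follow from its timestamps: `predict_time` matches `completed_at` minus `started_at`, and `total_time` matches `completed_at` minus `created_at`. The Python sketch below checks that arithmetic for this one response; it is an observation about these numbers, not a documented guarantee, and `parse_ts` is a hypothetical helper.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' (UTC)."""
    # Python versions before 3.11 do not accept 'Z' in fromisoformat.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


# Timestamps copied from the JSON response above.
created = parse_ts("2024-05-19T01:06:51.450000Z")
started = parse_ts("2024-05-19T01:06:51.456340Z")
completed = parse_ts("2024-05-19T01:06:51.885516Z")

predict_time = (completed - started).total_seconds()
total_time = (completed - created).total_seconds()

print(f"predict_time = {predict_time:.6f}")  # 0.429176
print(f"total_time   = {total_time:.6f}")    # 0.435516
```

Rounded to one decimal place, `predict_time` gives the 0.4 seconds reported for this prediction.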
