Example input:
{
  "frequency_penalty": 0,
  "max_tokens": 1024,
  "presence_penalty": 0,
  "prompt": "Hello, Llama!",
  "temperature": 0.6,
  "top_p": 1
}

Install Replicate's Node.js client library:
npm install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_f2g**********************************
This is your API token. Keep it to yourself.
Import and set up the client:
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
Run meta/llama-4-maverick-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const input = {
  frequency_penalty: 0,
  max_tokens: 1024,
  presence_penalty: 0,
  prompt: "Hello, Llama!",
  temperature: 0.6,
  top_p: 1
};

// Stream the model's output token by token as it is generated.
for await (const event of replicate.stream("meta/llama-4-maverick-instruct", { input })) {
  process.stdout.write(event.toString());
}
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_f2g**********************************
This is your API token. Keep it to yourself.
Import the client:
import replicate
Run meta/llama-4-maverick-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
# The meta/llama-4-maverick-instruct model can stream output as it's running.
for event in replicate.stream(
    "meta/llama-4-maverick-instruct",
    input={
        "frequency_penalty": 0,
        "max_tokens": 1024,
        "presence_penalty": 0,
        "prompt": "Hello, Llama!",
        "temperature": 0.6,
        "top_p": 1,
    },
):
    print(str(event), end="")
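
If you don't need to stream tokens as they are generated, you can also wait for the full result. This is a minimal sketch assuming the client's standard replicate.run call, which for language models like this one returns the output as a sequence of text chunks you join yourself:

import replicate

# replicate.run blocks until the prediction finishes and returns its output.
output = replicate.run(
    "meta/llama-4-maverick-instruct",
    input={
        "prompt": "Hello, Llama!",
        "max_tokens": 1024,
        "temperature": 0.6,
        "top_p": 1,
    },
)

# Join the chunks to recover the full completion text.
print("".join(output))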
To learn more, take a look at the guide on getting started with Python.
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=r8_f2g**********************************
This is your API token. Keep it to yourself.
Run meta/llama-4-maverick-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "input": {
      "frequency_penalty": 0,
      "max_tokens": 1024,
      "presence_penalty": 0,
      "prompt": "Hello, Llama!",
      "temperature": 0.6,
      "top_p": 1
    }
  }' \
  https://api.replicate.com/v1/models/meta/llama-4-maverick-instruct/predictions
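
The same request can be issued from any HTTP client. Here is a sketch using Python's requests library (an assumption of this example, not one of Replicate's own snippets); the Prefer: wait header asks the API to hold the connection open until the prediction completes, so the response already contains the output:

import os
import requests

# Create a prediction and, thanks to "Prefer: wait", block until it finishes.
resp = requests.post(
    "https://api.replicate.com/v1/models/meta/llama-4-maverick-instruct/predictions",
    headers={
        "Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}",
        "Content-Type": "application/json",
        "Prefer": "wait",
    },
    json={
        "input": {
            "frequency_penalty": 0,
            "max_tokens": 1024,
            "presence_penalty": 0,
            "prompt": "Hello, Llama!",
            "temperature": 0.6,
            "top_p": 1,
        }
    },
)
prediction = resp.json()
print("".join(prediction["output"]))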
To learn more, take a look at Replicate’s HTTP API reference docs.
Example output:
Hello! It's nice to meet you. Is there something I can help you with or would you like to chat?
The full prediction response looks like this:
{
  "id": "8vn5mz0xbsrme0cp10vv3zd150",
  "model": "meta/llama-4-maverick-instruct",
  "version": "hidden",
  "input": {
    "frequency_penalty": 0,
    "max_tokens": 1024,
    "presence_penalty": 0,
    "prompt": "Hello, Llama!",
    "temperature": 0.6,
    "top_p": 1
  },
  "logs": "Prompt: Hello, Llama!\nInput token count: 5\nOutput token count: 24\nTTFT: 0.39s\nTokens per second: 40.51\nTotal time: 0.59s",
  "output": [
    "",
    "Hello!",
    " It's nice to meet you.",
    " Is there something I can help",
    " you with or would you like",
    " to chat?",
    ""
  ],
  "data_removed": false,
  "error": null,
  "source": "web",
  "status": "succeeded",
  "created_at": "2025-04-05T23:09:11.902Z",
  "started_at": "2025-04-05T23:09:11.908871Z",
  "completed_at": "2025-04-05T23:09:12.502307Z",
  "urls": {
    "cancel": "https://api.replicate.com/v1/predictions/8vn5mz0xbsrme0cp10vv3zd150/cancel",
    "get": "https://api.replicate.com/v1/predictions/8vn5mz0xbsrme0cp10vv3zd150",
    "stream": "https://stream-b.svc.ric1.c.replicate.net/v1/streams/fajhmjcarci7xzk7wmapfnqqd3p4sptbb7kd3uijjerjnkgqdteq",
    "web": "https://replicate.com/p/8vn5mz0xbsrme0cp10vv3zd150"
  },
  "metrics": {
    "input_token_count": 5,
    "output_token_count": 24,
    "predict_time": 0.593435972,
    "time_to_first_token": 0.007776063000000001,
    "tokens_per_second": 40.50421612689237,
    "total_time": 0.600307
  }
}
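
Two details of this response are worth noting: output is a list of streamed chunks rather than a single string, and the reported tokens_per_second appears consistent with tokens generated after the first token arrived. A short sketch, assuming the JSON above has been parsed into a prediction dict (e.g. via resp.json()):

# Concatenate the streamed chunks to reproduce the text shown above.
text = "".join(prediction["output"])

m = prediction["metrics"]
# 24 / (0.600307 - 0.007776) ≈ 40.504, matching the reported tokens_per_second.
rate = m["output_token_count"] / (m["total_time"] - m["time_to_first_token"])
assert abs(rate - m["tokens_per_second"]) < 1e-3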