Official

meta / meta-llama-3.1-405b-instruct

Meta's flagship 405-billion-parameter language model, fine-tuned for chat completions

  • Public
  • 5.7M runs
  • Priced per token
  • Prediction

    meta/meta-llama-3.1-405b-instruct
    ID
    zjq6szssxdrm20cgw0fsntgvbg
    Status
    Succeeded
    Source
    Web
    Total duration
    2.86 seconds
    Created
    2024-07-23 15:04:21 UTC
    by @meta

    Input

    Run meta/meta-llama-3.1-405b-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

    curl -s -X POST \
      -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
      -H "Content-Type: application/json" \
      -H "Prefer: wait" \
      -d $'{
        "input": {
          "top_p": 0.9,
          "prompt": "Tina has one brother and one sister. How many sisters do Tina\'s siblings have?",
          "max_tokens": 1024,
          "min_tokens": 0,
          "temperature": 0.6,
          "system_prompt": "You are a helpful assistant.",
          "presence_penalty": 0,
          "frequency_penalty": 0
        }
      }' \
      https://api.replicate.com/v1/models/meta/meta-llama-3.1-405b-instruct/predictions

    To learn more, take a look at Replicate’s HTTP API reference docs.
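The same request can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `build_request` and `run_prediction` are hypothetical, and the endpoint, headers, and input fields are copied from the curl command above.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
API_URL = (
    "https://api.replicate.com/v1/models/"
    "meta/meta-llama-3.1-405b-instruct/predictions"
)


def build_request(token: str, prompt: str,
                  system_prompt: str = "You are a helpful assistant.") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example.

    `build_request` is a hypothetical helper name; the input fields mirror
    the JSON body of the curl command.
    """
    payload = {
        "input": {
            "top_p": 0.9,
            "prompt": prompt,
            "max_tokens": 1024,
            "min_tokens": 0,
            "temperature": 0.6,
            "system_prompt": system_prompt,
            "presence_penalty": 0,
            "frequency_penalty": 0,
        }
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",  # same header the curl example sends
        },
        method="POST",
    )


def run_prediction(token: str, prompt: str) -> dict:
    """Send the request and return the decoded JSON prediction (hypothetical helper)."""
    with urllib.request.urlopen(build_request(token, prompt)) as response:
        return json.load(response)
```

Calling `run_prediction(os.environ["REPLICATE_API_TOKEN"], "...")` would perform the network call; `build_request` alone can be inspected offline.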

    Output

    {
      "completed_at": "2024-07-23T15:04:24.090090Z",
      "created_at": "2024-07-23T15:04:21.227000Z",
      "data_removed": false,
      "error": null,
      "id": "zjq6szssxdrm20cgw0fsntgvbg",
      "input": {
        "top_p": 0.9,
        "prompt": "Tina has one brother and one sister. How many sisters do Tina's siblings have?",
        "max_tokens": 1024,
        "min_tokens": 0,
        "temperature": 0.6,
        "system_prompt": "You are a helpful assistant.",
        "presence_penalty": 0,
        "frequency_penalty": 0
      },
      "logs": "INFO 07-23 15:04:21 async_llm_engine.py:173] Added request 3cd670c9b87b469ea9721ddfbcc6b67a.\nINFO 07-23 15:04:21 metrics.py:396] Avg prompt throughput: 7.8 tokens/s, Avg generation throughput: 17.7 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.3%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:04:24 async_llm_engine.py:140] Finished request 3cd670c9b87b469ea9721ddfbcc6b67a.\nGeneration took 1721746720.08sFormatted prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nTina has one brother and one sister. How many sisters do Tina's siblings have?<|eot_id|><|start_header_id|>assistant<|end_header_id|>",
      "metrics": {
        "batch_size": 1.000229398266064,
        "total_time": 2.86309,
        "input_token_count": 26,
        "tokens_per_second": 29.74134035953616,
        "output_token_count": 84,
        "predict_time_share": 2.8200430870056152,
        "predict_time": 2.826425468,
        "time_to_first_token": 0.112068943
      },
      "output": [
        "T",
        "ina",
        " has",
        " one",
        " brother",
        " and",
        " one",
        " sister",
        ".",
        " From",
        " the",
        " brother",
        "'s",
        " perspective",
        ",",
        " he",
        " has",
        " one",
        " sister",
        ",",
        " Tina",
        ",",
        " and",
        " also",
        " the",
        " other",
        " sister",
        ".",
        " So",
        ",",
        " Tina",
        "'s",
        " brother",
        " has",
        " ",
        "2",
        " sisters",
        ".\n\n",
        "From",
        " the",
        " sister",
        "'s",
        " perspective",
        ",",
        " she",
        " also",
        " has",
        " one",
        " sister",
        ",",
        " Tina",
        ".",
        " \n\n",
        "So",
        ",",
        " Tina",
        "'s",
        " siblings",
        " have",
        " a",
        " total",
        " of",
        " ",
        "2",
        " sisters",
        " (",
        "from",
        " the",
        " brother",
        "'s",
        " perspective",
        ")",
        " and",
        " ",
        "1",
        " sister",
        " (",
        "from",
        " the",
        " sister",
        "'s",
        " perspective",
        ").",
        ""
      ],
      "started_at": "2024-07-23T15:04:21.263665Z",
      "status": "succeeded",
      "urls": {
        "stream": "https://streaming-api.svc.sea.v.replicate.net/v1/streams/6zoqtc6qgpxby7xzauee36qv6bktfaxugelxfqs6ifgg6q4s77sa",
        "get": "https://api.replicate.com/v1/predictions/zjq6szssxdrm20cgw0fsntgvbg",
        "cancel": "https://api.replicate.com/v1/predictions/zjq6szssxdrm20cgw0fsntgvbg/cancel"
      },
      "version": "hidden"
    }
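The `output` field arrives as a list of token strings, not as a single string. A minimal sketch of reassembling the text, assuming the prediction has already been decoded into a Python dict (`join_output` is a hypothetical helper name, and `sample` is an abbreviated excerpt of the response above):

```python
def join_output(prediction: dict) -> str:
    """Concatenate the streamed token strings into the full completion text."""
    # `output` may be missing or null if the prediction has not produced text.
    return "".join(prediction.get("output") or [])


# Abbreviated excerpt of the prediction shown above (first sentence only).
sample = {
    "status": "succeeded",
    "output": ["T", "ina", " has", " one", " brother",
               " and", " one", " sister", ".", ""],
}

print(join_output(sample))  # Tina has one brother and one sister.
```

Tokens carry their own leading spaces and newlines, so they are joined with an empty separator.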
    Input tokens
    26
    Output tokens
    84
    Tokens per second
    29.74 tokens / second
    Time to first token
    0.112 seconds
  • Prediction

    meta/meta-llama-3.1-405b-instruct
    ID
    jzxx22twb9rm20cgw0przfgqzm
    Status
    Succeeded
    Source
    Web
    Total duration
    25.02 seconds
    Created
    2024-07-23 15:19:47 UTC
    by @meta

    Input

    Run meta/meta-llama-3.1-405b-instruct using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

    curl -s -X POST \
      -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
      -H "Content-Type: application/json" \
      -H "Prefer: wait" \
      -d $'{
        "input": {
          "top_p": 0.9,
          "prompt": "Invent a stack-based Turing-complete programming language with minimal syntax, write a PLY parser for it, a language runtime, and show some examples.",
          "max_tokens": 1024,
          "min_tokens": 0,
          "temperature": 0.6,
          "system_prompt": "You are an expert programmer, talking to another expert programmer. Be terse.",
          "presence_penalty": 0,
          "frequency_penalty": 0
        }
      }' \
      https://api.replicate.com/v1/models/meta/meta-llama-3.1-405b-instruct/predictions

    To learn more, take a look at Replicate’s HTTP API reference docs.

    Output

    {
      "completed_at": "2024-07-23T15:20:12.568195Z",
      "created_at": "2024-07-23T15:19:47.546000Z",
      "data_removed": false,
      "error": null,
      "id": "jzxx22twb9rm20cgw0przfgqzm",
      "input": {
        "top_p": 0.9,
        "prompt": "Invent a stack-based Turing-complete programming language with minimal syntax, write a PLY parser for it, a language runtime, and show some examples.",
        "max_tokens": 1024,
        "min_tokens": 0,
        "temperature": 0.6,
        "system_prompt": "You are an expert programmer, talking to another expert programmer. Be terse.",
        "presence_penalty": 0,
        "frequency_penalty": 0
      },
      "logs": "INFO 07-23 15:19:47 async_llm_engine.py:173] Added request 997e070bd8ea42f798e91bc890de6c56.\nINFO 07-23 15:19:47 metrics.py:396] Avg prompt throughput: 0.2 tokens/s, Avg generation throughput: 0.3 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.3%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:19:52 metrics.py:396] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 30.1 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 1.0%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:19:57 metrics.py:396] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 30.1 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 1.7%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:20:02 metrics.py:396] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 30.1 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 2.4%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:20:07 metrics.py:396] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 30.1 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 3.1%, CPU KV cache usage: 0.0%.\nINFO 07-23 15:20:12 async_llm_engine.py:140] Finished request 997e070bd8ea42f798e91bc890de6c56.\nGeneration took 1721745335.56sFormatted prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are an expert programmer, talking to another expert programmer. Be terse.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nInvent a stack-based Turing-complete programming language with minimal syntax, write a PLY parser for it, a language runtime, and show some examples.<|eot_id|><|start_header_id|>assistant<|end_header_id|>",
      "metrics": {
        "batch_size": 2.000051610448683,
        "total_time": 25.022195,
        "input_token_count": 45,
        "tokens_per_second": 30.060659635637204,
        "output_token_count": 751,
        "predict_time_share": 12.490856170654297,
        "predict_time": 24.985711232,
        "time_to_first_token": 0.112341854
      },
      "output": [
        "**",
        "Language",
        " Definition",
        "**\n\n",
        "Name",
        ":",
        " `",
        "ST",
        "K",
        "`\n\n",
        "*",
        " `",
        "push",
        "`",
        " instruction",
        ":",
        " `",
        "p",
        "`\n",
        "*",
        " `",
        "pop",
        "`",
        " instruction",
        ":",
        " `",
        "o",
        "`\n",
        "*",
        " `",
        "add",
        "`",
        " instruction",
        ":",
        " `",
        "+",
        "`\n",
        "*",
        " `",
        "sub",
        "`",
        " instruction",
        ":",
        " `-",
        "`\n",
        "*",
        " `",
        "mul",
        "`",
        " instruction",
        ":",
        " `",
        "*",
        "`\n",
        "*",
        " `",
        "div",
        "`",
        " instruction",
        ":",
        " `/",
        "`\n",
        "*",
        " `",
        "jump",
        "`",
        " instruction",
        ":",
        " `",
        "j",
        "`\n",
        "*",
        " `",
        "halt",
        "`",
        " instruction",
        ":",
        " `",
        "h",
        "`\n\n",
        "**",
        "PLY",
        " Parser",
        "**\n\n",
        "```",
        "python",
        "\n",
        "import",
        " ply",
        ".lex",
        " as",
        " lex",
        "\n",
        "import",
        " ply",
        ".y",
        "acc",
        " as",
        " yacc",
        "\n\n",
        "#",
        " Lexer",
        "\n",
        "tokens",
        " =",
        " ['",
        "P",
        "USH",
        "',",
        " '",
        "POP",
        "',",
        " '",
        "ADD",
        "',",
        " '",
        "SUB",
        "',",
        " '",
        "M",
        "UL",
        "',",
        " '",
        "DIV",
        "',",
        " '",
        "J",
        "UMP",
        "',",
        " '",
        "HAL",
        "T",
        "',",
        " '",
        "NUMBER",
        "']\n\n",
        "t",
        "_PUSH",
        " =",
        " r",
        "'",
        "p",
        "'\n",
        "t",
        "_POP",
        " =",
        " r",
        "'o",
        "'\n",
        "t",
        "_ADD",
        " =",
        " r",
        "'\\",
        "+",
        "'\n",
        "t",
        "_SUB",
        " =",
        " r",
        "'-",
        "'\n",
        "t",
        "_MUL",
        " =",
        " r",
        "'\\",
        "*",
        "'\n",
        "t",
        "_DIV",
        " =",
        " r",
        "'/",
        "'\n",
        "t",
        "_JUMP",
        " =",
        " r",
        "'",
        "j",
        "'\n",
        "t",
        "_H",
        "ALT",
        " =",
        " r",
        "'h",
        "'\n",
        "t",
        "_NUMBER",
        " =",
        " r",
        "'\\",
        "d",
        "+",
        "'\n\n",
        "#",
        " Parser",
        "\n",
        "def",
        " p",
        "_program",
        "(p",
        "):\n",
        "   ",
        " '''",
        "program",
        " :",
        " instruction",
        "_list",
        "'''\n",
        "   ",
        " p",
        "[",
        "0",
        "]",
        " =",
        " p",
        "[",
        "1",
        "]\n\n",
        "def",
        " p",
        "_instruction",
        "_list",
        "(p",
        "):\n",
        "   ",
        " '''",
        "instruction",
        "_list",
        " :",
        " instruction",
        " instruction",
        "_list",
        "\n",
        "                      ",
        " |",
        " instruction",
        "'''\n",
        "   ",
        " if",
        " len",
        "(p",
        ")",
        " ==",
        " ",
        "3",
        ":\n",
        "       ",
        " p",
        "[",
        "0",
        "]",
        " =",
        " [",
        "p",
        "[",
        "1",
        "]]",
        " +",
        " p",
        "[",
        "2",
        "]\n",
        "   ",
        " else",
        ":\n",
        "       ",
        " p",
        "[",
        "0",
        "]",
        " =",
        " [",
        "p",
        "[",
        "1",
        "]]\n\n",
        "def",
        " p",
        "_instruction",
        "(p",
        "):\n",
        "   ",
        " '''",
        "instruction",
        " :",
        " PUSH",
        " NUMBER",
        "\n",
        "                  ",
        " |",
        " POP",
        "\n",
        "                  ",
        " |",
        " ADD",
        "\n",
        "                  ",
        " |",
        " SUB",
        "\n",
        "                  ",
        " |",
        " M",
        "UL",
        "\n",
        "                  ",
        " |",
        " DIV",
        "\n",
        "                  ",
        " |",
        " J",
        "UMP",
        " NUMBER",
        "\n",
        "                  ",
        " |",
        " HAL",
        "T",
        "'''\n",
        "   ",
        " if",
        " len",
        "(p",
        ")",
        " ==",
        " ",
        "3",
        ":\n",
        "       ",
        " p",
        "[",
        "0",
        "]",
        " =",
        " (",
        "p",
        "[",
        "1",
        "],",
        " int",
        "(p",
        "[",
        "2",
        "]))\n",
        "   ",
        " else",
        ":\n",
        "       ",
        " p",
        "[",
        "0",
        "]",
        " =",
        " (",
        "p",
        "[",
        "1",
        "],",
        ")\n\n",
        "#",
        " Build",
        " parser",
        "\n",
        "lexer",
        " =",
        " lex",
        ".lex",
        "()\n",
        "parser",
        " =",
        " yacc",
        ".y",
        "acc",
        "()\n",
        "``",
        "`\n\n",
        "**",
        "Language",
        " Runtime",
        "**\n\n",
        "```",
        "python",
        "\n",
        "class",
        " ST",
        "K",
        "Runtime",
        ":\n",
        "   ",
        " def",
        " __",
        "init",
        "__(",
        "self",
        ",",
        " program",
        "):\n",
        "       ",
        " self",
        ".program",
        " =",
        " program",
        "\n",
        "       ",
        " self",
        ".pc",
        " =",
        " ",
        "0",
        "\n",
        "       ",
        " self",
        ".stack",
        " =",
        " []\n\n",
        "   ",
        " def",
        " run",
        "(self",
        "):\n",
        "       ",
        " while",
        " self",
        ".pc",
        " <",
        " len",
        "(self",
        ".program",
        "):\n",
        "           ",
        " op",
        ",",
        " *",
        "args",
        " =",
        " self",
        ".program",
        "[self",
        ".pc",
        "]\n",
        "           ",
        " self",
        ".pc",
        " +=",
        " ",
        "1",
        "\n",
        "           ",
        " if",
        " op",
        " ==",
        " '",
        "p",
        "':\n",
        "               ",
        " self",
        ".stack",
        ".append",
        "(args",
        "[",
        "0",
        "])\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '",
        "o",
        "':\n",
        "               ",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '+",
        "':\n",
        "               ",
        " b",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " a",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " self",
        ".stack",
        ".append",
        "(a",
        " +",
        " b",
        ")\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '-",
        "':\n",
        "               ",
        " b",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " a",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " self",
        ".stack",
        ".append",
        "(a",
        " -",
        " b",
        ")\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '*",
        "':\n",
        "               ",
        " b",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " a",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " self",
        ".stack",
        ".append",
        "(a",
        " *",
        " b",
        ")\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '/",
        "':\n",
        "               ",
        " b",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " a",
        " =",
        " self",
        ".stack",
        ".pop",
        "()\n",
        "               ",
        " self",
        ".stack",
        ".append",
        "(a",
        " //",
        " b",
        ")\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '",
        "j",
        "':\n",
        "               ",
        " self",
        ".pc",
        " =",
        " args",
        "[",
        "0",
        "]\n",
        "           ",
        " elif",
        " op",
        " ==",
        " '",
        "h",
        "':\n",
        "               ",
        " break",
        "\n\n",
        "   ",
        " def",
        " get",
        "_result",
        "(self",
        "):\n",
        "       ",
        " return",
        " self",
        ".stack",
        "[-",
        "1",
        "]",
        " if",
        " self",
        ".stack",
        " else",
        " None",
        "\n\n",
        "#",
        " Example",
        " usage",
        "\n",
        "program",
        " =",
        " parser",
        ".parse",
        "('",
        "p",
        " ",
        "2",
        " p",
        " ",
        "3",
        " +",
        "')\n",
        "runtime",
        " =",
        " ST",
        "K",
        "Runtime",
        "(program",
        ")\n",
        "runtime",
        ".run",
        "()\n",
        "print",
        "(runtime",
        ".get",
        "_result",
        "())",
        " ",
        " #",
        " Output",
        ":",
        " ",
        "5",
        "\n",
        "``",
        "`\n\n",
        "**",
        "More",
        " Examples",
        "**\n\n",
        "*",
        " Fibonacci",
        " sequence",
        ":",
        " `",
        "p",
        " ",
        "0",
        " p",
        " ",
        "1",
        " p",
        " ",
        "10",
        " j",
        " ",
        "5",
        " h",
        " ",
        "5",
        " o",
        " o",
        " +",
        " p",
        " ",
        "1",
        " j",
        " ",
        "3",
        " h",
        " ",
        "3",
        " o",
        " o",
        " p",
        " ",
        "1",
        " +",
        " j",
        " ",
        "5",
        "`\n",
        "*",
        " Factor",
        "ial",
        ":",
        " `",
        "p",
        " ",
        "5",
        " p",
        " ",
        "1",
        " j",
        " ",
        "3",
        " h",
        " ",
        "3",
        " o",
        " o",
        " *",
        " p",
        " ",
        "1",
        " +",
        " j",
        " ",
        "3",
        "`",
        ""
      ],
      "started_at": "2024-07-23T15:19:47.582483Z",
      "status": "succeeded",
      "urls": {
        "stream": "https://streaming-api.svc.sea.v.replicate.net/v1/streams/oo3jqmezolvmqhrq4nxquskhrt5ob6hugjjheyrhbuldkejk3rpq",
        "get": "https://api.replicate.com/v1/predictions/jzxx22twb9rm20cgw0przfgqzm",
        "cancel": "https://api.replicate.com/v1/predictions/jzxx22twb9rm20cgw0przfgqzm/cancel"
      },
      "version": "hidden"
    }
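The `logs` field of each prediction includes a "Formatted prompt" line showing how the system prompt and user prompt are wrapped in special tokens before generation. A minimal sketch that reproduces that string as it appears in the logs; `format_prompt` is a hypothetical helper name, and the layout is inferred only from the logged text (the log may have trimmed trailing whitespace):

```python
def format_prompt(system_prompt: str, prompt: str) -> str:
    """Rebuild the formatted prompt as it appears in the prediction logs."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{prompt}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>"
    )


formatted = format_prompt(
    "You are a helpful assistant.",
    "Tina has one brother and one sister. "
    "How many sisters do Tina's siblings have?",
)
print(formatted)
```

The hosted model applies this formatting itself, so requests only need to supply `prompt` and `system_prompt`.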
    Input tokens
    45
    Output tokens
    751
    Tokens per second
    30.06 tokens / second
    Time to first token
    0.112 seconds
