nomagick/chatglm2-6b-int4:ea3715b3 – Run with an API on Replicate

Version

You're looking at a specific version of this model: nomagick/chatglm2-6b-int4:ea3715b3.

Input

prompt

string

[Round 1]

问：请使用英文重复这段话："为了使模型生成最优输出，当使用 ChatGLM2-6B 时需要使用特定的输入格式，请按照示例格式组织输入。"

答：

Prompt for completion

Default: "[Round 1]\n\n问：请使用英文重复这段话：\"为了使模型生成最优输出，当使用 ChatGLM2-6B 时需要使用特定的输入格式，请按照示例格式组织输入。\"\n\n答："
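The default prompt uses a round-based layout: a `[Round N]` header, a question line starting with `问：` ("Question:"), and an answer line starting with `答：` ("Answer:") that is left empty for the model to complete. The following is a minimal sketch of a helper that assembles a prompt in this layout. The function name `build_prompt` is hypothetical, and the multi-round layout for earlier turns is an assumption extrapolated from the single-round default, not something this page documents.

```python
def build_prompt(query, history=None):
    """Assemble a ChatGLM2-style prompt string.

    `history` is an optional list of (question, answer) pairs from earlier
    rounds. The multi-round layout is an assumption extrapolated from the
    single-round default prompt shown on this page.
    """
    history = history or []
    parts = []
    # Earlier rounds carry both the question and the answer already given.
    for i, (old_query, old_answer) in enumerate(history, start=1):
        parts.append(f"[Round {i}]\n\n问：{old_query}\n\n答：{old_answer}\n\n")
    # The final round ends with an empty answer marker for the model to fill.
    parts.append(f"[Round {len(history) + 1}]\n\n问：{query}\n\n答：")
    return "".join(parts)
```

With no history, `build_prompt(query)` reproduces the shape of the default prompt, so its result can be passed as the `prompt` input.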

max_tokens

integer

(minimum: 1, maximum: 32768)

Maximum number of new tokens to generate

Default: 2048

temperature

number

(minimum: 0, maximum: 5)

Sampling temperature

Default: 0.75

top_p

number

(minimum: 0, maximum: 1)

Top-p (nucleus sampling) threshold

Default: 0.8
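Since each numeric input has a documented range and default, it can be convenient to check values locally before sending a request. The sketch below copies the ranges and defaults from the schema above. The names `SCHEMA` and `validate_input` are hypothetical, and this is a client-side convenience, not part of Replicate's client library.

```python
# Ranges and defaults copied from the input schema listed above.
SCHEMA = {
    "max_tokens": {"type": int, "min": 1, "max": 32768, "default": 2048},
    "temperature": {"type": (int, float), "min": 0, "max": 5, "default": 0.75},
    "top_p": {"type": (int, float), "min": 0, "max": 1, "default": 0.8},
}


def validate_input(**overrides):
    """Return sampling parameters with defaults filled in.

    Raises TypeError for unknown names or wrong types, and ValueError for
    values outside the documented range.
    """
    params = {}
    for name, spec in SCHEMA.items():
        value = overrides.pop(name, spec["default"])
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, spec["type"]):
            raise TypeError(f"{name} has the wrong type: {value!r}")
        if not spec["min"] <= value <= spec["max"]:
            raise ValueError(
                f"{name}={value!r} is outside [{spec['min']}, {spec['max']}]"
            )
        params[name] = value
    if overrides:
        raise TypeError(f"unknown parameter(s): {sorted(overrides)}")
    return params
```

The returned dictionary can be merged with a `prompt` entry to form the `input` payload.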

Run this model in Node.js with one line of code:

npx create-replicate --model=nomagick/chatglm2-6b-int4

or set up a project from scratch

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run nomagick/chatglm2-6b-int4 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "nomagick/chatglm2-6b-int4:ea3715b3c4561f1e7c2a7db5873cf9831a7a6c56a6910f7276d17e56b08ef4a9",
  {
    input: {
      top_p: 0.8,
      prompt: "[Round 1]\n\n问：请使用英文重复这段话：\"为了使模型生成最优输出，当使用 ChatGLM2-6B 时需要使用特定的输入格式，请按照示例格式组织输入。\"\n\n答：",
      max_tokens: 2048,
      temperature: 0.75
    }
  }
);
console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run nomagick/chatglm2-6b-int4 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "nomagick/chatglm2-6b-int4:ea3715b3c4561f1e7c2a7db5873cf9831a7a6c56a6910f7276d17e56b08ef4a9",
    input={
        "top_p": 0.8,
        "prompt": "[Round 1]\n\n问：请使用英文重复这段话：\"为了使模型生成最优输出，当使用 ChatGLM2-6B 时需要使用特定的输入格式，请按照示例格式组织输入。\"\n\n答：",
        "max_tokens": 2048,
        "temperature": 0.75
    }
)

# The nomagick/chatglm2-6b-int4 model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/nomagick/chatglm2-6b-int4/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.
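Because the output arrives as a sequence of text chunks, it is often useful to keep the full text while still printing chunks as they arrive. A minimal sketch follows, assuming each chunk is a string (or converts cleanly with `str`); the helper name `collect_output` and its `on_chunk` callback are hypothetical.

```python
def collect_output(chunks, on_chunk=None):
    """Consume an iterable of text chunks and return the joined text.

    `chunks` can be any iterable of strings, such as the iterator returned
    by `replicate.run` for this model. If `on_chunk` is given, it is called
    with each chunk as it arrives (for example, to print incrementally).
    """
    pieces = []
    for chunk in chunks:
        text = str(chunk)
        if on_chunk is not None:
            on_chunk(text)
        pieces.append(text)
    return "".join(pieces)
```

For example, `collect_output(output, on_chunk=lambda t: print(t, end=""))` would print the stream as it is generated and return the complete response.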

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run nomagick/chatglm2-6b-int4 using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "ea3715b3c4561f1e7c2a7db5873cf9831a7a6c56a6910f7276d17e56b08ef4a9",
    "input": {
      "top_p": 0.8,
      "prompt": "[Round 1]\\n\\n问：请使用英文重复这段话：\\"为了使模型生成最优输出，当使用 ChatGLM2-6B 时需要使用特定的输入格式，请按照示例格式组织输入。\\"\\n\\n答：",
      "max_tokens": 2048,
      "temperature": 0.75
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
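The same HTTP request can be built without any client library. The sketch below mirrors the curl command above (same URL, headers, version, and input fields) using only the Python standard library. The function names `build_prediction_request` and `send_request` are hypothetical, and the shape of the JSON response is not described on this page, so `send_request` simply returns the decoded body.

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = "ea3715b3c4561f1e7c2a7db5873cf9831a7a6c56a6910f7276d17e56b08ef4a9"


def build_prediction_request(token, prompt, max_tokens=2048,
                             temperature=0.75, top_p=0.8):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "version": VERSION,
        "input": {
            "top_p": top_p,
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Same header as the curl example: wait for the prediction.
            "Prefer": "wait",
        },
        method="POST",
    )


def send_request(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A typical call would read the token from the `REPLICATE_API_TOKEN` environment variable via `os.environ` and pass it as `token`.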

Output

To achieve the best output from the model, when using ChatGLM2-6B, specific input formatting is required. Please organize the input according to the example format.

Generated in 6.1 seconds
