deniyes/dolly-v2-12b-demo – Run with an API on Replicate

deniyes / dolly-v2-12b-demo

dolly-v2-12b, just for testing

Cold

Public
17 runs
A100 (80GB)


Input

prompt

*string

Input Prompt.

max_length

integer

(minimum: 1)

Maximum number of tokens to generate. A word is generally 2-3 tokens.

Default: 500

decoding

string

Choose a decoding method

Default: "top_p"

top_k

integer

Valid if you choose top_k decoding. The number of highest-probability vocabulary tokens to keep for top-k filtering.

Default: 50
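
To make the top_k description concrete, here is a minimal sketch of top-k filtering over a toy probability list. The function name `top_k_filter` and the example probabilities are illustrative assumptions; the model's actual decoding code is not shown on this page.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, and renormalise."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Indices of the k largest probabilities.
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    kept_mass = sum(probs[i] for i in keep)
    return [p / kept_mass if i in keep else 0.0 for i, p in enumerate(probs)]


# Toy distribution over a 4-token vocabulary (hypothetical values).
filtered = top_k_filter([0.5, 0.3, 0.15, 0.05], k=2)
print(filtered)  # only the two most likely tokens keep non-zero probability
```

With the default of 50, only the 50 most likely tokens would remain candidates at each step under this scheme.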

top_p

number

(minimum: 0.01, maximum: 1)

Valid if you choose top_p decoding. When decoding text, samples from the smallest set of most likely tokens whose cumulative probability reaches p; lower the value to ignore less likely tokens.

Default: 1
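
As a rough illustration of top_p (nucleus) filtering, the sketch below keeps the most likely tokens until their cumulative probability reaches p. It assumes the common nucleus-sampling formulation; `top_p_filter` and the sample probabilities are hypothetical and not taken from this model's source.

```python
def top_p_filter(probs, p):
    """Keep the most likely tokens until their cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    # Renormalise the surviving tokens so they sum to 1.
    return [q / cumulative if i in keep else 0.0 for i, q in enumerate(probs)]


probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution (hypothetical values)
narrow = top_p_filter(probs, p=0.7)   # keeps the first two tokens
full = top_p_filter(probs, p=1.0)     # keeps every token
```

Under this formulation the default of 1 applies no filtering, since every token is needed to reach the full probability mass.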

temperature

number

(minimum: 0.01, maximum: 5)

Adjusts the randomness of outputs: values greater than 1 are more random, and values close to 0 are nearly deterministic (the minimum allowed is 0.01). 0.75 is a good starting value.

Default: 0.75
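
The effect of temperature can be sketched as a softmax over logits divided by the temperature. This is the standard formulation and is assumed here rather than confirmed for this model; the function name and logits are illustrative.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy logits (hypothetical values)
sharp = softmax_with_temperature(logits, 0.01)    # almost all mass on the top token
default = softmax_with_temperature(logits, 0.75)  # the page's default value
flat = softmax_with_temperature(logits, 5.0)      # much closer to uniform
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it across more tokens.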

repetition_penalty

number

(minimum: 0.01, maximum: 5)

Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, less than 1 encourage it.

Default: 1.2
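
One common way to implement a repetition penalty is to rescale the logits of tokens that have already been generated: positive logits are divided by the penalty and negative logits are multiplied by it. The sketch below assumes that scheme; whether this model applies exactly this rule is not stated on the page, and `apply_repetition_penalty` is a hypothetical helper.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Rescale the logits of already-generated tokens by the penalty."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Dividing a positive score or multiplying a negative one both lower it
        # when penalty > 1, and raise it when penalty < 1.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [2.0, -1.0, 0.5]  # toy logits for a 3-token vocabulary
penalised = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=1.2)
```

With a penalty of 1 the logits are unchanged, which matches the "1 is no penalty" description above.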

Run this model in Node.js with one line of code:

npx create-replicate --model=deniyes/dolly-v2-12b-demo

Or set up a project from scratch:

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run deniyes/dolly-v2-12b-demo using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "deniyes/dolly-v2-12b-demo:ef548bcbf14a2dc42292c647523630085bdb7e4a65a8e405237fccdc03e4cbda",
  {
    input: {
      top_k: 50,
      top_p: 1,
      prompt: "please compare the Cog and Blentoml",
      decoding: "top_p",
      max_length: 500,
      temperature: 0.75,
      repetition_penalty: 1.2
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run deniyes/dolly-v2-12b-demo using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "deniyes/dolly-v2-12b-demo:ef548bcbf14a2dc42292c647523630085bdb7e4a65a8e405237fccdc03e4cbda",
    input={
        "top_k": 50,
        "top_p": 1,
        "prompt": "please compare the Cog and Blentoml",
        "decoding": "top_p",
        "max_length": 500,
        "temperature": 0.75,
        "repetition_penalty": 1.2
    }
)

# The deniyes/dolly-v2-12b-demo model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/deniyes/dolly-v2-12b-demo/api#output-schema
    print(item, end="")

To learn more, take a look at the guide on getting started with Python.
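
Before calling `replicate.run`, it can help to check inputs against the ranges listed in the Input section. The sketch below is a hypothetical helper (`build_input` is not part of the Replicate client) that merges overrides into the documented defaults and rejects out-of-range values; it makes no network calls.

```python
# Defaults and ranges are copied from the Input section of this page.
DEFAULTS = {
    "max_length": 500,
    "decoding": "top_p",
    "top_k": 50,
    "top_p": 1,
    "temperature": 0.75,
    "repetition_penalty": 1.2,
}
RANGES = {
    "max_length": (1, None),  # minimum only
    "top_p": (0.01, 1),
    "temperature": (0.01, 5),
    "repetition_penalty": (0.01, 5),
}


def build_input(prompt, **overrides):
    """Merge overrides into the documented defaults and check documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input(s): {sorted(unknown)}")
    params = {**DEFAULTS, **overrides, "prompt": prompt}
    for name, (low, high) in RANGES.items():
        value = params[name]
        if value < low or (high is not None and value > high):
            raise ValueError(f"{name}={value} is outside the documented range")
    return params


payload = build_input("please compare the Cog and Blentoml", temperature=0.5)
```

The resulting dictionary can be passed as the `input` argument shown in the example above.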

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run deniyes/dolly-v2-12b-demo using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "deniyes/dolly-v2-12b-demo:ef548bcbf14a2dc42292c647523630085bdb7e4a65a8e405237fccdc03e4cbda",
    "input": {
      "top_k": 50,
      "top_p": 1,
      "prompt": "please compare the Cog and Blentoml",
      "decoding": "top_p",
      "max_length": 500,
      "temperature": 0.75,
      "repetition_penalty": 1.2
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
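
For readers who prefer Python's standard library to cURL, the same request can be assembled as sketched below. The URL, headers, and body mirror the cURL command above; the fallback token string is a placeholder, and the actual send is left commented out so the sketch has no side effects.

```python
import json
import os
import urllib.request

# Falls back to a placeholder when the environment variable is not set.
token = os.environ.get("REPLICATE_API_TOKEN", "<paste-your-token-here>")

body = {
    "version": "deniyes/dolly-v2-12b-demo:ef548bcbf14a2dc42292c647523630085bdb7e4a65a8e405237fccdc03e4cbda",
    "input": {
        "top_k": 50,
        "top_p": 1,
        "prompt": "please compare the Cog and Blentoml",
        "decoding": "top_p",
        "max_length": 500,
        "temperature": 0.75,
        "repetition_penalty": 1.2,
    },
}

request = urllib.request.Request(
    "https://api.replicate.com/v1/predictions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Prefer": "wait",
    },
    method="POST",
)

# Uncomment to send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```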

Output

Cog - A computer program that allows you to create a virtual assistant. Blentoml - An open-source language model platform for building conversational AI applications.

Generated in 1.8 seconds

Run time and cost

This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.