joehoover / cog-llongma-2-13b-16k

(Updated 1 year, 10 months ago)

  • Public
  • 18 runs

Input

Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.
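As a minimal sketch of how a script might pick up that variable, the helper below reads `REPLICATE_API_TOKEN` and fails early when it is missing. The function name `get_replicate_token` and its `env` parameter are hypothetical, introduced only for illustration.

```python
import os


def get_replicate_token(env=None):
    """Return the token stored in REPLICATE_API_TOKEN, or raise if it is unset.

    `env` is a hypothetical hook for passing a dict instead of os.environ,
    which keeps the helper easy to exercise without touching the real
    environment.
    """
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before calling the API."
        )
    return token
```

Failing before any request is built avoids sending an empty `Authorization` header.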

Run joehoover/cog-llongma-2-13b-16k using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "joehoover/cog-llongma-2-13b-16k:ba67b9d08d1cf840584a0339091797429c7b4520ef43f2e07f974d7f60413da4",
    "input": {
      "debug": false,
      "top_k": 250,
      "top_p": 0.95,
      "temperature": 0.95,
      "system_prompt": "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\\n\\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don\'t know the answer to a question, please don\'t share false information.",
      "max_new_tokens": 500,
      "min_new_tokens": -1,
      "repetition_penalty": 1.15,
      "repetition_penalty_sustain": 256,
      "token_repetition_penalty_decay": 128
    }
  }' \
  https://api.replicate.com/v1/predictions
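
The same request can be sketched in Python with only the standard library. The endpoint, headers, version string, and sampling parameters are copied from the curl command above. The helper names are hypothetical, and the `prompt` input is an assumption: the example request does not list one, so check the model's schema for the actual field name. `system_prompt` is left out for brevity and can be passed as a keyword override.

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
MODEL_VERSION = (
    "joehoover/cog-llongma-2-13b-16k:"
    "ba67b9d08d1cf840584a0339091797429c7b4520ef43f2e07f974d7f60413da4"
)


def build_prediction_request(token, prompt, **overrides):
    """Build (but do not send) the POST request shown in the curl example."""
    inputs = {
        "debug": False,
        "top_k": 250,
        "top_p": 0.95,
        "temperature": 0.95,
        "max_new_tokens": 500,
        "min_new_tokens": -1,
        "repetition_penalty": 1.15,
        "repetition_penalty_sustain": 256,
        "token_repetition_penalty_decay": 128,
    }
    inputs.update(overrides)
    # ASSUMPTION: the model accepts a "prompt" input; it is not in the curl example.
    inputs["prompt"] = prompt
    body = json.dumps({"version": MODEL_VERSION, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": "wait",
        },
    )


def create_prediction(token, prompt, **overrides):
    """Send the request and return the decoded JSON response (needs network access)."""
    request = build_prediction_request(token, prompt, **overrides)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the payload without spending a prediction.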

To learn more, take a look at Replicate’s HTTP API reference docs.
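
Because the request sends `Prefer: wait`, the response may already describe a finished prediction. The sketch below shows one way to inspect such a response once it is decoded into a dict. It assumes the `status` and `output` fields and the status values `succeeded`, `failed`, and `canceled` described in Replicate's HTTP API reference. That this model returns its text as a list of string chunks is also an assumption. The helper names are hypothetical.

```python
# ASSUMPTION: these are the statuses after which a prediction no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def is_finished(prediction):
    """Return True when the prediction dict reports a terminal status."""
    return prediction.get("status") in TERMINAL_STATUSES


def output_text(prediction):
    """Return the prediction output as one string.

    Handles a missing output, a plain string, or a list of chunks
    (the list shape is an assumption about this model).
    """
    output = prediction.get("output")
    if output is None:
        return ""
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)
```

If `is_finished` returns False, the prediction is still running and its `output` may be empty or partial.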

Run time and cost

This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.