technillogue/mistral-instruct-webrtc-triton

Public
13 runs

Input

Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run technillogue/mistral-instruct-webrtc-triton using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "technillogue/mistral-instruct-webrtc-triton:69d3e23a7548431723ac296c3a1d18ae8d1d7406322d2232e5e5e9c91f6a5cc6",
    "input": {
      "top_k": 0,
      "top_p": 0,
      "temperature": 1,
      "system_prompt": "You are a very helpful, respectful and honest assistant.",
      "length_penalty": 1,
      "max_new_tokens": 250,
      "prompt_template": "<s>[INST] {system_prompt} {prompt} [/INST]",
      "presence_penalty": 0,
      "frequency_penalty": 0
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.
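The request above sets a prompt_template containing {system_prompt} and {prompt} placeholders but does not send a prompt value. The sketch below shows how such a template would presumably expand, assuming plain string substitution. The fill helper, the sample question, and the existence of a "prompt" input field are assumptions for illustration; none of them is confirmed by this page, so check the model's schema before relying on them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Replicate's tooling): replace the first
# occurrence of placeholder $2 in string $1 with value $3.
fill() {
  case $1 in
    *"$2"*) printf '%s%s%s' "${1%%"$2"*}" "$3" "${1#*"$2"}" ;;
    *)      printf '%s' "$1" ;;   # placeholder absent: return input unchanged
  esac
}

# Values copied from the request above.
template='<s>[INST] {system_prompt} {prompt} [/INST]'
system_prompt='You are a very helpful, respectful and honest assistant.'

# Assumed user input; the page does not show a "prompt" field.
prompt='What is WebRTC?'

# Expand the two placeholders one after the other.
step=$(fill "$template" '{system_prompt}' "$system_prompt")
rendered=$(fill "$step" '{prompt}' "$prompt")

printf '%s\n' "$rendered"
# prints: <s>[INST] You are a very helpful, respectful and honest assistant. What is WebRTC? [/INST]
```

This only illustrates the shape of the final text the model would see under that assumption; the actual substitution happens server-side and may differ.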


Run time and cost

This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

This model doesn't have a readme.