Readme
This model doesn't have a readme.
Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
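Because the client reads the token from the environment, a missing variable usually surfaces only when the first request fails. The sketch below checks for it up front. The helper name require_token is hypothetical and is not part of the replicate package.

```python
import os


def require_token(env=None) -> str:
    """Return the Replicate API token, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before calling replicate.run()"
        )
    return token
```

Calling require_token() once at startup, before the first replicate.run() call, turns a confusing authentication error into an immediate, readable one.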
Import the client:
import replicate
Run technillogue/mistral-instruct-webrtc-triton using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "technillogue/mistral-instruct-webrtc-triton:69d3e23a7548431723ac296c3a1d18ae8d1d7406322d2232e5e5e9c91f6a5cc6",
    input={
        "top_k": 0,
        "top_p": 0,
        "temperature": 1,
        "system_prompt": "You are a very helpful, respectful and honest assistant.",
        "length_penalty": 1,
        "max_new_tokens": 250,
        "prompt_template": "<s>[INST] {system_prompt} {prompt} [/INST]",
        "presence_penalty": 0,
        "frequency_penalty": 0
    }
)
# The technillogue/mistral-instruct-webrtc-triton model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/technillogue/mistral-instruct-webrtc-triton/api#output-schema
    print(item, end="")
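The prompt_template above contains a {prompt} placeholder, although the sample input sets no value for it. Presumably a prompt input supplies it, so check the model's schema to confirm. The sketch below shows what the template would expand to and how the streamed chunks could be gathered into one string instead of printed. It assumes the placeholders are filled by name, as Python's str.format does. Both helper names are hypothetical, and the model performs the actual substitution on its side.

```python
def render_prompt(template: str, system_prompt: str, prompt: str) -> str:
    """Preview the text the model would see, assuming by-name substitution."""
    return template.format(system_prompt=system_prompt, prompt=prompt)


def collect_stream(chunks) -> str:
    """Join streamed output chunks into a single string."""
    return "".join(str(chunk) for chunk in chunks)


if __name__ == "__main__":
    template = "<s>[INST] {system_prompt} {prompt} [/INST]"
    # Illustrative values only; they are not taken from the model page.
    print(render_prompt(template, "Be brief.", "What is WebRTC?"))
    # Stand-in for the iterator returned by replicate.run(...).
    print(collect_stream(iter(["Web", "RTC ", "is ..."])))
```

With the real client, collect_stream(output) can replace the for loop when the full response is needed as a single value.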
To learn more, take a look at the guide on getting started with Python.
This model runs on Nvidia L40S GPU hardware. We don't yet have enough runs of this model to provide performance information.