Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
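If exporting a shell variable is inconvenient (for example, in a notebook), you can set the token from Python before calling the client instead. This is a minimal sketch; the placeholder value is yours to fill in:
import os

# Set the token for this process only; the replicate client reads it from the environment.
os.environ["REPLICATE_API_TOKEN"] = "<paste-your-token-here>"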
Import the client:
import replicate
Run yorickvp/cog-vllm-build using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "yorickvp/cog-vllm-build:2110a4e63c0c8d06088696dbda48813635204719185fac492df1e8717d5e1e07",
    input={
        "top_k": 50,
        "top_p": 0.9,
        "prompt": "",
        "max_tokens": 512,
        "min_tokens": 0,
        "temperature": 0.6,
        "system_prompt": "You are a helpful assistant.",
        "presence_penalty": 0,
        "frequency_penalty": 0
    }
)
# The yorickvp/cog-vllm-build model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/yorickvp/cog-vllm-build/api#output-schema
    print(item, end="")
To learn more, take a look at the guide on getting started with Python.
This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.