Readme
This model doesn't have a readme.
Install Replicate's Python client library:
pip install replicate
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
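A missing token usually only shows up as an authentication error once a request is made. The sketch below checks the environment variable up front instead. The helper name `require_token` is hypothetical and is not part of the replicate package; it only reads the REPLICATE_API_TOKEN variable described above.

```python
import os


def require_token(env=None):
    """Return the Replicate API token, or raise a clear error if it is unset.

    `require_token` is an illustrative helper, not part of the replicate client.
    `env` defaults to os.environ; pass a dict to check other mappings.
    """
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before calling replicate.run()."
        )
    return token
```

Calling `require_token()` once at the start of a script makes a missing export fail early, with a message that names the variable.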
Import the client:
import replicate
Run technillogue/llama-2-7b-mlc using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
    "technillogue/llama-2-7b-mlc:16bc60b8feaa78f73c5c82d3286fe61f7270b0d7305954bc7bb7fe28a811f9cd",
    input={
        "ice_servers": "[{\"urls\":\"stun:stun.l.google.com:19302\"}]"
    }
)

# The technillogue/llama-2-7b-mlc model can stream output as it's running.
# The predict method returns an iterator, and you can iterate over that output.
for item in output:
    # https://replicate.com/technillogue/llama-2-7b-mlc/api#output-schema
    print(item)
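If the complete result is needed as one value rather than printed piece by piece, the streamed items can be joined. This is a minimal sketch that assumes each item converts sensibly to text with `str()`; check the output schema linked above before relying on that. `collect_stream` and `run_and_collect` are hypothetical helper names, and only `replicate.run` comes from the client library.

```python
def collect_stream(chunks):
    """Join streamed chunks into one string, converting each with str()."""
    return "".join(str(chunk) for chunk in chunks)


def run_and_collect(model_ref, model_input):
    """Run a model and return its streamed output as a single string.

    Illustrative wrapper: assumes the model yields text-like chunks.
    """
    # Imported here so collect_stream works without the client installed.
    import replicate

    return collect_stream(replicate.run(model_ref, input=model_input))
```

For example, `collect_stream(output)` could replace the `for` loop above when the output does not need to be shown incrementally.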
To learn more, take a look at the guide on getting started with Python.
This model runs on Nvidia A100 (80GB) GPU hardware. We don't yet have enough runs of this model to provide performance information.