How to run Yi chat models with an API
Posted by @nateraw
The Yi series models are large language models trained from scratch by developers at 01.AI. Today, they’ve released two new models: Yi-6B-Chat and Yi-34B-Chat. These models extend the base models, Yi-6B and Yi-34B, and are fine-tuned for chat completion.
Yi-34B currently holds state-of-the-art results on most benchmarks, outperforming larger models like Llama-70B.
How to run Yi-34B-Chat with an API
Yi-34B-Chat is on Replicate and you can run it in the cloud with a few lines of code.
You can run it with our JavaScript client:
```javascript
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

const output = await replicate.run(
  "01-ai/yi-34b-chat:914692bbe8a8e2b91a4e44203e70d170c9c5ccc1359b283c84b0ec8d47819a46",
  {
    input: {
      prompt: "Write a poem about Parmigiano Reggiano.",
    },
  }
);
```
Or, our Python client:
```python
import replicate

output = replicate.run(
    "01-ai/yi-34b-chat:914692bbe8a8e2b91a4e44203e70d170c9c5ccc1359b283c84b0ec8d47819a46",
    input={"prompt": "Write a poem about Parmigiano Reggiano."}
)

# The 01-ai/yi-34b-chat model streams output as it runs.
# replicate.run returns an iterator, so you can print each chunk as it arrives.
for item in output:
    print(item, end="")
```
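Under the hood, the Yi chat models were fine-tuned with the ChatML conversation format. If you want to pass a multi-turn conversation as a single prompt string, you can render it yourself. A minimal sketch, assuming the standard ChatML delimiters `<|im_start|>` and `<|im_end|>` (the `to_chatml` helper is my own, not part of the Replicate client):

```python
def to_chatml(messages):
    """Render a list of {"role": ..., "content": ...} messages as a ChatML
    prompt, ending with an open assistant turn for the model to complete."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "user", "content": "Write a poem about Parmigiano Reggiano."},
])
print(prompt)
```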
Or, you can call the HTTP API directly with tools like cURL:
```shell
curl -s -X POST \
  -d '{"version": "914692bbe8a8e2b91a4e44203e70d170c9c5ccc1359b283c84b0ec8d47819a46", "input": {"prompt": "Write a poem..."}}' \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  "https://api.replicate.com/v1/predictions"
```
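The cURL call above creates a prediction and returns immediately; the response includes an `id` and `status` you can poll via `GET /v1/predictions/{id}` until the prediction completes. Here's a minimal sketch of building the same POST request in Python using only the standard library (the `build_request` helper name is my own):

```python
import json
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
VERSION = "914692bbe8a8e2b91a4e44203e70d170c9c5ccc1359b283c84b0ec8d47819a46"

def build_request(prompt, token):
    """Build the same POST request as the cURL example above."""
    body = json.dumps({"version": VERSION, "input": {"prompt": prompt}}).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Sending it requires a valid token and network access, e.g.:
# with urllib.request.urlopen(build_request("Write a poem...", token)) as resp:
#     prediction = json.load(resp)  # contains "id" and "status" to poll
```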
You can also run Yi chat models using other client libraries for Go, Swift, Elixir, and others.
Next steps
- Take a look at 01-ai/yi-6b-chat and 01-ai/yi-34b-chat on Replicate.
- Read the model cards for Yi-6B-Chat and Yi-34B-Chat.
- Check out the Yi GitHub repo.