Changelog
-
API for creating models
November 6, 2023
Replicate’s API now has an endpoint for creating models.
You can use it to automate the creation of models, including fine-tunes of SDXL and Llama 2.
cURL usage
Here’s an example that uses cURL to create a model with a given owner, name, visibility, and hardware:
curl -s -X POST -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -d '{"owner": "my-username", "name": "my-new-model", "visibility": "public", "hardware": "gpu-a40-large"}' \
  https://api.replicate.com/v1/models
The response is a JSON object of the created model:
{
  "url": "https://replicate.com/my-username/my-new-model",
  "owner": "my-username",
  "name": "my-new-model",
  "description": null,
  "visibility": "public",
  "github_url": null,
  "paper_url": null,
  "license_url": null,
  "run_count": 0,
  "cover_image_url": null,
  "default_example": null,
  "latest_version": null
}
To see all the hardware available for your model to run, consult our endpoint for listing hardware.
curl -s -H "Authorization: Token $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/hardware
[
  { "name": "CPU", "sku": "cpu" },
  { "name": "Nvidia T4 GPU", "sku": "gpu-t4" },
  { "name": "Nvidia A40 GPU", "sku": "gpu-a40-small" },
  { "name": "Nvidia A40 (Large) GPU", "sku": "gpu-a40-large" }
]
To compare the price and specifications of these hardware types, check out the pricing page.
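If you're scripting model creation in Python, you can resolve a hardware SKU from the listing response above before calling the create endpoint. A minimal sketch, assuming the response shape shown above (the sku_for helper is my own, not part of any client):

```python
# Sample payload mirroring the /v1/hardware response shown above.
hardware_response = [
    {"name": "CPU", "sku": "cpu"},
    {"name": "Nvidia T4 GPU", "sku": "gpu-t4"},
    {"name": "Nvidia A40 GPU", "sku": "gpu-a40-small"},
    {"name": "Nvidia A40 (Large) GPU", "sku": "gpu-a40-large"},
]

def sku_for(name: str) -> str:
    """Look up the SKU to pass as the "hardware" field when creating a model."""
    for entry in hardware_response:
        if entry["name"] == name:
            return entry["sku"]
    raise KeyError(name)

print(sku_for("Nvidia A40 (Large) GPU"))  # gpu-a40-large
```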
JavaScript usage
We’ve added this new operation to the Replicate JavaScript client:
npm install replicate@latest
Then:
import Replicate from "replicate";

const replicate = new Replicate();

// create a new model
const model = await replicate.models.create(
  "my-username",
  "my-new-model",
  { visibility: "public", hardware: "gpu-a40-large" }
);

console.log(model);
Python usage
We’ve added this new operation to the Replicate Python client:
pip install --upgrade replicate
Then:
import replicate

model = replicate.models.create(
    owner="my-username",
    name="my-new-model",
    visibility="public",
    hardware="gpu-a40-large",
)

print(model)
Elixir usage
We’ve added this new operation to the Replicate Elixir client:
mix deps.update replicate
Then:
iex> {:ok, model} = Replicate.Models.create(
...>   owner: "your-username",
...>   name: "my-model",
...>   visibility: "public",
...>   hardware: "gpu-a40-large"
...> )
API docs
Check out the HTTP API reference for more detailed documentation about this new endpoint.
-
Improved training detail pages
October 17, 2023
When you kick off a training process to fine-tune your own model, there’s a page you can visit to view the status of the training, as well as the inputs and outputs. We’ve made some recent improvements to those pages:
- The header now shows the base model and destination model you used when fine-tuning.
- Metadata about the training job is now displayed above the inputs and outputs to make it easier to see the important details without having to scroll down the page.
- We’ve added code snippets you can copy and paste to run your new fine-tuned model.
-
Prediction parameters as JSON
October 9, 2023
You can now view prediction parameters as JSON from the prediction detail page.
This improves the workflow for experimenting in the web interface and then transitioning to making predictions from the API, using code.
-
API for listing public models
October 5, 2023
Replicate’s API now has an endpoint for listing public models.
You can use it to discover newly published models, and to build your own tools for exploring Replicate’s ecosystem of thousands of open-source models.
cURL usage
Here’s an example that uses cURL and jq to fetch the URLs of the 25 most recently updated public models:
curl -s -H "Authorization: Token $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models | jq ".results[].url"
The response is a paginated JSON array of model objects:
{
  "next": null,
  "previous": null,
  "results": [
    {
      "url": "https://replicate.com/replicate/hello-world",
      "owner": "replicate",
      "name": "hello-world",
      "description": "A tiny model that says hello",
      "visibility": "public",
      "github_url": "https://github.com/replicate/cog-examples",
      "paper_url": null,
      "license_url": null,
      "run_count": 5681081,
      "cover_image_url": "...",
      "default_example": {...},
      "latest_version": {...}
    }
  ]
}
JavaScript usage
We’ve added this new operation to the Replicate JavaScript client:
npm install replicate@latest
Then:
import Replicate from "replicate";

const replicate = new Replicate();

// get recently published models
const latestModels = await replicate.models.list();
console.log(latestModels);

// paginate and get all models
const allModels = [];
for await (const batch of replicate.paginate(replicate.models.list)) {
  allModels.push(...batch);
}
console.log(allModels);
Python usage
We’ve added this new operation to the Replicate Python client:
pip install --upgrade replicate
Then:
import replicate

models = replicate.models.list()
print(models)
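The next and previous fields in the response drive pagination, so you can also walk every page yourself by following each response's next URL. A rough sketch (the fetch_page stub stands in for an authenticated HTTP GET and is not part of the client):

```python
def list_all_models(fetch_page):
    """Collect model objects across pages by following each response's `next` URL.

    `fetch_page(url)` is a stand-in for an authenticated HTTP GET that
    returns the decoded JSON body shown above.
    """
    url = "https://api.replicate.com/v1/models"
    models = []
    while url:
        page = fetch_page(url)
        models.extend(page["results"])
        url = page["next"]  # None on the last page
    return models

# Example with canned pages instead of real HTTP calls:
pages = {
    "https://api.replicate.com/v1/models": {
        "next": "https://api.replicate.com/v1/models?cursor=2",
        "results": [{"name": "hello-world"}],
    },
    "https://api.replicate.com/v1/models?cursor=2": {
        "next": None,
        "results": [{"name": "another-model"}],
    },
}
print([m["name"] for m in list_all_models(pages.get)])
# ['hello-world', 'another-model']
```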
Making your own models discoverable
If you’re deploying your own public models to Replicate and want others to be able to discover them, make sure they meet the following criteria:
- The model is public.
- The model has at least one published version.
- The model has at least one example prediction. To add an example, create a prediction using the web interface then click the “Add to examples” button below the prediction output.
-
Deployments
October 3, 2023
You can now create a deployment to get more control over how your models run. Deployments allow you to run a model with a private, fixed API endpoint. You can configure the version of the model, the hardware it runs on, and how it scales.
Using deployments, you can:
- Roll out new versions of your model without having to edit your code.
- Keep instances always on to avoid cold boots.
- Customize what hardware your models run on.
- Monitor whether instances are booting up, running, or processing predictions.
- View predictions that are flowing through your models.
Deployments work with both public models and your own private models.
🚀 Check out the deployments guide to learn more and get started.
-
Prediction query parameter
October 2, 2023
When you create a prediction on the web, we now append a query parameter
?prediction=<uuid>
, so that if you refresh the page, you see that prediction instead of the form prefilled with the default inputs. Previously, if you created a prediction on the web and refreshed, you’d lose the prediction and have to go spelunking for it on your dashboard.
-
Fullscreen training logs
October 2, 2023
You can now expand your training logs and view them full-screen.
-
Dynamic status favicons
October 2, 2023
✅ We’ve added a new feature to show the prediction status in the favicon of the browser tab. This makes it easier to know when your running predictions have completed without having to switch tabs.
-
Streaming output for language models
August 14, 2023
Replicate’s API now supports server-sent event (SSE) streams for language models, giving you live output as the model is running. See the announcement blog post and our streaming guide for more details about how to use streaming output.
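Under the hood, an SSE stream is plain text: events are separated by blank lines, with event: and data: fields on their own lines. As a rough illustration of the format (this is my own minimal parser, not the official client; real applications should use a proper SSE library):

```python
def parse_sse(stream: str):
    """Yield (event, data) pairs from a raw server-sent-events stream."""
    event, data = "message", []
    for line in stream.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "":  # a blank line terminates an event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []

# A hypothetical stream of token-by-token language model output:
raw = "event: output\ndata: Hello\n\nevent: output\ndata: world\n\nevent: done\ndata: {}\n\n"
print(list(parse_sse(raw)))
# [('output', 'Hello'), ('output', 'world'), ('done', '{}')]
```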
-
Multiple API tokens for users
August 10, 2023
You can now create multiple personal API tokens at https://replicate.com/account/api-tokens
They’re just like organization tokens, but they’re only for your personal user account. You can name your tokens to make them distinguishable, and reset them if needed.
-
A40 GPUs now available
July 24, 2023
You can now run Replicate models on NVIDIA A40 GPUs. In terms of price and performance, the A40 sits between our T4 and A100 hardware. For many models, the A40 Large can be 80-90% as fast as A100s but almost half the price.
The A40 GPU is currently available in two configurations, each with the same GPU but attached to a machine with different amounts of CPU and RAM:
Hardware                 Price                                    GPU  CPU  GPU RAM  RAM
Nvidia A40 GPU           $0.0013 per second ($0.078 per minute)  1x   4x   48GB     16GB
Nvidia A40 (Large) GPU   $0.0016 per second ($0.096 per minute)  1x   10x  48GB     72GB

To choose which GPU type is used to run your model, see the Hardware dropdown on your model’s settings page:
To compare price and performance of all available GPUs, see our pricing page.
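To make the per-second prices above concrete, here's a quick back-of-the-envelope cost estimate in Python (prices as listed in the table; the helper is my own):

```python
# Per-second prices from the table above.
PRICE_PER_SECOND = {
    "gpu-a40-small": 0.0013,  # Nvidia A40 GPU
    "gpu-a40-large": 0.0016,  # Nvidia A40 (Large) GPU
}

def estimated_cost(sku: str, seconds: float) -> float:
    """Estimate the cost in dollars of a prediction that runs for `seconds`."""
    return round(PRICE_PER_SECOND[sku] * seconds, 4)

# A 30-second prediction on the A40 (Large):
print(estimated_cost("gpu-a40-large", 30))  # 0.048
```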
-
Hardware and pricing for trainable models
July 21, 2023
Trainable model pages now have a section that indicates training hardware type and cost.
See https://replicate.com/a16z-infra/llama7b-v2-chat#training
-
Training API for language models
July 20, 2023
We built a training API for fine-tuning language models, and today we’re making it available to all Replicate users.
You can fine-tune language models to make them better at a particular task, like classifying text, answering questions about your private data, being a chatbot, or extracting structured data from text.
You can train models like LLaMA 2, Flan T5, GPT-J and others. Check out the trainable language models collection to see what models can be fine-tuned, and stay tuned as we add support for more.
To get started, check out our guide to fine tuning a language model.
training = replicate.trainings.create(
    version="a16z-infra/llama7b-v2-chat:a845a72bb3fa3ae298143d13efa8873a2987dbf3d49c293513cd8abf4b845a83",
    input={
        "train_data": "https://example.com/my-training-data.jsonl",
    },
    destination="zeke/my-custom-llama-2"
)
-
Git commit and tag for model versions
May 19, 2023
You can now see the Git commit and tag used to create new versions of a model. Run brew upgrade cog to upgrade to the latest release. The next time you run cog push, the current Git commit and tag for your model will automatically be included in the resulting Docker image, and it’ll show up on Replicate under the model page’s Versions tab.
-
Downloading outputs
May 16, 2023
We’ve made a few improvements to the download mechanism on the website:
- If you ran a model that produces multiple outputs, you can download them all as a zip file.
- If you’re downloading a single output, it will now actually download instead of linking to the file’s URL.
Happy downloading!
-
Invoice breakdowns
May 16, 2023
You can now view a detailed summary of your invoices with a breakdown of cost, prediction time, and hardware for each model you ran.
Check out your billing settings at https://replicate.com/account/billing
-
Swift client library
May 9, 2023
Replicate now has a client library for Swift. It’s got everything you need to build an AI-powered app for iOS and macOS using Replicate’s HTTP API.
Add it as a package dependency to your project:
let package = Package(
    // name, platforms, products, etc.
    dependencies: [
        // other dependencies
        .package(url: "https://github.com/replicate/replicate-swift", from: "0.12.1"),
    ],
    targets: [
        .target(name: "<target>", dependencies: [
            // other dependencies
            .product(name: "Replicate", package: "replicate-swift"),
        ]),
        // other targets
    ]
)
Then, you can run predictions:
import Replicate

// Get your API token at https://replicate.com/account
private let replicate = Replicate.Client(token: <#token#>)

let output = try await replicate.run(
    "stability-ai/stable-diffusion:db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf",
    ["prompt": "a 19th century portrait of a wombat gentleman"]
)

print(output)
// ["https://replicate.com/api/models/stability-ai/stable-diffusion/files/50fcac81-865d-499e-81ac-49de0cb79264/out-0.png"]
Follow our guide to building a SwiftUI app or read the full documentation on GitHub.
-
Organizations
May 3, 2023
You can now use an organization to collaborate with other people on Replicate.
Organizations let you share access to models, API tokens, billing, dashboards, and more. When you run models as the organization, it gets billed to your shared credit card instead of your personal account.
To get started, use the new account menu to create your organization:
-
Node.js client library
April 4, 2023
Replicate now has a client library for Node.js. You can use it to run models and everything else you can do with the HTTP API.
Install it from npm:
npm install replicate
Then, you can run predictions:
import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

const model = "stability-ai/stable-diffusion:27b93a2413e7f36cd83da926f3656280b2931564ff050bf9575f1fdf9bcd7478";
const input = { prompt: "a 19th century portrait of a raccoon gentleman wearing a suit" };
const output = await replicate.run(model, { input });
// ['https://replicate.delivery/pbxt/GtQb3Sgve42ZZyVnt8xjquFk9EX5LP0fF68NTIWlgBMUpguQA/out-0.png']
Follow our guide to running a model from Node.js or read the full documentation on GitHub.
-
More useful metadata from the model API
March 21, 2023
The “get a model” API operation now returns more metadata about the model:
- run_count: an integer indicating how many times the model has been run.
- default_example: a prediction object created with this model, and selected by the model owner as an example of the model’s inputs and outputs.
- cover_image_url: an HTTPS URL string for an image file. This is an image uploaded by the model author, or an output file or input file from the model’s default example prediction.
Here’s an example using cURL and jq to get the Salesforce blip-2 model as a JSON object and pluck out some of its properties:
curl -s \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  "https://api.replicate.com/v1/models/salesforce/blip-2" \
  | jq "{owner, name, run_count, cover_image_url, default_example}"
Here’s what the response looks like:
{
  "owner": "salesforce",
  "name": "blip-2",
  "run_count": 270306,
  "cover_image_url": "https://replicate.delivery/pbxt/IJEPmgAlL2zNBNDoRRKFegTEcxnlRhoQxlNjPHSZEy0pSIKn/gg_bridge.jpeg",
  "default_example": {
    "completed_at": "2023-02-13T22:26:49.396028Z",
    "created_at": "2023-02-13T22:26:48.385476Z",
    "error": null,
    "id": "uhd4lhedtvdlbnm2cyhzx65zpe",
    "input": {
      "image": "https://replicate.delivery/pbxt/IJEPmgAlL2zNBNDoRRKFegTEcxnlRhoQxlNjPHSZEy0pSIKn/gg_bridge.jpeg",
      "caption": false,
      "question": "what body of water does this bridge cross?",
      "temperature": 1
    },
    "logs": "...",
    "metrics": {
      "predict_time": 0.949567
    },
    "output": "san francisco bay",
    "started_at": "2023-02-13T22:26:48.446461Z",
    "status": "succeeded",
    "version": "4b32258c42e9efd4288bb9910bc532a69727f9acd26aa08e175713a0a857a608"
  }
}
See the “get a model” API docs for more details.
-
Get model input and output schemas via the API
March 20, 2023
Every model on Replicate describes its inputs and outputs with OpenAPI Schema Objects in the openapi_schema property. This is a structured JSON object that includes the name, description, type, and allowed values for each input or output parameter.
Today we’ve improved our API reference documentation to clarify how to get a model’s input and output schema.
See the updated docs at https://replicate.com/docs/reference/http#models.versions.get
Here’s an example of how to get the input schema for Stable Diffusion using cURL:
curl -s \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  "https://api.replicate.com/v1/models/stability-ai/stable-diffusion/versions/db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf" \
  | jq ".openapi_schema.components.schemas.Input.properties"
Using this command, we can see all the inputs to Stable Diffusion, including their types, descriptions, and min and max values:
{
  "seed": {
    "type": "integer",
    "title": "Seed",
    "x-order": 7,
    "description": "Random seed. Leave blank to randomize the seed"
  },
  "prompt": {
    "type": "string",
    "title": "Prompt",
    "default": "a vision of paradise. unreal engine",
    "x-order": 0,
    "description": "Input prompt"
  },
  "scheduler": {
    "allOf": [
      { "$ref": "#/components/schemas/scheduler" }
    ],
    "default": "DPMSolverMultistep",
    "x-order": 6,
    "description": "Choose a scheduler."
  },
  "num_outputs": {
    "type": "integer",
    "title": "Num Outputs",
    "default": 1,
    "maximum": 4,
    "minimum": 1,
    "x-order": 3,
    "description": "Number of images to output."
  },
  "guidance_scale": {
    "type": "number",
    "title": "Guidance Scale",
    "default": 7.5,
    "maximum": 20,
    "minimum": 1,
    "x-order": 5,
    "description": "Scale for classifier-free guidance"
  },
  "negative_prompt": {
    "type": "string",
    "title": "Negative Prompt",
    "x-order": 2,
    "description": "Specify things to not see in the output"
  },
  "image_dimensions": {
    "allOf": [
      { "$ref": "#/components/schemas/image_dimensions" }
    ],
    "default": "768x768",
    "x-order": 1,
    "description": "pixel dimensions of output image"
  },
  "num_inference_steps": {
    "type": "integer",
    "title": "Num Inference Steps",
    "default": 50,
    "maximum": 500,
    "minimum": 1,
    "x-order": 4,
    "description": "Number of denoising steps"
  }
}
And here’s a command to get the output schema:
curl -s \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  "https://api.replicate.com/v1/models/stability-ai/stable-diffusion/versions/db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf" \
  | jq ".openapi_schema.components.schemas.Output"
From this command, we can see that Stable Diffusion’s output format is a list of URL strings:
{
  "type": "array",
  "items": {
    "type": "string",
    "format": "uri"
  },
  "title": "Output"
}
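Because the schema is just JSON, you can also inspect it programmatically, for example to build a default input payload for a model. A sketch in Python, using a trimmed copy of the Input properties above (the default_inputs helper is my own):

```python
# A trimmed copy of the Input properties shown above.
input_schema = {
    "prompt": {"type": "string", "default": "a vision of paradise. unreal engine", "x-order": 0},
    "num_outputs": {"type": "integer", "default": 1, "minimum": 1, "maximum": 4, "x-order": 3},
    "guidance_scale": {"type": "number", "default": 7.5, "minimum": 1, "maximum": 20, "x-order": 5},
}

def default_inputs(properties: dict) -> dict:
    """Build an input payload from each property's declared default, in x-order."""
    return {
        name: spec["default"]
        for name, spec in sorted(properties.items(), key=lambda kv: kv[1].get("x-order", 0))
        if "default" in spec
    }

print(default_inputs(input_schema))
```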
-
See more models
February 15, 2023
You can now browse through all the models on Replicate. Check them out on the Explore page!
-
Improved webhook events and event filtering
February 10, 2023
When you create a prediction with the API, you can provide a webhook URL for us to call when your prediction is complete.
Starting today, we now send more webhook events at different stages of the prediction lifecycle. We send requests to your webhook URL whenever there are new logs, new outputs, or the prediction has finished.
You can change which events trigger webhook requests by specifying webhook_events_filter in the prediction request.
- start: Emitted immediately on prediction start. This event is always sent.
- output: Emitted each time a prediction generates an output (note that predictions can generate multiple outputs).
- logs: Emitted each time log output is generated by a prediction.
- completed: Emitted when the prediction reaches a terminal state (succeeded/canceled/failed). This event is always sent.
For example, if you only wanted requests to be sent at the start and end of the prediction, you would provide:
{
  "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
  "input": { "text": "Alice" },
  "webhook": "https://example.com/my-webhook",
  "webhook_events_filter": ["start", "completed"]
}
Requests for event types output and logs will be sent at most once every 500ms. Requests for event types start and completed will always be sent.
If you’re using the old webhook_completed property, you’ll still get the same webhooks as before, but we recommend updating to use the new webhook and webhook_events_filter properties.
Docs: https://replicate.com/docs/reference/http#create-prediction--webhook
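When assembling the request in code, it's easy to typo an event name in the filter; a small sketch that validates the filter before sending (the prediction_request helper is my own, not part of any client):

```python
# The four event types documented above.
VALID_EVENTS = {"start", "output", "logs", "completed"}

def prediction_request(version: str, input: dict, webhook: str,
                       events=("start", "completed")) -> dict:
    """Assemble a prediction request body with a validated webhook event filter."""
    unknown = set(events) - VALID_EVENTS
    if unknown:
        raise ValueError(f"unknown webhook events: {sorted(unknown)}")
    return {
        "version": version,
        "input": input,
        "webhook": webhook,
        "webhook_events_filter": list(events),
    }

body = prediction_request(
    "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
    {"text": "Alice"},
    "https://example.com/my-webhook",
)
print(body["webhook_events_filter"])  # ['start', 'completed']
```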
-
Python example code improvements
January 19, 2023
We’ve made it even easier to start building with Replicate’s API. When you click on the API tab for a model, the Python example code now has everything you need to run a prediction, including code for all of the model’s inputs and outputs. These new code snippets include documentation and defaults for each input, so you can focus on coding, with less context switching between the API docs and your editor.
To learn more about how to get started, check out our “Run a model from Python” guide.
-
Cancel long running predictions
January 16, 2023
Have you ever kicked off a prediction, then, after thinking about it, realized you got one of the settings wrong or wanted to tweak the prompt? Well, now you can cancel that prediction, even if you’ve navigated away from the page you were on or created it through the API. On the website, go to your dashboard, find the running prediction, and you’ll now see it live-updating with a handy “Cancel” button.
-
brew install cog
January 10, 2023
🍏 Hey macOS users! There’s now a Homebrew formula for Cog. Use brew install cog to install it, and brew upgrade cog to upgrade to the latest version. See https://github.com/replicate/cog#install
-
Dreambooth support for img2img
January 9, 2023
We’ve added img2img support to models created with our DreamBooth API.
This means you can optionally send both a prompt and an initial image to generate new images (in addition to the other parameters specified in your DreamBooth model’s API page).
Input:
- prompt: photo of zeke playing guitar on stage at concert
- image: https://www.pexels.com/photo/man-playing-red-and-black-electric-guitar-on-stage-167382/
Output:
To get started building and pushing your own DreamBooth model, check out the blog post.
-
Delete predictions from the web
January 6, 2023
You can now manually delete a prediction on the website. You’ll find a “Delete” button on the prediction detail page, e.g. https://replicate.com/p/{prediction_id}. Clicking this link will completely remove the prediction from the site, including any output data and output files associated with it.
-
API prediction data no longer stored
January 4, 2023
By popular request, we no longer store data for predictions made using the API.
User data is automatically removed from predictions an hour after they finish. The prediction itself is not deleted, but the input and output data for the prediction is removed. This applies only to predictions created with the API, not predictions created on the website.
This is enabled for new accounts starting today, but we know that some users may be relying on prediction data to exist for more than an hour, so we’ve not enabled this for any existing accounts. If you want this enabled for your account, email us at team@replicate.com
-
A proper changelog
December 6, 2022
We now have a changelog for product updates at replicate.com/changelog. We used to use a single tweet thread as our makeshift changelog, but decided it was time to make something a bit more flexible. Stay tuned for more frequent updates here!
-
Stable diffusion has release notes
December 2, 2022
Stable Diffusion now has release notes so you can see what’s changed: replicate.com/stability-ai/stable-diffusion/versions. It only works on Stable Diffusion at the moment. Coming soon to all models so you can set release notes on your own models and see what has changed on other people’s models.
-
Infrastructure improvements
November 4, 2022
Over the past few weeks, we’ve made some major improvements to our infrastructure to make it more reliable and perform better. Nothing’s changed from your point of view, but you’ll be seeing faster response times! 🚀
-
Higher rate limits
November 4, 2022
We’ve also increased our default rate limits. You can create 10 predictions a second, bursting up to 600 predictions a second. https://replicate.com/docs/reference/http#rate-limits
We can support higher rates too – just email us: team@replicate.com
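If you're batching lots of predictions, a client-side limiter helps you stay under these defaults. A minimal token-bucket sketch, under one common reading of the limits above (sustained rate as the refill, burst as the bucket capacity; this is my own illustration, not Replicate's implementation):

```python
import time

class TokenBucket:
    """Client-side rate limiter: `rate` tokens/sec refill, `capacity` burst."""

    def __init__(self, rate: float = 10.0, capacity: float = 600.0):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Spend one token if available; call before each prediction request."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Tiny capacity for demonstration: a burst of 3 succeeds, the 4th must wait.
bucket = TokenBucket(rate=10, capacity=3)
print([bucket.try_acquire() for _ in range(4)])
```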
-
Run your own models on Nvidia A100 GPUs
October 19, 2022
You can now run your own models on Nvidia A100s. Click the settings tab on your model and select the hardware option to upgrade. 🚀
-
Set a monthly spend limit
October 17, 2022
You can now set a monthly spend limit on your account to avoid getting a surprising bill. 🦆
To set a limit, visit https://replicate.com/account#limits
-
Webhook support in Predictions API
September 9, 2022
🪝 Our API now supports webhooks, as an alternative to polling. Specify your webhook URL when creating a prediction and we’ll POST to that URL when your prediction has completed! See the API docs here: https://replicate.com/docs/reference/http#create-prediction
-
Introducing model collections
June 16, 2022
We’ve started curating collections of models that perform similar tasks. First up is an assortment of ✨style transfer✨ models that take a content image and a style reference to produce a new image, like this starry night cat. https://replicate.com/collections/style-transfer
-
Scrubbing support for progressive outputs
March 28, 2022
When you run a model that changes over time, you can scrub back and forth to see the previous output. We’ve now added that scrubber to predictions in the example gallery, so you can see how they morphed into being: https://replicate.com/pixray/text2image/examples