Learn how to package a trained model using Cog and publish it to Replicate.

Prerequisites

  • A trained model. If you don't already have your own trained model, you can use one from replicate/cog-examples.
  • macOS or Linux. You'll be using the Cog command-line tool to build and push your model. Cog works on macOS and Linux, but does not currently support Windows.
  • Docker. Cog uses Docker to create a container for your model. You'll need to install Docker before you can run Cog.

Create a Replicate account

Before you can publish your model, you'll need a Replicate account. Replicate is currently in closed beta while we iron out the wrinkles. If you have a beta invite link you can use that to sign up. If you don't have a beta invite, come talk to us in Discord or send an email to team@replicate.com.

Create a model page

Next you'll create a page for your model on Replicate. Visit replicate.com/create to choose a name for your model, and specify whether it should be public or private.

Install Cog

Cog is an open source tool that makes it easy to put a machine learning model in a Docker container. Run this to install it:

sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
sudo chmod +x /usr/local/bin/cog
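
To check that the installation worked, you can ask Cog for its version (this assumes /usr/local/bin is on your PATH):

cog --version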

More information about Cog, including its full documentation, is available on GitHub.

Initialize Cog

To configure your project for use with Cog, you'll need to add two files:

  • cog.yaml, which defines the environment your model runs in and its dependencies
  • predict.py, which defines how predictions are run on your model

Use the cog init command to generate these files in your project:

cd path/to/your/model
cog init

Define your dependencies

The cog.yaml file defines all the different things that need to be installed for your model to run. You can think of it as a simple way of defining a Docker image.

For example:

build:
  python_version: "3.8"
  python_packages:
    - "torch==1.7.0"

This will generate a Docker image with Python 3.8 and PyTorch 1.7 installed, along with various other sensible defaults.

Using GPUs

To use GPUs, add the gpu: true option to the build section of your cog.yaml:

build:
  gpu: true
  # ...

Cog will use the nvidia-docker base image and automatically figure out what versions of CUDA and cuDNN to use based on the versions of Python, PyTorch, and TensorFlow that you are using.
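
Putting this together with the earlier example, a GPU-enabled cog.yaml might look like this:

build:
  gpu: true
  python_version: "3.8"
  python_packages:
    - "torch==1.7.0"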

Running commands

To run a command inside this environment, prefix it with cog run:

$ cog run python
✓ Building Docker image from cog.yaml... Successfully built 8f54020c8981
Running 'python' in Docker with the current directory mounted as a volume...
────────────────────────────────────────────────────────────────────────────────────────

Python 3.8.10 (default, May 12 2021, 23:32:14)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

This is handy for ensuring a consistent environment for development or training.
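
For example, if your project has a training script (train.py is just an illustrative name here), you can run it in the same environment:

cog run python train.py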

With cog.yaml, you can also install system packages and other things. Take a look at the full reference to see what else you can do.
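
As a sketch, here's how you might install a system package alongside your Python dependencies — system_packages lists packages to install with apt-get, and ffmpeg is purely illustrative:

build:
  python_version: "3.8"
  system_packages:
    - "ffmpeg"
  python_packages:
    - "torch==1.7.0"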

Define how to run predictions

The next step is to update predict.py to define the interface for running predictions on your model. The predict.py generated by cog init looks something like this:

from cog import BasePredictor, Path, Input
import torch

class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.net = torch.load("weights.pth")

    def predict(self,
            image: Path = Input(description="Image to enlarge"),
            scale: float = Input(description="Factor to scale image by", default=1.5)
    ) -> Path:
        """Run a single prediction on the model"""
        # ... pre-processing ...
        output = self.net(input)
        # ... post-processing ...
        return output

Edit your predict.py file and fill in the functions with your own model's setup and prediction code. You might need to import parts of your model from another file.

You also need to define the inputs to your model as arguments to the predict() function, as demonstrated above. Each argument must be annotated with a type. The supported types are:

  • str: a string
  • int: an integer
  • float: a floating point number
  • bool: a boolean
  • cog.File: a file-like object representing a file
  • cog.Path: a path to a file on disk

You can provide more information about the input with the Input() function, as shown above. It takes these basic arguments:

  • description: A description of what to pass to this input for users of the model
  • default: A default value to set the input to. If this argument is not passed, the input is required. If it is explicitly set to None, the input is optional.
  • ge: For int or float types, the value should be greater than or equal to this number.
  • le: For int or float types, the value should be less than or equal to this number.
  • choices: For str or int types, a list of possible values for this input.
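
For instance, here's a sketch of how these arguments combine. The bounds on scale and the style input are illustrative values, not part of the generated file:

scale: float = Input(
    description="Factor to scale image by",
    default=1.5,
    ge=1.0,  # illustrative lower bound
    le=4.0,  # illustrative upper bound
)
style: str = Input(
    description="Style to apply",  # hypothetical input, shown to demonstrate choices
    choices=["photo", "anime"],
    default="photo",
)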

There are some more advanced options you can pass, too. For more details, take a look at the prediction interface documentation.

Next, add the line predict: "predict.py:Predictor" to your cog.yaml, so it looks something like this:

build:
  python_version: "3.8"
  python_packages:
    - "torch==1.7.0"
predict: "predict.py:Predictor"

That's it!

Test your model locally

To test that this works, try running a prediction on the model:

$ cog predict -i image=@input.jpg
✓ Building Docker image from cog.yaml... Successfully built 664ef88bc1f4
✓ Model running in Docker image 664ef88bc1f4

Written output to output.png

To pass more inputs to the model, you can add more -i options:

$ cog predict -i image=@input.jpg -i scale=2.0

The scale input is just a number, not a file, so it doesn't need the @ prefix.
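
If your version of Cog supports it, the -o flag lets you choose where the output file is written (run cog predict --help to see the flags available in your version):

$ cog predict -i image=@input.jpg -o my-output.png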

Push your model

Now that you've configured your model for use with Cog, it's time to publish it to the Replicate registry:

cog login
cog push r8.im/your-username/your-model

Note: You can also set the image property in your cog.yaml file. This allows you to run cog push without specifying the image, and also makes your Replicate model page more discoverable for folks reading your model's source code.
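
For example, the top of your cog.yaml might then look something like this, with the image name matching the model page you created earlier:

image: "r8.im/your-username/your-model"
build:
  python_version: "3.8"
  # ...

With that set, you can run cog push with no arguments.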

Run predictions

Once you've pushed your model to the registry it will be visible on the website, and you can use the web-based form to run predictions using your model.

Whenever you generate a prediction that you like, click the "Add to example gallery" button to display that output on your model page.

Share your model

Congratulations! You've now got a hosted machine learning model with a web-based demo that anyone can use! Now it's time to share it with your friends, peers, and colleagues. Share a link to the model page, or use the "Share a link to this output" button to share the output of a specific prediction from your model.

Troubleshooting

Did something go wrong along the way? Let us know and we'll help. If you encountered a problem with Cog, you can file a GitHub issue.

Otherwise chat with us in Discord or send us an email at team@replicate.com.
