GPUs are specialized processors that are designed to handle complex mathematical calculations. Many machine learning models will only run on a computer with a GPU. While GPUs are incredibly powerful, setting up a machine that can use them can be challenging. GPUs require specific drivers and software to work properly, which can be difficult to install and configure.
In this guide, you'll learn how to get your own GPU machine in the cloud, so you can package your model and push it to Replicate.
Lambda Labs is a cloud provider that offers GPU machines that come preconfigured with Docker and NVIDIA drivers, which makes them a great fit for working with Cog.
Create an account at lambdalabs.com/service/gpu-cloud and enter your billing info. You'll be able to run GPU machines for as little as $0.50/hour.
Once you've got a Lambda account, create a new GPU Cloud instance. You'll be asked to specify three settings, including the instance type and region:
Instance type: Choose "1x A10 (24 GB PCIe)". Start by choosing the smallest instance type. You can upgrade to a larger instance type later if you need more power.
Region: Choose the region closest to you, for example "us-west-1".
Next you'll be asked to provide your public SSH key so you can easily log into your new instance using SSH. If you've already set up your SSH keys for another service like GitHub, you can use your existing public key. Use a command like this to copy your public key to your clipboard:
If you don't have one already, check out GitHub's docs for generating an SSH key.
Your GPU Cloud instance will be launched in a few minutes. Once it's ready, you can access it through SSH or JupyterLab.
To SSH into your instance, copy the "SSH login" command from your Lambda dashboard, then run it:
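The command shown in your dashboard is the authoritative one. As a rough sketch of its likely shape, assuming the instance uses the `ubuntu` user, it can be wrapped in a small helper. The function name `connect_to_instance` is hypothetical.

```shell
# Hypothetical helper: open an SSH session to a Lambda instance.
# Assumption: the login user is "ubuntu"; check the "SSH login" command in your dashboard.
connect_to_instance() {
  instance_ip="$1"   # IP address shown beside your instance in the Lambda dashboard
  ssh "ubuntu@${instance_ip}"
}
```

For example, `connect_to_instance 203.0.113.10`, where the address is a placeholder that you replace with your instance's IP.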
To access your instance using JupyterLab, click the "Launch" button beside your new instance in the Lambda dashboard.
Cog is Replicate's open-source tool that makes it easy to put a machine learning model in a Docker container. You use it to package your trained model and push it to Replicate.
Using the terminal (either from your SSH session or inside JupyterLab), run the following command to install Cog on your instance:
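The sketch below follows the install steps from Cog's README: it downloads the latest release binary for your operating system and CPU architecture, then marks it executable. Wrapping the steps in a function named `install_cog` is an illustrative choice, not part of Cog.

```shell
# Hypothetical wrapper around the install steps from Cog's README.
install_cog() {
  # The release asset name depends on OS and CPU architecture, e.g. cog_Linux_x86_64.
  cog_url="https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
  # Download the binary to a directory on the PATH, then make it executable.
  sudo curl -o /usr/local/bin/cog -L "$cog_url" &&
    sudo chmod +x /usr/local/bin/cog
}
```

Run `install_cog` to perform the installation, then `cog --version` to confirm the binary is on your path.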
To verify that your new instance is working properly, you can run a prediction on an existing model on Replicate.
Run the following command in the terminal to download the Stable Diffusion model and run it locally on your new instance:
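A sketch of what that command can look like, assuming the model is published as `stability-ai/stable-diffusion` on Replicate. The version hash is deliberately left as a parameter: copy the full hash from the model's page rather than relying on any value here. The function name `run_stable_diffusion` is hypothetical.

```shell
# Hypothetical helper: run a prediction against a published Stable Diffusion version.
run_stable_diffusion() {
  version_id="$1"   # sha256 version hash copied from the model's page on Replicate
  prompt="$2"       # text prompt passed to the model as the "prompt" input
  sudo cog predict "r8.im/stability-ai/stable-diffusion@sha256:${version_id}" -i prompt="$prompt"
}
```

For example, `run_stable_diffusion <version-hash> "a pot of gold"`, with `<version-hash>` replaced by the real hash.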
👆 Note: It's important to use sudo here so Cog can work properly with the Docker installation on your instance.
JupyterLab is a web-based editor that makes it easy to run models interactively and view the files on your instance. Lambda's GPU Cloud instances are preconfigured with JupyterLab.
To access JupyterLab, click the "Launch" button beside your new instance in the Lambda dashboard.
You should see your output file in the JupyterLab file browser. Click on it to view the output.
You've now got a working GPU machine in the cloud!
Now it's time to build your own model and push it to Replicate.
Lambda's GPU Cloud instances remain active until you terminate them, so you'll be charged for them until you shut them down. To terminate your instance, go to the Lambda dashboard and click "Terminate" on your instance.