NVIDIA H100 GPUs are here

Posted May 16, 2025

You can now run NVIDIA H100 GPUs on Replicate.

You can also now use 2x, 4x, and 8x configurations of A100 and L40S GPUs. These were previously available only in deployments; now you can use them for regular models and training runs too.

If you’ve been waiting to speed up your model or try something more powerful, now’s a good time.

H100 pricing

1x H100s are now available to everyone.

2x, 4x, and 8x H100s are currently reserved for committed spend contracts.

Email us at team@replicate.com if you want access.

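| Hardware | Price (per sec) | Price (per hour) | GPU | GPU RAM | CPU | RAM |
|----------|-----------------|------------------|-----|---------|-----|------|
| H100     | $0.001525       | $5.49            | 1x  | 80GB    | 13x | 72GB |
| 2x H100  | $0.003050       | $10.98           | 2x  | 160GB   |     |      |
| 4x H100  | $0.006100       | $21.96           | 4x  | 320GB   |     |      |
| 8x H100  | $0.012200       | $43.92           | 8x  | 640GB   |     |      |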
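
The hourly prices are the per-second prices multiplied by 3,600. Here’s a rough sketch for estimating what a run might cost, assuming the cost is simply the per-second price times the run time in seconds. The estimate_cost helper is a made-up name for illustration, not part of any Replicate tooling:

```shell
# Estimate the cost of a run in dollars.
# estimate_cost is an illustrative helper (an assumption, not a Replicate tool).
# Assumes cost = per-second price * run time in seconds.
estimate_cost() {
  price_per_sec="$1"
  seconds="$2"
  LC_ALL=C awk -v p="$price_per_sec" -v s="$seconds" 'BEGIN { printf "%.2f\n", p * s }'
}

# One hour on a single H100 at $0.001525 per second
estimate_cost 0.001525 3600   # prints 5.49

# One hour on 8x H100 at $0.012200 per second
estimate_cost 0.012200 3600   # prints 43.92
```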

A100 pricing (2x, 4x, 8x)

These multi-GPU setups for A100s are now available for models (they were already available for deployments):

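| Hardware       | Price (per sec) | Price (per hour) | GPU | GPU RAM | CPU | RAM   |
|----------------|-----------------|------------------|-----|---------|-----|-------|
| 2x A100 (80GB) | $0.002800       | $10.08           | 2x  | 160GB   | 20x | 288GB |
| 4x A100 (80GB) | $0.005600       | $20.16           | 4x  | 320GB   | 40x | 576GB |
| 8x A100 (80GB) | $0.011200       | $40.32           | 8x  | 640GB   | 80x | 960GB |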

See the full hardware pricing list for more details.

L40S pricing (2x, 4x, 8x)

These multi-GPU setups for L40S GPUs are now available for models (they were already available for deployments):

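| Hardware | Price (per sec) | Price (per hour) | GPU | GPU RAM | CPU | RAM   |
|----------|-----------------|------------------|-----|---------|-----|-------|
| 2x L40S  | $0.001950       | $7.02            | 2x  | 96GB    | 20x | 144GB |
| 4x L40S  | $0.003900       | $14.04           | 4x  | 192GB   | 40x | 288GB |
| 8x L40S  | $0.007800       | $28.08           | 8x  | 384GB   | 80x | 576GB |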

See the full hardware pricing list for more details.

Creating a new model using an H100 GPU

You can create a new model on the web or using the HTTP API.

Here’s a cURL command to create a new model that uses an H100 GPU:

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"owner": "my-username", "name": "my-model", "description": "An example model", "visibility": "private", "hardware": "gpu-h100"}' \
  https://api.replicate.com/v1/models

Listing available hardware via API

Here’s a cURL command to list available hardware for your account:

curl -s -X GET \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/hardware

The response lists every hardware option available to your account. Each entry’s "sku" value is what you pass in the hardware field when creating a new model via the API:

[
  {
    "sku": "cpu",
    "name": "CPU"
  },
  {
    "sku": "gpu-a100-large",
    "name": "Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-a100-large-2x",
    "name": "2x Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-a100-large-4x",
    "name": "4x Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-a100-large-8x",
    "name": "8x Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-h100",
    "name": "Nvidia H100 GPU"
  },
  {
    "sku": "gpu-l40s",
    "name": "Nvidia L40S GPU"
  },
  {
    "sku": "gpu-l40s-2x",
    "name": "2x Nvidia L40S GPU"
  },
  {
    "sku": "gpu-l40s-4x",
    "name": "4x Nvidia L40S GPU"
  },
  {
    "sku": "gpu-l40s-8x",
    "name": "8x Nvidia L40S GPU"
  },
  {
    "sku": "gpu-t4",
    "name": "Nvidia T4 GPU"
  }
]
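
If you only want the SKU names, you can filter the response. Here’s a minimal sketch using grep and cut, assuming the response keeps the flat shape shown above. The list_skus helper is a hypothetical name, and a real JSON parser would be more robust:

```shell
# Print only the "sku" values from a /v1/hardware response read on stdin.
# list_skus is an illustrative helper; it assumes the flat JSON shape shown above.
list_skus() {
  grep -o '"sku": *"[^"]*"' | cut -d'"' -f4
}

# Example with a trimmed copy of the response above
printf '%s\n' '[{"sku": "gpu-h100", "name": "Nvidia H100 GPU"}, {"sku": "gpu-l40s-2x", "name": "2x Nvidia L40S GPU"}]' | list_skus

# Against the live API (not run here):
#   curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
#     https://api.replicate.com/v1/hardware | list_skus
```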

Updating your deployments

If you’re using a deployment, you can update the hardware configuration to use H100s or any of these new multi-GPU setups.

You can edit your deployment configuration on the web or use the HTTP API.
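
Here’s a sketch of what the API route might look like, assuming the deployment update endpoint accepts a hardware field with one of the SKUs listed above. The helper names, my-username, and my-deployment are placeholders, so check the HTTP API reference before relying on this:

```shell
# Build the JSON body for a deployment hardware change.
# hardware_payload and update_deployment_hardware are illustrative helpers.
hardware_payload() {
  printf '{"hardware": "%s"}' "$1"
}

# Send a PATCH request for the deployment.
# Assumes the endpoint takes the owner and deployment name in the URL path.
update_deployment_hardware() {
  owner="$1"
  name="$2"
  sku="$3"
  curl -s -X PATCH \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
    -H 'Content-Type: application/json' \
    -d "$(hardware_payload "$sku")" \
    "https://api.replicate.com/v1/deployments/$owner/$name"
}

# Usage (not run here):
#   update_deployment_hardware my-username my-deployment gpu-h100
```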

If you’re not sure how to best configure your deployments, email us at support@replicate.com.