NVIDIA H100 GPUs are here – Replicate blog

You can now run NVIDIA H100 GPUs on Replicate.

You can also now use 2x, 4x, and 8x configurations of A100 and L40S GPUs. These were previously available only in deployments, but you can now use them for regular models and training runs too.

If you’ve been waiting to speed up your model or try something more powerful, now’s a good time.

H100 pricing

1x H100s are now available to everyone.

2x, 4x, and 8x H100s are currently reserved for customers with committed spend contracts.

Email us at team@replicate.com if you want access.

Hardware	Price (per sec)	Price (per hour)	GPU	GPU RAM	CPU	RAM
H100	$0.001525	$5.49	1x	80GB	13x	72GB
2x H100	$0.003050	$10.98	2x	160GB	–	–
4x H100	$0.006100	$21.96	4x	320GB	–	–
8x H100	$0.012200	$43.92	8x	640GB	–	–
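
Billing is listed per second, so a rough cost estimate is the per-second price multiplied by the run time. Here's a minimal sketch of that arithmetic, using prices from the table above. The `estimate_cost` helper is hypothetical and not part of any Replicate tool, and it assumes cost scales linearly with run time:

```shell
#!/bin/sh
# estimate_cost is a hypothetical helper, not part of any Replicate tool.
# It multiplies a per-second price by a duration in seconds and prints
# the result in dollars, rounded to two decimal places.
estimate_cost() {
  price_per_sec="$1"   # e.g. 0.001525 for 1x H100 (from the table above)
  seconds="$2"         # how long the hardware runs
  awk -v p="$price_per_sec" -v s="$seconds" 'BEGIN { printf "%.2f\n", p * s }'
}

# One hour on a single H100
estimate_cost 0.001525 3600

# A 90-second prediction on a single H100
estimate_cost 0.001525 90
```

The one-hour figure matches the per-hour column in the table, which is the per-second price multiplied by 3600.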

A100 pricing (2x, 4x, 8x)

These multi-GPU setups for A100s are now available for models (they were already available for deployments):

Hardware	Price (per sec)	Price (per hour)	GPU	GPU RAM	CPU	RAM
2x A100 (80GB)	$0.002800	$10.08	2x	160GB	20x	288GB
4x A100 (80GB)	$0.005600	$20.16	4x	320GB	40x	576GB
8x A100 (80GB)	$0.011200	$40.32	8x	640GB	80x	960GB

See the full hardware pricing list for more details.

L40S pricing (2x, 4x, 8x)

These multi-GPU setups for L40S GPUs are now available for models (they were already available for deployments):

Hardware	Price (per sec)	Price (per hour)	GPU	GPU RAM	CPU	RAM
2x L40S	$0.001950	$7.02	2x	96GB	20x	144GB
4x L40S	$0.003900	$14.04	4x	192GB	40x	288GB
8x L40S	$0.007800	$28.08	8x	384GB	80x	576GB

See the full hardware pricing list for more details.

Creating a new model using an H100 GPU

You can create a new model on the web or using the HTTP API.

Here’s a cURL command to create a new model that uses an H100 GPU:

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"owner": "my-username", "name": "my-model", "description": "An example model", "visibility": "private", "hardware": "gpu-h100"}' \
  https://api.replicate.com/v1/models

Listing available hardware via API

Here’s a cURL command to list available hardware for your account:

curl -s -X GET \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/hardware

This command returns the hardware options available to your account. Each entry's sku value is what you pass in the hardware field when creating a new model via the API:

[
  {
    "sku": "cpu",
    "name": "CPU"
  },
  {
    "sku": "gpu-a100-large",
    "name": "Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-a100-large-2x",
    "name": "2x Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-a100-large-4x",
    "name": "4x Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-a100-large-8x",
    "name": "8x Nvidia A100 (80GB) GPU"
  },
  {
    "sku": "gpu-h100",
    "name": "Nvidia H100 GPU"
  },
  {
    "sku": "gpu-l40s",
    "name": "Nvidia L40S GPU"
  },
  {
    "sku": "gpu-l40s-2x",
    "name": "2x Nvidia L40S GPU"
  },
  {
    "sku": "gpu-l40s-4x",
    "name": "4x Nvidia L40S GPU"
  },
  {
    "sku": "gpu-l40s-8x",
    "name": "8x Nvidia L40S GPU"
  },
  {
    "sku": "gpu-t4",
    "name": "Nvidia T4 GPU"
  }
]
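
If you're scripting model creation, you might want to check that a SKU appears in this response before using it. Here's a rough sketch that pulls the sku values out with grep and sed. The `list_skus` and `has_sku` helpers are hypothetical names, and the pattern matching assumes the response is shaped like the one above. If you have jq installed, `jq -r '.[].sku'` is a more robust way to do the same extraction.

```shell
#!/bin/sh
# list_skus and has_sku are hypothetical helpers for illustration.
# list_skus reads a /v1/hardware JSON response on stdin and prints
# one SKU per line.
list_skus() {
  grep -o '"sku": *"[^"]*"' | sed 's/.*: *"\([^"]*\)"/\1/'
}

# has_sku succeeds only if the given SKU exactly matches one entry
# in the response on stdin.
has_sku() {
  list_skus | grep -qxF -- "$1"
}

# Example using a trimmed copy of the response shown above
sample='[{"sku": "cpu", "name": "CPU"}, {"sku": "gpu-h100", "name": "Nvidia H100 GPU"}]'
if printf '%s\n' "$sample" | has_sku gpu-h100; then
  echo "gpu-h100 is available"
fi
```

In practice you would pipe the output of the cURL command above into `has_sku` instead of using the sample string.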

Updating your deployments

If you’re using a deployment, you can update the hardware configuration to use H100s or any of these new multi-GPU setups.

You can edit your deployment configuration on the web or use the HTTP API.
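
Here's a sketch of what the HTTP API route might look like. It assumes the deployments endpoint accepts a PATCH request with a hardware field at /v1/deployments/{owner}/{name}, so check the API reference before relying on it. The helper names, `my-username`, and `my-deployment` are placeholders.

```shell
#!/bin/sh
# Build the URL for a deployment. The path shape is an assumption;
# confirm it against the HTTP API reference.
deployment_url() {
  printf 'https://api.replicate.com/v1/deployments/%s/%s\n' "$1" "$2"
}

# Send a PATCH that changes only the hardware SKU of a deployment.
# Hypothetical helper: takes owner, deployment name, and a SKU such as
# gpu-h100 (SKU names come from the /v1/hardware listing).
update_deployment_hardware() {
  owner="$1"; name="$2"; sku="$3"
  curl -s -X PATCH \
    -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
    -H 'Content-Type: application/json' \
    -d "{\"hardware\": \"$sku\"}" \
    "$(deployment_url "$owner" "$name")"
}

# Usage (makes a real API call, so it is left commented out):
# update_deployment_hardware my-username my-deployment gpu-h100
```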

If you’re not sure how to best configure your deployments, email us at support@replicate.com.