You can now monitor how much GPU memory your deployments are using. The visualization shows the total memory available to your deployment, along with the median and maximum usage across all your instances over the last 2 or 24 hours.
Take a look at replicate.com/deployments.
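
To make the median and max figures concrete, here is a minimal sketch of how such a summary could be computed from per-instance memory samples. It is only an illustration: the sample format, the `summarize_gpu_memory` helper, and the choice to pool samples from all instances before aggregating are assumptions, not a description of how the dashboard is actually implemented.

```python
from datetime import datetime, timedelta
from statistics import median


def summarize_gpu_memory(samples, now, window_hours):
    """Return (median, max) of used GPU memory within the time window.

    `samples` is a hypothetical list of (instance_id, timestamp, used_gb)
    tuples. Samples from all instances are pooled together, which is an
    assumption made for this sketch.
    """
    cutoff = now - timedelta(hours=window_hours)
    used = [gb for _, ts, gb in samples if cutoff <= ts <= now]
    if not used:
        return None  # no samples in the window
    return median(used), max(used)


# Made-up readings from two instances, for illustration only.
now = datetime(2024, 1, 1, 12, 0)
samples = [
    ("instance-a", now - timedelta(hours=1), 10.0),
    ("instance-b", now - timedelta(hours=1), 14.0),
    ("instance-a", now - timedelta(hours=5), 30.0),
    ("instance-b", now - timedelta(hours=20), 12.0),
]

print(summarize_gpu_memory(samples, now, 2))   # (12.0, 14.0)
print(summarize_gpu_memory(samples, now, 24))  # (13.0, 30.0)
```

In this made-up data, the 24-hour window picks up a 30 GB spike that the 2-hour window misses, which is why the two time ranges can tell different stories about the same deployment.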