daanelson / imagebind

A model for text, audio, and image embeddings in one space
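
Because all three modalities share one embedding space, a text embedding can be compared directly with an image or audio embedding. The sketch below illustrates that comparison with cosine similarity. It does not call the model: the three vectors are made-up toy values, and the names `cosine_similarity`, `text_embedding`, `image_embedding`, and `audio_embedding` are hypothetical, chosen only for illustration. Real embeddings from the model would be much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for model output (NOT real embeddings).
text_embedding = [0.9, 0.1, 0.0, 0.2]   # e.g. a caption
image_embedding = [0.8, 0.2, 0.1, 0.1]  # e.g. a matching photo
audio_embedding = [0.0, 0.1, 0.9, 0.3]  # e.g. an unrelated sound clip

text_image = cosine_similarity(text_embedding, image_embedding)
text_audio = cosine_similarity(text_embedding, audio_embedding)

# The matching pair should score higher than the unrelated pair.
print(f"text vs image: {text_image:.3f}")
print(f"text vs audio: {text_audio:.3f}")
```

With real embeddings, ranking candidates by this score is one common way to do cross-modal retrieval, such as finding the image that best matches a text query.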


Run time and cost

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 1 second.