joehoover / instructblip-vicuna13b

An instruction-tuned multi-modal model based on BLIP-2 and Vicuna-13B


Run time and cost

This model runs on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 2 seconds, though the predict time varies significantly with the inputs.