Run time and cost

This model runs on NVIDIA A100 (40 GB) GPU hardware. Predictions typically complete within 70 seconds, though the predict time varies significantly with the inputs.


Vicuna-13B is an open-source chatbot based on LLaMA-13B. It was developed by fine-tuning LLaMA-13B on user-shared conversations collected from ShareGPT. LLaMA is a language model from Meta Research that performs as well as comparable closed-source models. In a preliminary evaluation that used GPT-4 to judge model outputs, the developers of Vicuna-13B found that it outperforms comparable models such as Stanford Alpaca and reaches more than 90% of the quality of OpenAI’s ChatGPT and Google Bard.