A ControlNet model designed to enhance the temporal consistency of generated outputs
A monolingual English embedding model supporting a sequence length of 8192 (137M version)
A monolingual English embedding model supporting a sequence length of 8192 (33M version)
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)
LLaVA v1.6: Large Language and Vision Assistant (Vicuna-13B)