SOTA open-source model for chatting with videos and the newest model in the Qwen family
Generate high-quality videos from a prompt
E5 embeddings finetuned for instruction-based tasks, built on Mistral.
Flux finetuned for black and white line art.
Llama-3-8B finetuned with ReFT to hyperfocus on New Jersey, the Garden State, the best state, the only state!
GLM-4V is a multimodal model released by Tsinghua University that is competitive with GPT-4o and establishes a new SOTA on several benchmarks, including OCR.
An example using Garden State Llama to ReFT on the Golden Gate Bridge.
Embed text with Qwen2-7B-Instruct
Best-in-class clothing virtual try-on in the wild (non-commercial use only)
Convert scanned or electronic documents to markdown, very very very fast
Microsoft's tool to convert Office documents, PDFs, images, audio, and more to LLM-ready markdown.
TTS with the voice of Mel Medarda from Arcane, trained using Zonos-v0.1
SDXL finetuned on line art
Make meow emojis!
Translate audio while keeping the original style, pronunciation and tone of your original audio.
Zonos-v0.1 beta, a SOTA text-to-speech Transformer model with extraordinary expressive range, built by Zyphra.