Ollama Llama 3.2 Vision 90B
Remove background from an image
QVQ-72B-Preview by Qwen is an experimental research model focusing on enhancing visual reasoning capabilities
Jennai: an AI Avatar trained via the replicate/fast-flux-trainer
Fuyu-8B is a multi-modal text and image transformer trained by Adept AI
In-Context LoRA with Image-to-Image and Inpainting to apply your logo to anything
QwQ is the reasoning model of the Qwen series. Unlike conventional instruction-tuned models, QwQ is capable of thinking and reasoning
Realistic Vision v5.0 Image 2 Image
Convert musubi-tuner LoRA to ComfyUI compatible format
Flux finetune of the style: TIME's 100 Most Influential People in AI
Ollama QwQ 32B
Implementation of SDXL RealVisXL_V1.0
Apollo 3B - An Exploration of Video Understanding in Large Multimodal Models
AnimateDiff video to video
Segmind-Vega Model is a distilled version of SDXL, offering a 70% reduction in size and a 100% speedup
Extract the first or last frame from any video file as a high-quality image
Empowering Code Large Language Models with Evol-Instruct
openai/clip-vit-large-patch32
CUDA implementation of an rgb2grayscale function
A release preview of the olmOCR model from Ai2 that's fine-tuned from Qwen2-VL-7B-Instruct using the olmOCR-mix-0225 dataset
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model