Modify images using line art
Modify images using human pose
[Non-commercial] A multi-image visual language model
Generate panoramic images with text prompts
Multilingual E5-small language embedding model
Detect everything with language!
Base version of Mamba 1.4B, a 1.4 billion parameter state space language model
Generating object-level shape variations with Stable Diffusion
Image editing with Prompt-to-Prompt for SDXL
Mamba 2.8B state space language model fine-tuned for chat
Base version of Mamba 790M, a 790 million parameter state space language model
Base version of Mamba 370M, a 370 million parameter state space language model
Edit real or generated images
Photorealism with RealVisXL V4.0 Lightning
Base version of Mamba 2.8B SlimPajama, a 2.8 billion parameter state space language model
Photorealism with RealVisXL V4.0
Realistic interior design with text and image inputs
Base version of Mamba 130M, a 130 million parameter state space language model
Modify images using depth maps
Text-guided image generation and editing
Image-Prompt Multi-view Diffusion for 3D Generation
Base version of Mamba 2.8B, a 2.8 billion parameter state space language model
Generates 3D assets from images
Monocular depth estimation
Generate videos from text prompts with Kandinsky-2.2
Editable image generation with MasaCtrl-SDXL
Image editing with Prompt-to-Prompt for RealVisXL-v3.0
Zero-shot speech synthesizer for text-to-speech and voice conversion
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Generates speech from text
Zero-shot / open vocabulary object detection
LEdits++ for image editing
Lightweight multimodal model for visual question answering, reasoning and captioning
PyTorch version of Lightweight OpenPose as introduced in "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose"
Generate texture for your mesh with text prompts
Performs document image classification, document parsing and document visual question answering
Flux LoRA; use "CNSTLL" to trigger
Multilingual speech translation that preserves original vocal style and prosody
Multi-view image generation with MVDream
Whole-body pose estimation
Detects objects in an image
Photorealism with Realistic Vision v6.0
Modify images using sketches
[Non-commercial] Generate texture for 3D assets using text descriptions
Multilingual E5-large language embedding model
Dual Aggregation Transformer for Image Super-Resolution
Nougat: Neural Optical Understanding for Academic Documents
Photorealism with RealVisXL V3.0 Turbo based on SDXL
Text-Guided Image Generation and Manipulation
E5-mistral-7b-instruct language embedding model
Performs speaker identity verification
Modify images using canny edges
Inst-Inpaint: Instructing to Remove Objects with Diffusion Models
Generate 3D assets using text descriptions
Fast text-to-3D Gaussian generation by bridging 2D and 3D diffusion models
Flux LoRA; use "in the style of FNTSYRCH" to trigger
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation