StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
DUSt3R: Geometric 3D Vision Made Easy
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Visual Style Prompting with Swapping Self-Attention
Hand Refiner 512x512
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
MetaVoice-1B: 1.2B parameter base model trained on 100K hours of speech
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
ControlNet Line Art Anime
Mixtral-8x22b-v0.1-4bit
MotionDirector: Motion Customization of Text-to-Video Diffusion Models
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
AnimateDiff-Lightning: Cross-Model Diffusion Distillation
Mixtral 8x22b v0.1 Zephyr Orpo 141b A35b v0.1
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions