Models that improve or restore images by deblurring, colorizing, and removing noise

Upscaling models that create high-quality images from low-quality images

Language models that you can fine-tune using Replicate's training API.

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Latest models

OpenBMB MiniCPM-V 2.8B is a strong multimodal large language model for efficient end-side deployment

A tiny model for testing out Cog

A powerful model competitive with Midjourney v6 and DALL-E 3, but open and decentralized

HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

PyTorch implementation of AnimeGAN for fast photo animation

AbsoluteReality V1.8.1 Model (Text2Img, Img2Img and Inpainting)

An example of a rudimentary Q&A assistant for ACME SL

ZeST: Zero-Shot Material Transfer from a Single Image

Reliberate v3 Model (Text2Img, Img2Img and Inpainting)

Deliberate V6 Model (Text2Img, Img2Img and Inpainting)

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

WizardLM 2 8x22B

Lightweight text-to-speech (TTS) model trained on 10.5K hours of audio data

Accelerated transcription, word-level timestamps and diarization with whisperX large-v3

Text-to-image model based on the photon-v1 checkpoint

Change eye (iris) color

Mixtral 8x22b v0.1 Zephyr Orpo 141b A35b v0.1

Text-to-image model with Midjourney v6 quality, but open and decentralized

Zero-Shot Speech Editing and Text-to-Speech in the Wild

GPU accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.

Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

A large, stereo MusicGen that acts as a useful tool for music producers

Nous Hermes 2 Mixtral 8x7B DPO is a Nous Research model trained over the Mixtral 8x7B MoE LLM

High-resolution image upscaler and enhancer. Use at ClarityAI.cc. A free Magnific alternative. Twitter/X: @philz1337x

Best-in-class virtual try-on in the wild

Image generation. Added: inpaint_strength, loras_custom_urls

Uses a subset of https://github.com/barun-saha/slide-deck-ai to create PowerPoint slides from a JSON description, using python-pptx (https://github.com/scanny/python-pptx)

Generates Images in the Big Medium Style

Multilingual text-to-image latent diffusion model

viⓍTTS (vixTTS) is a voice generation model that lets you clone a voice into different languages using only a quick 6-second audio clip

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Turn a face into 3D, emoji, pixel art, video game, claymation or toy

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

Free Lunch towards Style-Preserving in Text-to-Image Generation by InstantX team, with ControlNet

繁花 ("Blossoms") style test

Free Lunch towards Style-Preserving in Text-to-Image Generation by InstantX team

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens

Newest balance-striking reranker model from BAAI. Outputs rank scores for query-doc pairs. FP16 inference enabled.

Open Sora Plan Text To Video

Domain Consistent Resolution Adapter for Diffusion Models: generating consistent images with resolutions outside of their trained domain

