TRELLIS.2
Note: This Replicate deployment is maintained by firtoz, a fan of the TRELLIS project, and is not officially affiliated with Microsoft or the TRELLIS team. All rights, licenses, and intellectual property belong to Microsoft.
TRELLIS.2 is a state-of-the-art 4B parameter model for high-fidelity image-to-3D generation. It leverages a novel “field-free” sparse voxel structure termed O-Voxel to reconstruct and generate arbitrary 3D assets with complex topologies, sharp features, and full PBR materials.

✨ Key Features
1. High Quality, Resolution & Efficiency
The 4B-parameter model generates high-resolution, fully textured assets with exceptional fidelity, using vanilla DiTs and a Sparse 3D VAE with 16× spatial downsampling.
| Resolution | Total Time* | Breakdown (Shape + Material) |
|---|---|---|
| 512³ | ~3s | 2s + 1s |
| 1024³ | ~17s | 10s + 7s |
| 1536³ | ~60s | 35s + 25s |
<small>*Tested on NVIDIA H100 GPU.</small>
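As a rough worked example of the numbers above: the sketch below sums the shape and material times from the table and derives the latent grid side for each resolution. It assumes the 16× spatial downsampling applies along each axis, which the table does not state. The names `H100_SECONDS`, `latent_grid_side`, and `total_seconds` are illustrative and not part of TRELLIS.2.

```python
# Approximate H100 timings from the table: resolution -> (shape s, material s).
H100_SECONDS = {512: (2, 1), 1024: (10, 7), 1536: (35, 25)}

DOWNSAMPLE = 16  # assumption: the 16x factor applies per axis


def latent_grid_side(resolution: int) -> int:
    """Side length of the latent grid, assuming per-axis 16x downsampling."""
    if resolution % DOWNSAMPLE:
        raise ValueError(f"{resolution} is not divisible by {DOWNSAMPLE}")
    return resolution // DOWNSAMPLE


def total_seconds(resolution: int) -> int:
    """Total generation time as the sum of the shape and material stages."""
    shape_s, material_s = H100_SECONDS[resolution]
    return shape_s + material_s


for res in H100_SECONDS:
    side = latent_grid_side(res)
    print(f"{res}^3 voxels -> {side}^3 latent grid, ~{total_seconds(res)}s on H100")
```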
2. Arbitrary Topology Handling
The O-Voxel representation breaks the limits of iso-surface fields and robustly handles:
- ✅ Open Surfaces (e.g., clothing, leaves)
- ✅ Non-manifold Geometry
- ✅ Internal Enclosed Structures
3. Rich Texture Modeling
Beyond basic colors, TRELLIS.2 models arbitrary surface attributes including Base Color, Roughness, Metallic, and Opacity, enabling photorealistic rendering and transparency support.
4. Minimalist Processing
Data processing is streamlined for instant conversions that are fully rendering-free and optimization-free:
- < 10s (Single CPU): Textured Mesh → O-Voxel
- < 100ms (CUDA): O-Voxel → Textured Mesh
📊 How This Replicate Model Works
This deployment provides easy access to TRELLIS.2 with the following features:
- 3-Stage Generation: Sparse structure → Shape → Material
- Multiple Resolutions: Choose 512, 1024, or 1536 voxel resolution
- PBR Rendering: Generate videos with environment lighting (forest/sunset/courtyard)
- GLB Export: Download production-ready 3D models with textures
- Fine-Grained Control: Adjust guidance strength and steps for each stage
🎯 Input Format
- Image: PNG or JPEG (preferably with transparent background for best results)
- Resolution: 512, 1024, or 1536 voxel resolution
- Seed: For reproducible results (optional)
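Before uploading, it can help to confirm that a file really is a PNG or JPEG and, for PNGs, whether it carries an alpha channel. The following is a minimal standard-library sketch based on the file signatures and the PNG `IHDR` color type. The function names are illustrative, and the alpha check does not cover palette PNGs that store transparency in a `tRNS` chunk.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(data: bytes) -> str:
    """Return 'png' or 'jpeg' based on the leading magic bytes."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    raise ValueError("expected a PNG or JPEG image")


def png_has_alpha_channel(data: bytes) -> bool:
    """True if the IHDR color type includes alpha (4 = gray+alpha, 6 = RGBA)."""
    if (
        not data.startswith(PNG_SIGNATURE)
        or len(data) < 26
        or data[12:16] != b"IHDR"
    ):
        raise ValueError("not a PNG with a leading IHDR chunk")
    # Byte 25 is the color type: 8 signature + 8 chunk header
    # + 4 width + 4 height + 1 bit depth.
    return data[25] in (4, 6)
```

To use it, read the file with `open(path, "rb").read()` and pass the bytes to `detect_image_format`. JPEG has no alpha channel, so a JPEG input always has an opaque background.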
📦 Output Options
- Videos:
  - Normal map rendering (geometry visualization)
  - PBR rendering with environment lighting
  - Combined side-by-side comparison
- GLB File: Textured 3D model ready for use in Blender, Unity, etc.
- Configurable Quality: Adjust mesh decimation and texture resolution
💻 Example Usage
import replicate

output = replicate.run(
    "firtoz/trellis2:latest",
    input={
        "image": open("input_image.png", "rb"),
        "seed": 42,
        "resolution": "1024",
        "generate_video": True,
        "video_render_mode": "pbr",
        "envmap": "forest",
        "generate_model": True,

        # Stage 1: Sparse Structure Generation
        "ss_guidance_strength": 7.5,
        "ss_guidance_rescale": 0.7,
        "ss_sampling_steps": 12,
        "ss_rescale_t": 5.0,

        # Stage 2: Shape Generation
        "shape_slat_guidance_strength": 7.5,
        "shape_slat_guidance_rescale": 0.5,
        "shape_slat_sampling_steps": 12,
        "shape_slat_rescale_t": 3.0,

        # Stage 3: Material Generation
        "tex_slat_guidance_strength": 1.0,
        "tex_slat_guidance_rescale": 0.0,
        "tex_slat_sampling_steps": 12,
        "tex_slat_rescale_t": 3.0,

        # GLB Export Settings
        "decimation_target": 500000,
        "texture_size": 2048,
    },
)

# Output contains video and/or GLB file
print(output)
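Printing the output only shows references to the generated files. The sketch below is one way to write them to disk. It assumes each file output is a file-like object with a `.read()` method, as recent versions of the Replicate Python client return. The exact shape of this model's output (single value, list, or dict) is not documented here, so the hypothetical helpers `iter_outputs` and `save_outputs` accept all three. File extensions are guessed from magic bytes, assuming the video is an MP4.

```python
from pathlib import Path


def guess_extension(data: bytes) -> str:
    """Guess a file extension from well-known magic bytes."""
    if data[:4] == b"glTF":    # binary glTF (GLB) header
        return ".glb"
    if data[4:8] == b"ftyp":   # ISO base media file, e.g. MP4
        return ".mp4"
    return ".bin"


def iter_outputs(output):
    """Yield (label, item) pairs from a dict, a list/tuple, or a single value."""
    if isinstance(output, dict):
        for key, item in output.items():
            yield str(key), item
    elif isinstance(output, (list, tuple)):
        for index, item in enumerate(output):
            yield f"output_{index}", item
    else:
        yield "output", output


def save_outputs(output, out_dir="outputs"):
    """Write every file-like item to out_dir and return the saved paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for label, item in iter_outputs(output):
        if not hasattr(item, "read"):
            continue  # skip None, plain URL strings, and other non-file values
        data = item.read()
        path = out_dir / f"{label}{guess_extension(data)}"
        path.write_bytes(data)
        saved.append(path)
    return saved
```

Calling `save_outputs(output)` on the result of `replicate.run(...)` then returns the list of written paths. Items without a `.read()` method are skipped and need separate handling.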
🎛️ Parameter Guide
Generation Control
Resolution (512/1024/1536):
- Higher = better quality but slower
- 512³: Fast preview (~3s on H100)
- 1024³: Balanced quality/speed (~17s on H100)
- 1536³: Maximum quality (~60s on H100)
Seed:
- Use same seed for reproducible results
- Randomize for variety
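A small sketch of the seed workflow: draw a random seed when none is given, and log it so a good result can be reproduced later. The upper bound `MAX_SEED` is an assumption, since the range the model accepts is not documented here, and `pick_seed` is an illustrative helper.

```python
import random

MAX_SEED = 2**31 - 1  # assumed upper bound; the accepted range is not documented here


def pick_seed(seed=None):
    """Return the given seed, or draw a random one that can be logged and reused."""
    if seed is None:
        seed = random.randint(0, MAX_SEED)
    return seed


seed = pick_seed()
print(f"using seed {seed}")  # record this value to reproduce the run later
```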
Stage Parameters
Stage 1: Sparse Structure Generation
- Controls initial 3D structure layout
- guidance_strength (1.0-10.0): How closely to follow input
- sampling_steps (1-50): More steps = higher quality
Stage 2: Shape Generation
- Creates detailed geometry
- Higher guidance = more faithful to structure
Stage 3: Material Generation
- Applies PBR textures
- Lower guidance often works better for materials
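The per-stage settings can be assembled with a small helper, sketched below. The key prefixes (`ss`, `shape_slat`, `tex_slat`) come from the example usage. The ranges 1.0-10.0 and 1-50 are listed only under Stage 1, so applying them to every stage is an assumption. `stage_params` is a hypothetical helper, not part of the model's API.

```python
# Input-key prefixes taken from the example usage.
STAGE_PREFIXES = {"structure": "ss", "shape": "shape_slat", "material": "tex_slat"}


def stage_params(stage, guidance_strength, sampling_steps=12):
    """Build the guidance and step inputs for one stage, with range checks."""
    prefix = STAGE_PREFIXES[stage]
    # Ranges are documented for Stage 1; assumed here to hold for all stages.
    if not 1.0 <= guidance_strength <= 10.0:
        raise ValueError("guidance_strength must be between 1.0 and 10.0")
    if not 1 <= sampling_steps <= 50:
        raise ValueError("sampling_steps must be between 1 and 50")
    return {
        f"{prefix}_guidance_strength": float(guidance_strength),
        f"{prefix}_sampling_steps": int(sampling_steps),
    }


# Strong guidance for structure and shape, weak guidance for materials.
params = {
    **stage_params("structure", 7.5),
    **stage_params("shape", 7.5),
    **stage_params("material", 1.0),
}
print(params)
```

The resulting dictionary can be merged into the `input` argument alongside the image and the other settings.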
Rendering Options
video_render_mode:
- normal: Geometry visualization
- pbr: Photorealistic with lighting
- both: Side-by-side comparison
envmap: Choose lighting environment
- forest: Natural outdoor lighting
- sunset: Warm dramatic lighting
- courtyard: Bright architectural lighting
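To compare lighting environments, one option is to build an input fragment per environment map and run the model once for each. The sketch below validates the option names listed above before any request is sent. `render_options` is an illustrative helper, and it assumes that `generate_video` must be enabled for these settings to take effect.

```python
RENDER_MODES = ("normal", "pbr", "both")
ENVMAPS = ("forest", "sunset", "courtyard")


def render_options(video_render_mode="pbr", envmap="forest"):
    """Return validated rendering inputs for one run."""
    if video_render_mode not in RENDER_MODES:
        raise ValueError(f"video_render_mode must be one of {RENDER_MODES}")
    if envmap not in ENVMAPS:
        raise ValueError(f"envmap must be one of {ENVMAPS}")
    return {
        "generate_video": True,  # assumed prerequisite for the options below
        "video_render_mode": video_render_mode,
        "envmap": envmap,
    }


# One fragment per environment map, for a side-by-side lighting comparison.
variants = [render_options("pbr", envmap) for envmap in ENVMAPS]
for variant in variants:
    print(variant)
```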
GLB Export
decimation_target: Target face count (100K-1M)
- Lower = smaller files, less detail
- Higher = larger files, more detail
texture_size: Texture resolution (1024-4096)
- 2048 recommended for most cases
- 4096 for maximum quality
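To get a feel for these trade-offs, the sketch below clamps a face count to the documented 100K-1M range and estimates the uncompressed size of a single texture map. It assumes 8-bit RGBA (4 bytes per texel). The real GLB size depends on image compression and on how many maps are stored, so treat the figure as a rough per-map estimate. Both helper names are illustrative.

```python
MIN_FACES, MAX_FACES = 100_000, 1_000_000


def clamp_decimation_target(faces: int) -> int:
    """Keep the requested face count inside the documented 100K-1M range."""
    return max(MIN_FACES, min(MAX_FACES, faces))


def texture_mib(texture_size: int, channels: int = 4, bytes_per_channel: int = 1) -> float:
    """Uncompressed size of one square texture map in MiB (8-bit RGBA by default)."""
    return texture_size**2 * channels * bytes_per_channel / 2**20


for size in (1024, 2048, 4096):
    print(f"{size}x{size}: {texture_mib(size):.0f} MiB uncompressed per map")
```

Doubling `texture_size` quadruples the texel count, which is why 4096 is best reserved for assets that will be viewed up close.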
📚 Citation
@article{xiang2025trellis2,
  title={Native and Compact Structured Latents for 3D Generation},
  author={Xiang, Jianfeng and Chen, Xiaoxue and Xu, Sicheng and Wang, Ruicheng and Lv, Zelong and Deng, Yu and Zhu, Hongyuan and Dong, Yue and Zhao, Hao and Yuan, Nicholas Jing and Yang, Jiaolong},
  journal={Tech report},
  year={2025}
}
⚖️ License
TRELLIS.2 is released under the MIT License. See the LICENSE file for details.