firtoz/trellis2

4B parameter model for image-to-3D with PBR materials. Generates 512-1536³ voxel assets with complex topology via O-Voxel. 3-stage: structure→shape→material. Supports environment lighting and GLB export.

TRELLIS.2

Note: This Replicate deployment is maintained by firtoz, a fan of the TRELLIS project, and is not officially affiliated with Microsoft or the TRELLIS team. All rights, licenses, and intellectual property belong to Microsoft.

TRELLIS.2 is a state-of-the-art 4B parameter model for high-fidelity image-to-3D generation. It leverages a novel “field-free” sparse voxel structure termed O-Voxel to reconstruct and generate arbitrary 3D assets with complex topologies, sharp features, and full PBR materials.

✨ Key Features

1. High Quality, Resolution & Efficiency

The 4B-parameter model generates high-resolution, fully textured assets with exceptional fidelity, using vanilla DiTs and a Sparse 3D VAE with 16× spatial downsampling.

| Resolution | Total Time* | Breakdown (Shape + Material) |
| --- | --- | --- |
| 512³ | ~3s | 2s + 1s |
| 1024³ | ~17s | 10s + 7s |
| 1536³ | ~60s | 35s + 25s |

*Tested on NVIDIA H100 GPU.
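
To make the 16× spatial downsampling concrete, the sketch below computes the latent grid side length for each supported resolution. It assumes the factor applies uniformly along each axis, and the cell count it prints is a dense upper bound, since a sparse structure stores only occupied cells. The helper name `latent_grid_side` is hypothetical and not part of TRELLIS.2.

```python
# Illustrative arithmetic only; `latent_grid_side` is a hypothetical helper,
# not part of the TRELLIS.2 codebase.
DOWNSAMPLE = 16  # "16x spatial downsampling", assumed uniform per axis


def latent_grid_side(resolution: int) -> int:
    """Side length of the latent grid for a given voxel resolution."""
    if resolution % DOWNSAMPLE != 0:
        raise ValueError("resolution must be a multiple of 16")
    return resolution // DOWNSAMPLE


for res in (512, 1024, 1536):
    side = latent_grid_side(res)
    # side**3 is a dense upper bound; a sparse grid holds far fewer cells.
    print(f"{res}^3 voxels -> {side}^3 latent grid (at most {side**3:,} cells)")
```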

2. Arbitrary Topology Handling

The O-Voxel representation breaks the limits of iso-surface fields and robustly handles:

  • ✅ Open Surfaces (e.g., clothing, leaves)
  • ✅ Non-manifold Geometry
  • ✅ Internal Enclosed Structures

3. Rich Texture Modeling

Beyond basic colors, TRELLIS.2 models arbitrary surface attributes including Base Color, Roughness, Metallic, and Opacity, enabling photorealistic rendering and transparency support.

4. Minimalist Processing

Data processing is streamlined for instant conversions that are fully rendering-free and optimization-free:

  • < 10s (Single CPU): Textured Mesh → O-Voxel
  • < 100ms (CUDA): O-Voxel → Textured Mesh

📊 How This Replicate Model Works

This deployment provides easy access to TRELLIS.2 with the following features:

  • 3-Stage Generation: Sparse structure → Shape → Material
  • Multiple Resolutions: Choose 512, 1024, or 1536 voxel resolution
  • PBR Rendering: Generate videos with environment lighting (forest/sunset/courtyard)
  • GLB Export: Download production-ready 3D models with textures
  • Fine-Grained Control: Adjust guidance strength and steps for each stage

🎯 Input Format

  • Image: PNG or JPEG (preferably with transparent background for best results)
  • Resolution: 512, 1024, or 1536 voxel resolution
  • Seed: For reproducible results (optional)

📦 Output Options

  • Videos:
      • Normal map rendering (geometry visualization)
      • PBR rendering with environment lighting
      • Combined side-by-side comparison
  • GLB File: Textured 3D model ready for use in Blender, Unity, etc.
  • Configurable Quality: Adjust mesh decimation and texture resolution
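
As a quick sanity check on a downloaded GLB file, the following sketch reads the 12-byte binary glTF header (magic, version, declared length) using only the standard library. It is a minimal illustration, not a full validator, and `read_glb_header` is a hypothetical helper name.

```python
import struct

GLB_MAGIC = 0x46546C67  # ASCII "glTF" read as a little-endian uint32


def read_glb_header(data: bytes) -> tuple[int, int]:
    """Return (version, declared_length) from a GLB header, or raise ValueError."""
    if len(data) < 12:
        raise ValueError("too short to be a GLB file")
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("missing glTF magic bytes")
    return version, length


# Synthetic 12-byte header for demonstration; in practice pass the bytes
# of the downloaded file, e.g. Path("model.glb").read_bytes().
header = struct.pack("<III", GLB_MAGIC, 2, 12)
print(read_glb_header(header))
```

Comparing the declared length with the actual file size is a cheap way to detect a truncated download.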

💻 Example Usage

import replicate

output = replicate.run(
    "firtoz/trellis2:latest",
    input={
        "image": open("input_image.png", "rb"),
        "seed": 42,
        "resolution": "1024",
        "generate_video": True,
        "video_render_mode": "pbr",
        "envmap": "forest",
        "generate_model": True,

        # Stage 1: Sparse Structure Generation
        "ss_guidance_strength": 7.5,
        "ss_guidance_rescale": 0.7,
        "ss_sampling_steps": 12,
        "ss_rescale_t": 5.0,

        # Stage 2: Shape Generation
        "shape_slat_guidance_strength": 7.5,
        "shape_slat_guidance_rescale": 0.5,
        "shape_slat_sampling_steps": 12,
        "shape_slat_rescale_t": 3.0,

        # Stage 3: Material Generation
        "tex_slat_guidance_strength": 1.0,
        "tex_slat_guidance_rescale": 0.0,
        "tex_slat_sampling_steps": 12,
        "tex_slat_rescale_t": 3.0,

        # GLB Export Settings
        "decimation_target": 500000,
        "texture_size": 2048
    }
)

# Output contains video and/or GLB file
print(output)
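
The exact shape of `output` is not documented here, so the following is only a sketch of one way to save results to disk. It assumes that file results expose a `.read()` method returning bytes, as file outputs do in recent versions of the `replicate` Python client, and it skips anything else (for example plain URL strings). The helpers `guess_extension` and `save_outputs` are hypothetical.

```python
from pathlib import Path


def guess_extension(data: bytes) -> str:
    """Guess a file extension from leading magic bytes."""
    if data[:4] == b"glTF":   # binary glTF container
        return ".glb"
    if data[4:8] == b"ftyp":  # ISO base media file, e.g. MP4
        return ".mp4"
    return ".bin"


def save_outputs(output, out_dir="outputs"):
    """Write every readable item in `output` to `out_dir`; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Normalize dict, list/tuple, or single-item outputs to (name, item) pairs.
    if isinstance(output, dict):
        items = list(output.items())
    elif isinstance(output, (list, tuple)):
        items = [(f"output_{i}", item) for i, item in enumerate(output)]
    else:
        items = [("output_0", output)]
    saved = []
    for name, item in items:
        if not hasattr(item, "read"):
            continue  # assumption: only file-like results are saved
        data = item.read()
        path = out / f"{name}{guess_extension(data)}"
        path.write_bytes(data)
        saved.append(path)
    return saved
```

Calling `save_outputs(output)` after `replicate.run(...)` would then write any video or GLB results into an `outputs/` directory.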

🎛️ Parameter Guide

Generation Control

Resolution (512/1024/1536):

  • Higher = better quality but slower
  • 512³: Fast preview (~3s on H100)
  • 1024³: Balanced quality/speed (~17s on H100)
  • 1536³: Maximum quality (~60s on H100)
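
One way to apply the approximate H100 timings quoted above is to pick the highest resolution that fits a time budget. This is a rough sketch: the figures cover generation only, so cold starts, queueing, video rendering, and GLB export are not accounted for, and other hardware will differ. `pick_resolution` is a hypothetical helper.

```python
# Approximate total generation time in seconds on an H100, per resolution.
H100_SECONDS = {512: 3, 1024: 17, 1536: 60}


def pick_resolution(budget_seconds: float) -> int:
    """Return the highest resolution whose approximate time fits the budget."""
    fitting = [res for res, secs in H100_SECONDS.items() if secs <= budget_seconds]
    if not fitting:
        raise ValueError("budget is below the fastest listed time (~3s)")
    return max(fitting)


print(pick_resolution(20))  # 1024
```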

Seed:

  • Use same seed for reproducible results
  • Randomize for variety

Stage Parameters

Stage 1: Sparse Structure Generation

  • Controls initial 3D structure layout
  • guidance_strength (1.0-10.0): How closely to follow input
  • sampling_steps (1-50): More steps = higher quality

Stage 2: Shape Generation

  • Creates detailed geometry
  • Higher guidance = more faithful to structure

Stage 3: Material Generation

  • Applies PBR textures
  • Lower guidance often works better for materials
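
The per-stage inputs in the usage example share a naming pattern (`ss_`, `shape_slat_`, `tex_slat_` prefixes), so they can be assembled with a small helper. The sketch below is hypothetical: `stage_params` is not part of the deployment, and it assumes the ranges documented for Stage 1 (guidance 1.0-10.0, steps 1-50) also apply to the other two stages.

```python
# Input-name prefixes as they appear in the usage example.
PREFIX = {"structure": "ss", "shape": "shape_slat", "material": "tex_slat"}


def stage_params(stage, guidance_strength, sampling_steps,
                 guidance_rescale=None, rescale_t=None):
    """Build the prefixed input keys for one generation stage."""
    if stage not in PREFIX:
        raise ValueError(f"unknown stage: {stage!r}")
    # Assumption: Stage 1 ranges are reused for every stage.
    if not 1.0 <= guidance_strength <= 10.0:
        raise ValueError("guidance_strength must be in [1.0, 10.0]")
    if not 1 <= sampling_steps <= 50:
        raise ValueError("sampling_steps must be in [1, 50]")
    p = PREFIX[stage]
    params = {
        f"{p}_guidance_strength": guidance_strength,
        f"{p}_sampling_steps": sampling_steps,
    }
    if guidance_rescale is not None:
        params[f"{p}_guidance_rescale"] = guidance_rescale
    if rescale_t is not None:
        params[f"{p}_rescale_t"] = rescale_t
    return params


# Same values as the usage example: high guidance for geometry, low for materials.
stage_inputs = {
    **stage_params("structure", 7.5, 12, guidance_rescale=0.7, rescale_t=5.0),
    **stage_params("shape", 7.5, 12, guidance_rescale=0.5, rescale_t=3.0),
    **stage_params("material", 1.0, 12, guidance_rescale=0.0, rescale_t=3.0),
}
print(stage_inputs)
```

The resulting dictionary can be merged into the `input` argument of `replicate.run` alongside the image, seed, and resolution.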

Rendering Options

video_render_mode:

  • normal: Geometry visualization
  • pbr: Photorealistic with lighting
  • both: Side-by-side comparison

envmap: Choose lighting environment

  • forest: Natural outdoor lighting
  • sunset: Warm dramatic lighting
  • courtyard: Bright architectural lighting

GLB Export

decimation_target: Target face count (100K-1M)

  • Lower = smaller files, less detail
  • Higher = larger files, more detail

texture_size: Texture resolution (1024-4096)

  • 2048 recommended for most cases
  • 4096 for maximum quality
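
To give a sense of what the texture_size choice costs, the sketch below estimates the decoded memory footprint of a single square texture. It assumes uncompressed RGBA8 (4 bytes per texel) and no mipmaps. The GLB file itself is typically smaller because embedded images are usually compressed, and the number of texture maps in the export is not stated here.

```python
def rgba8_texture_mib(size: int) -> float:
    """Uncompressed size in MiB of one square RGBA8 texture (4 bytes per texel)."""
    return size * size * 4 / (1024 * 1024)


for size in (1024, 2048, 4096):
    print(f"{size}x{size}: {rgba8_texture_mib(size):.0f} MiB uncompressed per texture")
```

Each doubling of texture_size quadruples the footprint, which is why 4096 is best reserved for cases that need the extra detail.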

📚 Citation

@article{xiang2025trellis2,
    title={Native and Compact Structured Latents for 3D Generation},
    author={Xiang, Jianfeng and Chen, Xiaoxue and Xu, Sicheng and Wang, Ruicheng and Lv, Zelong and Deng, Yu and Zhu, Hongyuan and Dong, Yue and Zhao, Hao and Yuan, Nicholas Jing and Yang, Jiaolong},
    journal={Tech report},
    year={2025}
}

⚖️ License

TRELLIS.2 is released under the MIT License. See the LICENSE file for details.
