kfarr/tencent-hy-world-2.0

Tencent WorldMirror 2.0: feed-forward 3D reconstruction from multi-view images or video

Run kfarr/tencent-hy-world-2.0 with an API

Use one of our client libraries to get started quickly. Each library links to the Playground tab, where you can adjust inputs, view the results, and copy the corresponding code into your own project.
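As a rough illustration of what a call from code might look like, the sketch below uses `replicate.run` from the `replicate` Python client. It assumes the package is installed and the `REPLICATE_API_TOKEN` environment variable is set. `build_input` and `run_model` are helper names invented here, and depending on how the model is published you may need to append a `:version` suffix to the model identifier.

```python
MODEL = "kfarr/tencent-hy-world-2.0"


def build_input(input_file, **overrides):
    """Return the input payload; fields left out fall back to the model defaults."""
    payload = {"input_file": input_file}
    payload.update(overrides)
    return payload


def run_model(path, **overrides):
    """Upload a local file and run the model (needs network access and an API token)."""
    # Imported lazily so build_input can be used without the client installed.
    import replicate

    with open(path, "rb") as handle:
        return replicate.run(MODEL, input=build_input(handle, **overrides))
```

For instance, `run_model("scene.mp4", fps=2, video_max_frames=48)` would override two fields and leave the rest at their defaults; `scene.mp4` is a placeholder path, not a file that ships with the model.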

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value is used.

| Field | Type | Default value | Description |
| --- | --- | --- | --- |
| input_file | string | | A video file (mp4/mov/etc.) or a .zip archive of multi-view images. With a video, frames are extracted at the given fps. |
| target_size | integer | 952 (min 224, max 1568) | Maximum resolution (longest edge). Images are resized and center-cropped to the nearest multiple of 14. |
| fps | integer | 1 (min 1, max 30) | Frames-per-second to extract from a video input. |
| video_max_frames | integer | 32 (min 2, max 128) | Maximum number of frames to use from a video input. |
| save_gaussians | boolean | True | Save 3D Gaussian splats (gaussians.ply). |
| save_points | boolean | True | Save dense point cloud (points.ply). |
| save_depth | boolean | True | Save per-view depth maps (PNG previews + .npy). |
| save_normal | boolean | True | Save per-view surface-normal maps. |
| save_camera | boolean | True | Save predicted camera parameters (camera_params.json). |
| apply_sky_mask | boolean | True | Mask out the sky region before reconstruction. |
| apply_edge_mask | boolean | True | Mask out unreliable depth/normal discontinuities. |
| compress_gs_max_points | integer | 5000000 (min 100000, max 20000000) | Max number of gaussians to retain in the output PLY. |
| compress_pts_max_points | integer | 2000000 (min 100000, max 10000000) | Max number of points to retain in points.ply. |
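The numeric ranges listed above can be checked on the client side before submitting a run. The following is a minimal sketch, not the model's own validation: the helper names are hypothetical, the frame estimate assumes sampling at `fps` and then capping at `video_max_frames` (whether extra frames are dropped or subsampled is not stated here), and the rounding helper only illustrates the "nearest multiple of 14" arithmetic.

```python
# Ranges copied from the input schema: field -> (min, max).
RANGES = {
    "target_size": (224, 1568),
    "fps": (1, 30),
    "video_max_frames": (2, 128),
    "compress_gs_max_points": (100_000, 20_000_000),
    "compress_pts_max_points": (100_000, 10_000_000),
}


def check_ranges(payload):
    """Return a list of problems; an empty list means every given value is in range."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in payload and not low <= payload[name] <= high:
            problems.append(f"{name}={payload[name]} is outside [{low}, {high}]")
    return problems


def estimated_frames(duration_s, fps=1, video_max_frames=32):
    """Assumed estimate: frames sampled at fps, capped at video_max_frames."""
    return min(int(duration_s * fps), video_max_frames)


def nearest_multiple_of_14(pixels):
    """Round a pixel dimension to the nearest multiple of 14 (at least 14)."""
    return max(14, round(pixels / 14) * 14)
```

Under these assumptions, a 90-second clip at the default `fps` of 1 would be capped at the default 32 frames, and a 1000-pixel edge would round to 994.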

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
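Since the schema describes the output as a single URI string, one plausible way to fetch the result is to stream it to disk with the standard library. This sketch assumes you already hold the URI as a plain string; the function names and the `output.bin` fallback are invented for illustration, and the type of file behind the URI is not specified by the schema.

```python
import os
import shutil
import urllib.parse
import urllib.request


def filename_from_uri(uri, fallback="output.bin"):
    """Use the last path segment of the URI as the local file name."""
    path = urllib.parse.urlparse(uri).path
    return os.path.basename(urllib.parse.unquote(path)) or fallback


def download_output(uri, dest_dir="."):
    """Stream the file behind the output URI into dest_dir and return its path."""
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, filename_from_uri(uri))
    with urllib.request.urlopen(uri) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Streaming with `shutil.copyfileobj` avoids loading the whole file into memory, which may matter if the result bundles large PLY files.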