lucataco/interactiveomni-8b:6d19412f | Run with an API on Replicate

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field	Type	Default	Min	Max	Description
seed	integer	0			Random seed for reproducible sampling. Set to -1 to disable seeding.
audio	string				Optional audio clip (WAV/MP3/FLAC). Resampled to 24 kHz automatically.
top_p	number	0.8	0.01	1	Top-p (nucleus) sampling parameter.
video	string				Optional video clip (MP4/MOV/WebM) for video-grounded conversation.
images	array				Optional list of images (PNG/JPG/WebP) to provide visual context.
prompt	string				User text prompt. Leave blank when providing only media inputs.
max_tiles	integer	12	1	48	Maximum number of temporal tiles to sample when a video is provided.
temperature	number	0.7		2	Sampling temperature. Set to 0 for greedy decoding.
system_prompt	string				Optional system prompt. When audio output is enabled and this is left blank, a recommended prompt is injected automatically.
max_new_tokens	integer	512	32	2048	Maximum number of tokens to generate.
frames_per_tile	integer	8	1	32	Number of frames sampled per tile when processing video.
enable_audio_output	boolean	False			Return generated speech along with text output.
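
As a rough illustration of how these fields fit together, the sketch below builds an input payload and checks numeric values against the ranges listed in the table before anything is sent. The helper name `build_input` and the `BOUNDS` mapping are hypothetical and not part of the model or of Replicate's API. No network call is made here.

```python
from typing import Any, Dict

# (min, max) bounds copied from the input schema table above.
# None means the table lists no bound on that side.
BOUNDS = {
    "top_p": (0.01, 1),
    "max_tiles": (1, 48),
    "temperature": (None, 2),
    "max_new_tokens": (32, 2048),
    "frames_per_tile": (1, 32),
}


def build_input(prompt: str = "", **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: return an input dict, rejecting out-of-range values."""
    payload: Dict[str, Any] = {"prompt": prompt, **overrides}
    for name, (low, high) in BOUNDS.items():
        if name not in payload:
            continue  # omitted fields fall back to the model's default values
        value = payload[name]
        if low is not None and value < low:
            raise ValueError(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{name}={value} is above the maximum {high}")
    return payload


# Greedy decoding (temperature 0) with a shorter generation budget.
payload = build_input("Describe the clip.", temperature=0, max_new_tokens=256)
print(payload)
```

If you use the official `replicate` Python client, a payload like this would typically be passed as the `input` argument of `replicate.run`, together with the model's full version identifier. Only an abbreviated identifier appears on this page.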

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{'properties': {'audio': {'format': 'uri',
                          'nullable': True,
                          'title': 'Audio',
                          'type': 'string'},
                'text': {'title': 'Text', 'type': 'string'}},
 'required': ['text'],
 'title': 'OmniOutput',
 'type': 'object'}
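
The schema describes an object with a required `text` string and an optional, nullable `audio` URI. A minimal sketch of reading such a response in Python follows, assuming the response arrives as a plain dictionary. The `parse_output` helper is hypothetical, and a client library may wrap file outputs in its own types instead of plain strings.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class OmniOutput:
    text: str                    # required by the schema
    audio: Optional[str] = None  # URI string; may be absent or null


def parse_output(raw: Mapping[str, Any]) -> OmniOutput:
    """Hypothetical helper: check a raw response dict against the schema's shape."""
    text = raw.get("text")
    if not isinstance(text, str):
        raise ValueError("response is missing the required 'text' string")
    audio = raw.get("audio")
    if audio is not None and not isinstance(audio, str):
        raise ValueError("'audio' must be a URI string or null")
    return OmniOutput(text=text, audio=audio)


# Text-only response: 'audio' is null.
result = parse_output({"text": "A dog runs across a field.", "audio": None})
print(result.text)
```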