lucataco/interactiveomni-8b:6d19412f
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| seed | integer | 0 | Random seed for reproducible sampling. Set to -1 to disable seeding. |
| audio | string | | Optional audio clip (WAV/MP3/FLAC). Resampled to 24 kHz automatically. |
| top_p | number | 0.8 (Min: 0.01, Max: 1) | Top-p nucleus sampling parameter. |
| video | string | | Optional video clip (MP4/MOV/WebM) for video-grounded conversation. |
| images | array | | Optional list of images (PNG/JPG/WebP) to provide visual context. |
| prompt | string | | User text prompt. Leave blank when providing only media inputs. |
| max_tiles | integer | 12 (Min: 1, Max: 48) | Maximum number of temporal tiles to sample when a video is provided. |
| temperature | number | 0.7 (Max: 2) | Sampling temperature. Set to 0 for greedy decoding. |
| system_prompt | string | | Optional system prompt. When audio output is enabled and this is left blank, a recommended prompt is injected automatically. |
| max_new_tokens | integer | 512 (Min: 32, Max: 2048) | Maximum number of tokens to generate. |
| frames_per_tile | integer | 8 (Min: 1, Max: 32) | Number of frames sampled per tile when processing video. |
| enable_audio_output | boolean | False | Return generated speech along with text output. |
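As a rough illustration of how the defaults and bounds in the table above combine, the following Python sketch merges caller overrides onto the documented defaults and rejects out-of-range values before anything is sent. The names `build_input`, `DEFAULTS`, `BOUNDS`, and `OPTIONAL_FIELDS` are hypothetical helpers invented for this example, not part of the model or any client library, and checking bounds on the client side is an assumption about convenience rather than a description of how the service validates input.

```python
from __future__ import annotations

from typing import Any

# Defaults copied from the input schema table; fields with no listed
# default (audio, video, images, prompt, system_prompt) are omitted.
DEFAULTS: dict[str, Any] = {
    "seed": 0,
    "top_p": 0.8,
    "max_tiles": 12,
    "temperature": 0.7,
    "max_new_tokens": 512,
    "frames_per_tile": 8,
    "enable_audio_output": False,
}

# (min, max) pairs from the table; None means no bound is listed on that side.
BOUNDS: dict[str, tuple[float | None, float | None]] = {
    "top_p": (0.01, 1),
    "max_tiles": (1, 48),
    "temperature": (None, 2),
    "max_new_tokens": (32, 2048),
    "frames_per_tile": (1, 32),
}

# Fields the table lists without a default value.
OPTIONAL_FIELDS = {"audio", "video", "images", "prompt", "system_prompt"}


def build_input(**overrides: Any) -> dict[str, Any]:
    """Merge overrides onto the documented defaults and check numeric bounds."""
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")

    payload = {**DEFAULTS, **overrides}
    for name, (low, high) in BOUNDS.items():
        value = payload[name]
        if low is not None and value < low:
            raise ValueError(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{name}={value} is above the maximum {high}")
    return payload


if __name__ == "__main__":
    # Greedy decoding with a text-only prompt; every other field keeps its default.
    print(build_input(prompt="Describe the clip.", temperature=0))
```

The returned dictionary has the shape of the input object described above, so it could be handed to whichever API client you use as the model input.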
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'properties': {'audio': {'format': 'uri',
                          'nullable': True,
                          'title': 'Audio',
                          'type': 'string'},
                'text': {'title': 'Text', 'type': 'string'}},
 'required': ['text'],
 'title': 'OmniOutput',
 'type': 'object'}
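The schema above says that `text` is always present and that `audio` is an optional, nullable URI string. A minimal Python sketch of reading such a response follows; it assumes the response has already been decoded into a plain dictionary. The `OmniOutput` dataclass reuses the schema's title, but it and `parse_output` are hypothetical names for this example, not something a client library provides.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class OmniOutput:
    """Mirrors the output schema: required text, optional audio URI."""

    text: str
    audio: Optional[str] = None


def parse_output(raw: Mapping[str, Any]) -> OmniOutput:
    """Check a decoded response against the schema and wrap it."""
    text = raw.get("text")
    if not isinstance(text, str):
        raise ValueError("'text' is required and must be a string")

    # 'audio' may be missing or null, for example when speech output is off.
    audio = raw.get("audio")
    if audio is not None and not isinstance(audio, str):
        raise ValueError("'audio' must be a URI string or null")

    return OmniOutput(text=text, audio=audio)


if __name__ == "__main__":
    result = parse_output({"text": "A cat is sitting on a sofa.", "audio": None})
    print(result.text)
    if result.audio is not None:
        print("speech available at", result.audio)
```

Treating a missing `audio` key the same as an explicit null keeps the calling code to a single `is not None` check.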