prunaai /gemma-4-26b-a4b-fast:007dac37
Input schema
The fields you can use to run this model with an API. If you don't give a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| message | string | `Explain vLLM in one sentence` | The user message to send to the model |
| images | array | `[]` | Images to send to the model |
| video | string | | Video file to send to the model |
| video_fps | number | `2` (min: 0.1, max: 30) | Frames per second to sample from the video |
| system_prompt | string | | Optional system prompt to set the model's behavior |
| max_tokens | integer | `2048` (min: 1, max: 16384) | Maximum number of tokens to generate |
| enable_thinking | boolean | `False` | Enable thinking mode (the model reasons internally before answering) |
| temperature | number | `0.6` (max: 2) | Sampling temperature (higher = more creative, lower = more deterministic) |
| top_p | number | `0.95` (max: 1) | Nucleus sampling: only consider tokens with cumulative probability up to this value |
| presence_penalty | number | `1` (min: 1, max: 1.5) | Presence penalty: penalize new tokens based on their presence in the text |
| max_visual_tokens | integer | `280` (min: 70, max: 1120) | Vision token budget per image (higher = more detail, more compute). Supported values: 70, 140, 280, 560, 1120 |
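Since several numeric fields have documented ranges, it can be useful to validate a payload locally before sending it. The sketch below is a hypothetical helper, not part of any official client: the range constants are copied from the table above, and the actual API call (which needs the `replicate` package, a network connection, and an API token) is shown commented out. The table gives no minimum for `temperature` or `top_p`, so negative infinity is assumed there.

```python
# Hypothetical input validator for the schema above (an assumption, not an
# official client feature). Ranges are copied from the input-schema table.
RANGES = {
    "video_fps": (0.1, 30),
    "max_tokens": (1, 16384),
    "temperature": (float("-inf"), 2),   # no documented minimum
    "top_p": (float("-inf"), 1),         # no documented minimum
    "presence_penalty": (1, 1.5),
}
ALLOWED_VISUAL_TOKENS = {70, 140, 280, 560, 1120}


def validate_input(payload: dict) -> dict:
    """Raise ValueError if any numeric field is outside its documented range."""
    for field, (lo, hi) in RANGES.items():
        if field in payload and not (lo <= payload[field] <= hi):
            raise ValueError(f"{field}={payload[field]} outside [{lo}, {hi}]")
    if payload.get("max_visual_tokens", 280) not in ALLOWED_VISUAL_TOKENS:
        raise ValueError("max_visual_tokens must be one of 70, 140, 280, 560, 1120")
    return payload


inputs = validate_input({
    "message": "Explain vLLM in one sentence",
    "max_tokens": 2048,
    "temperature": 0.6,
    "top_p": 0.95,
})

# With the Replicate Python client (requires an API token):
# import replicate
# output = replicate.run("prunaai/gemma-4-26b-a4b-fast:007dac37", input=inputs)
```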
Output schema
The shape of the response you’ll get when you run this model with an API.
```json
{
  "type": "array",
  "items": {"type": "string"},
  "title": "Output",
  "x-cog-array-type": "iterator",
  "x-cog-array-display": "concatenate"
}
```
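The `iterator` array type with `concatenate` display means the model streams its response as string chunks that should be joined into one text. A minimal sketch of consuming such an output, using a simulated in-memory iterator in place of a real API response:

```python
def consume(chunks) -> str:
    """Concatenate streamed string chunks into the final response text."""
    text = ""
    for chunk in chunks:  # a real client yields chunks as they arrive
        text += chunk
    return text


# Simulated stream standing in for the model's iterator output.
simulated = iter(["vLLM is a high-throughput ", "inference engine ", "for LLMs."])
print(consume(simulated))
```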