
prunaai /gemma-4-26b-a4b-fast:007dac37

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

| Field | Type | Default value | Range | Description |
|---|---|---|---|---|
| `message` | string | `Explain vLLM in one sentence` | | The user message to send to the model |
| `images` | array | `[]` | | Images to send to the model |
| `video` | string | | | Video file to send to the model |
| `video_fps` | number | `2` | 0.1–30 | Frames per second to sample from the video |
| `system_prompt` | string | | | Optional system prompt to set the model's behavior |
| `max_tokens` | integer | `2048` | 1–16384 | Maximum number of tokens to generate |
| `enable_thinking` | boolean | `False` | | Enable thinking mode (the model reasons internally before answering) |
| `temperature` | number | `0.6` | max 2 | Sampling temperature (higher = more creative, lower = more deterministic) |
| `top_p` | number | `0.95` | max 1 | Nucleus sampling: only consider tokens with cumulative probability up to this value |
| `presence_penalty` | number | `1` | 1–1.5 | Presence penalty: penalize new tokens based on their presence in the text |
| `max_visual_tokens` | integer | `280` | 70–1120 | Vision token budget per image (higher = more detail, more compute). Supported values: 70, 140, 280, 560, 1120 |
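As a sketch of how these fields map onto a request, here is a minimal invocation using the Replicate Python client. The payload values are simply the schema defaults listed above; the actual API call is commented out because it requires a `REPLICATE_API_TOKEN` and network access.

```python
# Model version identifier from this page.
MODEL = "prunaai/gemma-4-26b-a4b-fast:007dac37"

# Any field omitted from the payload falls back to its schema default.
payload = {
    "message": "Explain vLLM in one sentence",
    "max_tokens": 2048,        # 1-16384
    "temperature": 0.6,        # up to 2
    "top_p": 0.95,             # up to 1
    "enable_thinking": False,  # reason internally before answering
}

# Uncomment to make the actual API call:
# import replicate
# output = replicate.run(MODEL, input=payload)
# print("".join(output))
```

`replicate.run` returns the model's output; since this model's output is an iterator of strings, joining the pieces yields the full response.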

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "type": "array",
  "items": { "type": "string" },
  "title": "Output",
  "x-cog-array-type": "iterator",
  "x-cog-array-display": "concatenate"
}
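Because `x-cog-array-type` is `iterator` and the display mode is `concatenate`, the response arrives as a stream of string chunks meant to be joined into one text. A minimal sketch of consuming such an output (the chunk values here are illustrative, not real model output):

```python
def collect(chunks):
    """Join an iterator of string chunks into the full model response."""
    return "".join(chunks)

# Illustrative chunks; a real run streams token pieces from the model.
example = iter(["vLLM is a high-throughput ", "inference engine for LLMs."])
print(collect(example))
```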