
prunaai/gemma-4-26b-a4b-fast:6992a45e

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value is used.

| Field | Type | Default | Range | Description |
|---|---|---|---|---|
| message | string | "Explain vLLM in one sentence" | — | The user message to send to the model |
| image | string | — | — | Image file to send to the model |
| video | string | — | — | Video file to send to the model |
| video_fps | number | 2 | 0.1–30 | Frames per second to sample from the video |
| system_prompt | string | — | — | Optional system prompt to set the model's behavior |
| max_tokens | integer | 2048 | 1–16384 | Maximum number of tokens to generate |
| enable_thinking | boolean | false | — | Enable thinking mode (the model reasons internally before answering) |
| temperature | number | 0.6 | ≤ 2 | Sampling temperature (higher = more creative, lower = more deterministic) |
| top_p | number | 0.95 | ≤ 1 | Nucleus sampling: only consider tokens with cumulative probability up to this value |
| presence_penalty | number | 1 | 1–1.5 | Presence penalty: penalizes tokens already present in the text |
| max_visual_tokens | integer | 280 | 70–1120 | Vision token budget per image (higher = more detail, more compute); supported values: 70, 140, 280, 560, 1120 |
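The numeric bounds in the schema can be checked client-side before sending a request, which avoids a round trip on an invalid payload. A minimal sketch, assuming the bounds listed above; the `validate_input` helper and the `BOUNDS` table are illustrative, not part of any client library:

```python
# (min, max) per numeric field, taken from the input schema above.
# None means the schema states no bound on that side.
BOUNDS = {
    "video_fps": (0.1, 30),
    "max_tokens": (1, 16384),
    "temperature": (None, 2),
    "top_p": (None, 1),
    "presence_penalty": (1, 1.5),
    "max_visual_tokens": (70, 1120),
}

# max_visual_tokens additionally only accepts these discrete values.
ALLOWED_VISUAL_TOKENS = {70, 140, 280, 560, 1120}

def validate_input(payload):
    """Return a list of constraint violations (empty if payload is valid)."""
    errors = []
    for field, (lo, hi) in BOUNDS.items():
        if field not in payload:
            continue  # omitted fields fall back to their defaults
        value = payload[field]
        if lo is not None and value < lo:
            errors.append(f"{field}={value} is below the minimum {lo}")
        if hi is not None and value > hi:
            errors.append(f"{field}={value} is above the maximum {hi}")
    if ("max_visual_tokens" in payload
            and payload["max_visual_tokens"] not in ALLOWED_VISUAL_TOKENS):
        errors.append("max_visual_tokens must be one of 70, 140, 280, 560, 1120")
    return errors

# A payload using only in-range values passes; an out-of-range one does not.
payload = {"message": "Explain vLLM in one sentence",
           "max_tokens": 2048, "temperature": 0.6}
assert validate_input(payload) == []
assert validate_input({"temperature": 3.5})  # above Max: 2
```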

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{
  "items": {"type": "string"},
  "title": "Output",
  "type": "array",
  "x-cog-array-display": "concatenate",
  "x-cog-array-type": "iterator"
}
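Per the schema, the output is an iterator of strings (`x-cog-array-type: iterator`) that the client is expected to concatenate (`x-cog-array-display: concatenate`) to recover the full response text. A minimal sketch of that assembly step; the `chunks` list below stands in for a real stream:

```python
def assemble(chunks):
    """Concatenate streamed output chunks, in order, into the final text."""
    return "".join(chunks)

# Simulated stream of string chunks as the iterator output would yield them.
chunks = ["vLLM is a high-throughput ", "inference engine ", "for LLMs."]
print(assemble(chunks))  # -> vLLM is a high-throughput inference engine for LLMs.
```

With the official Replicate Python client, the same join applies to the iterable returned by running the model, e.g. `"".join(output)`.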