prunaai/gpt-oss-120b-fast:e994aeeb

Input schema

The fields you can use to run this model with an API. If you don’t provide a value for a field, its default value is used.

| Field | Type | Default value | Constraints | Description |
| --- | --- | --- | --- | --- |
| message | string | `Explain vLLM in one sentence` | | The user message to send to the model |
| system_prompt | string | | | Optional system prompt to set the model's behavior |
| max_tokens | integer | `2048` | min 1, max 16384 | Maximum number of tokens to generate |
| temperature | number | `0.7` | max 2 | Sampling temperature (higher = more creative, lower = more deterministic) |
| top_p | number | `0.95` | max 1 | Nucleus sampling: only consider tokens with cumulative probability up to this value |
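
As an illustration, a request supplying these fields might look like the sketch below, using the Replicate Python client's `replicate.run` entry point. The system prompt text is a placeholder, not a default from this page.

```python
import replicate

# Run the model; any field omitted from `input` falls back to the
# default listed in the table above.
output = replicate.run(
    "prunaai/gpt-oss-120b-fast:e994aeeb",
    input={
        "message": "Explain vLLM in one sentence",
        "system_prompt": "You are a concise technical assistant.",  # placeholder
        "max_tokens": 2048,
        "temperature": 0.7,
        "top_p": 0.95,
    },
)
```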

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
```json
{
  "items": { "type": "string" },
  "title": "Output",
  "type": "array",
  "x-cog-array-display": "concatenate",
  "x-cog-array-type": "iterator"
}
```
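
The schema describes an array of strings delivered incrementally (`x-cog-array-type: iterator`) whose elements are meant to be concatenated (`x-cog-array-display: concatenate`). A minimal sketch of consuming it, assuming the `output` value returned by `replicate.run` in the example above:

```python
# Each element is a chunk of the generated text; join the chunks for
# the full response, or print them as they arrive to stream output.
text = "".join(output)
print(text)
```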