nvidia/nemotron-3-nano-30b-a3b:135b4a9c | Run with an API on Replicate

You're looking at a specific version of this model.

These are the fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field	Type	Default value	Description
prompt	string		Input prompt for the model
system_prompt	string		System prompt to guide model behavior (optional)
max_new_tokens	integer	256 (min: 1, max: 8192)	Maximum number of tokens to generate
temperature	number	1 (max: 2)	Temperature for sampling. Use 1.0 for reasoning tasks, 0.6 for tool calling
top_p	number	1 (max: 1)	Top-p (nucleus) sampling. Use 1.0 for reasoning tasks, 0.95 for tool calling
top_k	integer	50 (max: 100)	Top-k sampling. Lower values make output more focused
repetition_penalty	number	1.1 (min: 1, max: 2)	Penalty for repeating tokens. Higher values reduce repetition
enable_thinking	boolean	True	Enable reasoning/thinking mode for complex problems. Set to False for greedy search
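
As an illustration of how the defaults and ranges in the table combine, here is a minimal Python sketch that builds an input payload and checks it against the documented bounds. The names `DEFAULTS`, `BOUNDS`, and `build_input` are hypothetical helpers, not part of the Replicate API; where the table lists no minimum, the sketch assumes none is enforced.

```python
# Documented defaults for the tunable fields (taken from the table above).
DEFAULTS = {
    "max_new_tokens": 256,
    "temperature": 1.0,
    "top_p": 1.0,
    "top_k": 50,
    "repetition_penalty": 1.1,
    "enable_thinking": True,
}

# Documented (min, max) bounds; None means the table lists no bound.
BOUNDS = {
    "max_new_tokens": (1, 8192),
    "temperature": (None, 2),
    "top_p": (None, 1),
    "top_k": (None, 100),
    "repetition_penalty": (1, 2),
}


def build_input(prompt, system_prompt=None, **overrides):
    """Hypothetical helper: merge overrides into the defaults and range-check them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {**DEFAULTS, **overrides, "prompt": prompt}
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    for name, (lo, hi) in BOUNDS.items():
        value = payload[name]
        if lo is not None and value < lo:
            raise ValueError(f"{name}={value} is below the minimum {lo}")
        if hi is not None and value > hi:
            raise ValueError(f"{name}={value} is above the maximum {hi}")
    return payload


# Tool-calling settings suggested in the field descriptions.
tool_input = build_input("List the files in /tmp", temperature=0.6, top_p=0.95)
```

A payload like `tool_input` could then be passed as the `input` argument of a client call such as `replicate.run(...)` from the Replicate Python client, together with the full model version identifier, which is not reproduced here.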

This is the shape of the response you’ll get when you run this model with an API.

Schema

{'items': {'type': 'string'},
 'title': 'Output',
 'type': 'array',
 'x-cog-array-display': 'concatenate',
 'x-cog-array-type': 'iterator'}
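
The schema describes an array of strings delivered as an iterator and displayed by concatenation, so a caller would typically join the chunks to recover the full text. A minimal Python sketch, assuming the chunks arrive as an ordinary iterable of strings; `collect_output` and `fake_stream` are hypothetical names, and the chunk text is made up for illustration.

```python
from typing import Iterable, Iterator


def collect_output(chunks: Iterable[str]) -> str:
    """Concatenate streamed string chunks into the full response text."""
    return "".join(chunks)


def fake_stream() -> Iterator[str]:
    # Stand-in for the model's output iterator; not real model output.
    yield "Hello"
    yield ", "
    yield "world"


text = collect_output(fake_stream())
```

Joining once at the end avoids repeated string copies; a caller who wants to show partial output can instead print each chunk as it arrives.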