
nvidia/nemotron-nano-v2-12b-vl:f4559446

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

| Field | Type | Default value | Range | Description |
| --- | --- | --- | --- | --- |
| top_p | number | 1 | Max: 1 | Nucleus sampling top-p |
| video | string | | | Input video file (provide either images or video, not both) |
| images | array | | | List of input images (1-4 images supported) |
| prompt | string | Describe what you see in detail. | | Text prompt or question about the media |
| video_fps | integer | 1 | Min: 1, Max: 30 | Frames per second to extract from video (only used for video input) |
| temperature | number | 0 | Max: 2 | Sampling temperature (0 for greedy decoding) |
| system_prompt | string | /no_think | | System prompt (/no_think disables chain-of-thought reasoning) |
| max_new_tokens | integer | | Min: 1, Max: 2048 | Maximum number of tokens to generate (default: 1024 for images, 128 for videos) |
| repetition_penalty | number | 1 | Min: 1, Max: 2 | Repetition penalty to reduce repetitive text (1.0 = no penalty) |
| video_pruning_rate | number | 0.75 | Max: 1 | Video pruning rate for efficiency (0.0 = no pruning, 1.0 = max pruning; only used for video) |
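
The field constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the model's API: `build_input` and `RANGES` are hypothetical names, and the sketch assumes that exactly one of `images` or `video` must be supplied (the schema only states "not both"). Fields left out of the payload fall back to the server-side defaults.

```python
from typing import Any, Optional, Sequence

# Ranges copied from the input schema; None means the schema lists
# no limit on that side.
RANGES = {
    "top_p": (None, 1),
    "video_fps": (1, 30),
    "temperature": (None, 2),
    "max_new_tokens": (1, 2048),
    "repetition_penalty": (1, 2),
    "video_pruning_rate": (None, 1),
}


def build_input(
    prompt: str = "Describe what you see in detail.",
    images: Optional[Sequence[str]] = None,
    video: Optional[str] = None,
    **options: Any,
) -> dict:
    """Hypothetical helper: assemble and check an input payload."""
    # Assumption: exactly one media source is required.
    if (images is None) == (video is None):
        raise ValueError("provide exactly one of images or video")
    if images is not None and not 1 <= len(images) <= 4:
        raise ValueError("images must contain 1 to 4 items")

    payload: dict = {"prompt": prompt}
    if images is not None:
        payload["images"] = list(images)
    else:
        payload["video"] = video

    for name, value in options.items():
        if name == "system_prompt":
            payload[name] = value  # free-form string, no range to check
            continue
        if name not in RANGES:
            raise ValueError(f"unknown field: {name}")
        low, high = RANGES[name]
        if low is not None and value < low:
            raise ValueError(f"{name} must be >= {low}")
        if high is not None and value > high:
            raise ValueError(f"{name} must be <= {high}")
        payload[name] = value
    return payload
```

For example, `build_input(video="clip.mp4", video_fps=2, max_new_tokens=256)` returns a dictionary that could be passed as the `input` of a request, while `video_fps=31` raises a `ValueError`. The file name here is a placeholder.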

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"title": "Output", "type": "string"}
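
Because the output is a single string, client code can usually use it directly. The sketch below is a small defensive wrapper under one assumption: that some clients may deliver the string in chunks (for example when streaming). `output_to_text` is a hypothetical name, not part of any client library.

```python
from typing import Iterable, Union


def output_to_text(output: Union[str, Iterable[str]]) -> str:
    """Return the model output as one string."""
    if isinstance(output, str):
        return output
    # Assumption: a streaming client may yield the string in pieces.
    return "".join(output)
```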