Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| input_image | string | | Input image. |
| input_audio | string | | Input audio. |
| input_video | string | | Input video. |
| task_type | None | Image Captioning | Choose a task. |
| instruction | string | | Provide a question for the VQA task, a region for the Visual Grounding task, or an instruction for General tasks. The default instruction for the Captioning task is 'What does the image/video/audio describe?' |
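
As a rough illustration of how these fields combine into a request, the sketch below builds an input payload and omits any unset field so that the API falls back to its defaults. The helper `build_input`, the `SCHEMA_FIELDS` tuple, and the example URL are hypothetical and not part of the model's API. It also assumes that media inputs are passed as URL strings, which the schema (type `string`) does not confirm.

```python
from typing import Optional

# Field names are taken from the input schema table above.
# The helper itself is illustrative and not part of the model's API.
SCHEMA_FIELDS = ("input_image", "input_audio", "input_video", "task_type", "instruction")


def build_input(**fields: Optional[str]) -> dict:
    """Return a payload containing only the fields that were set.

    Fields left out (or passed as None) are dropped so the API applies
    its defaults, e.g. task_type falls back to "Image Captioning".
    """
    unknown = set(fields) - set(SCHEMA_FIELDS)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    return {name: value for name, value in fields.items() if value is not None}


payload = build_input(
    input_image="https://example.com/photo.jpg",  # placeholder URL (assumption)
    instruction=None,  # unset: the default captioning instruction applies
)
print(payload)
```

The resulting dictionary is what you would pass as the model input through whichever API client you use. Rejecting unknown keys up front catches misspelled field names before a request is sent.
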
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{'title': 'Output'}