bytedance / sa2va-8b-image

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

  • Public
  • 298 runs
  • L40S
  • GitHub
  • Weights
  • Paper
  • License
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    8bma77cdyxrm80cn5g5tecbpcm
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    what is the giraffe doing?

    Output

    The giraffe is eating leaves from a tree.<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    rd2vw06bm5rma0cn5g5rmesp78
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created
    by @lucataco

    Input

    image
    image
    instruction
    segment the giraffe

    Output

    img

    img

    response

    Sure, [SEG] .<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    d6jbr991a1rma0cn5g6av83wvg
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created
    by @lucataco

    Input

    image
    image
    instruction
    caption this

    Output

    A snowboarder in a white jacket and red helmet is riding down a snowy slope, skillfully navigating the terrain.<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    4zhmqqvdkxrm80cn5g6aqpxcrm
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    segment the snowboarder

    Output

    img

    img

    response

    Sure, [SEG] .<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    10md0xdcjhrma0cn5g68mqfhx0
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    segment the dinosaur to the left of the sunglasses

    Output

    img

    img

    response

    Sure, [SEG] .<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    8n9kqjpj1xrma0cn5g6bvs479m
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    how many feathers are there?

    Output

    There are two feathers in the image.<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    q14aa81ek5rme0cn5g6vvp5h4g
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    how many ducks are there?

    Output

    There is one yellow rubber duck in the drawer.<|im_end|>
    Generated in
  • Prediction

    bytedance/sa2va-8b-image:956baf05a8a81ab47f1d0dac8eab6585b899790f342975a964840c4e9c63c7aa
    ID
    0aft2enjkhrme0cn5g6rtbma2g
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    please segment the yellow duck

    Output

    img

    img

    response

    Sure, [SEG] .<|im_end|>
    Generated in

Want to make some of these yourself?

Run this model