bytedance / sa2va-4b-image

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos (Updated 4 months ago)

  • Public
  • 30 runs
  • GitHub
  • Weights
  • Paper
  • License
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    688a7frg6nrme0cn5g1a7z9pmr
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    segment the snowboarder

    Output

    img

    img

    response

    Sure, [SEG].<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    hc4w2m2vg9rme0cn5g1s6jh1ym
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    caption this

    Output

    A snowboarder is captured mid-air, performing a jump on a snowy slope. The snowboarder is wearing a white jacket and black pants, with a red snowboard beneath them. The scene is set against a backdrop of a cloudy sky and snow-covered mountains, creating a visually stunning and adventurous atmosphere.<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    qj3xwqxeydrm80cn5g1vff1jgc
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    what kind of animal is it?

    Output

    The image features a giraffe, a tall and long-necked animal native to Africa.<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    2tz4kzppg1rma0cn5g1v7ngx4r
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    what is the giraffe doing?

    Output

    The giraffe is standing next to a tree, eating leaves from the tree branches. It is using its long neck to reach the leaves, which is a typical behavior for giraffes as they feed on the leaves, flowers, and fruits of trees and bushes in their natural habitat.<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    vwh1qc1z0xrma0cn5g28fjzxjr
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    segment the dinosaur to the left of the sunglasses

    Output

    img

    img

    response

    Sure, [SEG].<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    43d2wf2yenrme0cn5g2bbj9qsw
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    how many feathers are there?

    Output

    There are two feathers in the image.<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    7v61mpn50drmc0cn5g2ab4cnf0
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    segment the yellow duck

    Output

    img

    img

    response

    Sure, [SEG].<|end|>
    Generated in
  • Prediction

    bytedance/sa2va-4b-image:ccf8e58352ce5b2a6cbf5cb68b947ca3bc62009ce66e82071eeae14555414411
    ID
    6229hx66xsrma0cn5g29tb0qwr
    Status
    Succeeded
    Source
    Web
    Hardware
    L40S
    Total duration
    Created

    Input

    image
    image
    instruction
    how many ducks are there?

    Output

    There is one yellow rubber duck in the image.<|end|>
    Generated in

Want to make some of these yourself?

Run this model