cjwbw / instructcv

Instruction-tuned text-to-image diffusion models as vision generalists

  • Public
  • 356 runs
  • L40S
  • Prediction

    cjwbw/instructcv:3258454a
    ID
    4io4o3dbkmossdei7vqlkn4vha
    Status
    Succeeded
    Source
    Web
    Hardware
    A40 (Large)

    Input

    image
    image
    instruction
    Detect Berkeley's Sather tower.
    text_guidance_scale
    7.5
    image_guidance_scale
    1.5

    Output

    output
  • Prediction

    cjwbw/instructcv:3258454a
    ID
    fru3ssdbqyregt4dq7b4cyq6ae
    Status
    Succeeded
    Source
    Web
    Hardware
    A40 (Large)

    Input

    image
    image
    instruction
    Create a monocular depth map
    num_inference_steps
    50
    text_guidance_scale
    7.5
    image_guidance_scale
    1.5

    Output

    output
  • Prediction

    cjwbw/instructcv:3258454a
    ID
    yxc5tz3bgvgmloo2nnlugdyjny
    Status
    Succeeded
    Source
    Web
    Hardware
    A40 (Large)

    Input

    image
    image
    instruction
    Segment all trees
    num_inference_steps
    50
    text_guidance_scale
    7.5
    image_guidance_scale
    1.5

    Output

    output
  • Prediction

    cjwbw/instructcv:3258454a
    ID
    yt5uvqdbhhzo3wjoellavpaklq
    Status
    Succeeded
    Source
    Web
    Hardware
    A40 (Large)

    Input

    image
    image
    instruction
    Detect the great dome
    num_inference_steps
    50
    text_guidance_scale
    7.5
    image_guidance_scale
    1.5

    Output

    output

Want to make some of these yourself?

Run this model
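
The example predictions above all use the same input fields (image, instruction, text_guidance_scale, image_guidance_scale, and usually num_inference_steps). As a rough sketch, those inputs could be assembled in Python as shown below. The helper name build_instructcv_input and the file name campus.jpg are hypothetical, and the values 7.5 and 1.5 are simply the ones that appear in every example here, not confirmed model defaults.

```python
from typing import Any, Dict, Optional


def build_instructcv_input(
    image: str,
    instruction: str,
    text_guidance_scale: float = 7.5,   # value seen in all examples; assumed, not a documented default
    image_guidance_scale: float = 1.5,  # value seen in all examples; assumed, not a documented default
    num_inference_steps: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an input payload using the field names shown on this page.

    Hypothetical helper for illustration only; it does not call any API.
    """
    if not instruction.strip():
        raise ValueError("instruction must be a non-empty string")
    if num_inference_steps is not None and num_inference_steps <= 0:
        raise ValueError("num_inference_steps must be positive")

    payload: Dict[str, Any] = {
        "image": image,
        "instruction": instruction,
        "text_guidance_scale": text_guidance_scale,
        "image_guidance_scale": image_guidance_scale,
    }
    # The first example on this page omits num_inference_steps,
    # so the key is only included when a value is given.
    if num_inference_steps is not None:
        payload["num_inference_steps"] = num_inference_steps
    return payload


# Mirrors the "Segment all trees" example above (image path is a placeholder).
payload = build_instructcv_input(
    "campus.jpg", "Segment all trees", num_inference_steps=50
)
print(payload)
```

With the Replicate Python client installed and an API token configured, a payload like this would typically be passed as replicate.run("<model>:<full version id>", input=payload). The page shows only the abbreviated version 3258454a, so the full version identifier has to be taken from the model page itself.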