cjwbw / pix2struct

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

  • Public
  • 5.9K runs
  • GitHub
  • Paper
