cjwbw / pix2struct

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

  • Public
  • 6.1K runs
  • A100 (80GB)
Version e32d7748 (latest)