Preview: rendered architecture diagrams; the full validated atlas and dataset are coming.

pix2struct:image-text-to-text

multimodal · pix2struct · image · validation pending

Architecture diagram

Rendered with TorchLens / Graphviz

[Figure: pix2struct:image-text-to-text architecture diagram]