This is a Cog implementation of https://github.com/google-research/maskgit
MaskGIT: Masked Generative Image Transformer
Official Jax Implementation of the CVPR 2022 Paper
Summary
MaskGIT is a novel image synthesis paradigm using a bidirectional transformer decoder. During training, MaskGIT learns to predict randomly masked tokens by attending to tokens in all directions. At inference time, the model begins by generating all tokens of an image simultaneously, and then refines the image iteratively, conditioning each step on the previous generation.
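The inference loop described above can be illustrated with a small toy sketch. It is not the official implementation: `toy_predictor` is a hypothetical stand-in for the bidirectional transformer that returns random tokens and confidences, and `MASK`, `iterative_decode`, and the default sizes are names and values chosen here for illustration. The cosine masking schedule is assumed from the paper's description. Each step guesses every position at once, keeps only the most confident guesses, and leaves the rest masked for the next step.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a masked token position


def toy_predictor(tokens, vocab_size, rng):
    """Hypothetical stand-in for the bidirectional transformer.

    Returns one (token, confidence) guess per position. A real model
    would condition these guesses on the unmasked entries of `tokens`.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def iterative_decode(num_tokens=16, vocab_size=8, steps=4, seed=0):
    """Fill a fully masked token sequence over a fixed number of steps."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens  # start with every token masked

    for step in range(steps):
        # Guess all positions in parallel.
        guesses = toy_predictor(tokens, vocab_size, rng)

        # Cosine schedule (assumed): fraction of tokens that stay masked
        # after this step. It reaches zero on the final step.
        ratio = math.cos(math.pi / 2 * (step + 1) / steps)
        keep_masked = math.floor(ratio * num_tokens)

        # Rank the still-masked positions by confidence, highest first.
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        masked.sort(key=lambda i: guesses[i][1], reverse=True)

        # Commit the most confident guesses; the rest stay masked.
        num_to_fill = max(len(masked) - keep_masked, 0)
        for i in masked[:num_to_fill]:
            tokens[i] = guesses[i][0]

    return tokens


if __name__ == "__main__":
    print(iterative_decode())
```

In this sketch, tokens committed in an earlier step are never revisited, so only the masked positions change from one step to the next. In the real model the resulting token grid would then be passed to an image tokenizer's decoder to produce pixels, which is omitted here.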
BibTeX
@InProceedings{chang2022maskgit,
  title = {MaskGIT: Masked Generative Image Transformer},
  author = {Huiwen Chang and Han Zhang and Lu Jiang and Ce Liu and William T. Freeman},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2022}
}