mdork / image2video-synthesis-using-cinns

image2video synthesis using cINNs

Stochastic Image-to-Video Synthesis using cINNs

Official PyTorch implementation of Stochastic Image-to-Video Synthesis using cINNs, accepted to CVPR 2021.

Arxiv | Project Page | Supplemental | Pretrained Models | BibTeX

Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Kosta Derpanis*, Björn Ommer*, CVPR 2021

tl;dr We present a framework for both stochastic and controlled image-to-video synthesis. We bridge the gap between the image and video domains using conditional invertible neural networks (cINNs) and account for the inherent ambiguity with a learned, dedicated scene dynamics representation.
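To make the term "conditional invertible neural network" concrete, here is a minimal sketch of a single conditional affine coupling layer, a common building block for such networks. It is an illustration under stated assumptions, not the code from this repository: the class name `ConditionalAffineCoupling`, the dimensions, and the random linear maps standing in for learned sub-networks are all hypothetical, and it uses plain Python lists rather than PyTorch tensors so that it runs without dependencies.

```python
import math
import random


class ConditionalAffineCoupling:
    """Toy conditional affine coupling layer on plain Python lists.

    Hypothetical illustration only: the scale and shift "networks" are
    single random linear maps, not the architecture used in the paper.
    """

    def __init__(self, dim, cond_dim, seed=0):
        assert dim % 2 == 0, "dim must be even so the vector splits in half"
        rng = random.Random(seed)
        self.half = dim // 2
        in_dim = self.half + cond_dim
        # One weight row per output unit for the log-scale and shift maps.
        self.w_scale = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                        for _ in range(self.half)]
        self.w_shift = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                        for _ in range(self.half)]

    def _params(self, x1, cond):
        # Scale and shift depend on the untouched half AND the condition.
        inp = x1 + cond  # list concatenation
        log_s = [math.tanh(sum(w * v for w, v in zip(row, inp)))
                 for row in self.w_scale]
        t = [sum(w * v for w, v in zip(row, inp)) for row in self.w_shift]
        return log_s, t

    def forward(self, x, cond):
        x1, x2 = x[:self.half], x[self.half:]
        log_s, t = self._params(x1, cond)
        y2 = [v * math.exp(s) + b for v, s, b in zip(x2, log_s, t)]
        # The log-determinant of the Jacobian is the sum of the log-scales.
        return x1 + y2, sum(log_s)

    def inverse(self, y, cond):
        y1, y2 = y[:self.half], y[self.half:]
        log_s, t = self._params(y1, cond)  # y1 equals x1, so params match
        x2 = [(v - b) * math.exp(-s) for v, s, b in zip(y2, log_s, t)]
        return y1 + x2


layer = ConditionalAffineCoupling(dim=4, cond_dim=2)
x = [0.3, -1.2, 0.7, 2.0]   # stand-in for a latent vector
cond = [1.0, -0.5]          # stand-in for features of the conditioning image
z, log_det = layer.forward(x, cond)
x_rec = layer.inverse(z, cond)
```

Because the first half of the vector passes through unchanged, the inverse can recompute exactly the same scale and shift, which is what makes the layer invertible for any fixed condition. Stacking several such layers with the halves swapped or permuted in between gives a more expressive bijection.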

Shout-outs

Thanks to everyone who makes their code and models available. In particular,

  • The decoder architecture is inspired by SPADE
  • The great work and code of Stochastic Latent Residual Video Prediction (SRVP)
  • The 3D encoder and discriminator are based on 3D-ResNet, and the spatial discriminator is adapted from PatchGAN
  • The metrics used: LPIPS, PyTorch FID, and FVD

BibTeX

@InProceedings{Dorkenwald_2021_CVPR,
    author    = {Dorkenwald, Michael and Milbich, Timo and Blattmann, Andreas and Rombach, Robin and Derpanis, Konstantinos G. and Ommer, Bjorn},
    title     = {Stochastic Image-to-Video Synthesis Using cINNs},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3742-3753}
}