Pluralistic free-form image completion

Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai

Research output: Contribution to journal › Article › Research › peer-review


Image completion involves filling the missing regions of an image with plausible content. Current image completion methods produce only one result for a given masked image, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion: the task of generating multiple and diverse plausible solutions for free-form image completion. A major challenge faced by learning-based approaches is that there is usually only one ground-truth training instance per label for this multi-output problem. To overcome this, we propose a novel, probabilistically principled framework with two parallel paths. One is a reconstructive path that uses the single ground truth to obtain a prior distribution over the missing patches, and rebuilds the original image from this distribution. The other is a generative path, for which the conditional prior is coupled to the distribution obtained in the reconstructive path. Both paths are supported by adversarial learning. We also introduce a new short+long term patch attention layer that exploits distant relations among decoder and encoder features to improve appearance consistency between the original visible regions and the newly generated regions. Experiments show that our method not only yields better results than existing state-of-the-art methods on various datasets, but also provides multiple and diverse outputs.
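The short+long term patch attention described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy version, not the paper's exact layer: it models "short term" as self-attention over decoder features and "long term" as cross-attention from decoder features to encoder (visible-region) features, with the two outputs concatenated along the channel axis. All function names and the concatenation choice are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, context):
    # scaled dot-product attention: each query position pools over context
    scores = query @ context.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ context

def short_long_term_attention(decoder_feat, encoder_feat):
    """Toy sketch (not the paper's formulation):
    short term  = self-attention within decoder features,
    long term   = cross-attention to encoder (visible-region) features."""
    short = attend(decoder_feat, decoder_feat)   # relations within generated regions
    long_ = attend(decoder_feat, encoder_feat)   # relations to visible regions
    return np.concatenate([short, long_], axis=-1)

dec = np.random.rand(16, 8)   # 16 flattened spatial positions, 8 channels
enc = np.random.rand(16, 8)
out = short_long_term_attention(dec, enc)
print(out.shape)  # (16, 16)
```

In the actual model, both attention maps would be learned jointly with the two-path training, but the sketch shows how distant decoder and encoder features can be related in a single layer.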

Original language: English
Pages (from-to): 2786-2805
Number of pages: 20
Journal: International Journal of Computer Vision
Publication status: Published - 30 Jul 2021


  • Conditional variational auto-encoders
  • Image completion
  • Image generation
  • Multi-modal generative models
