Abstract
Generating an image from its description is a challenging task worth solving because of its numerous practical applications, ranging from image editing to virtual reality. All existing methods use a single caption to generate a plausible image. A single caption by itself can be limited and may not capture the variety of concepts and behavior present in the image. We propose two deep generative models that generate an image by making use of multiple captions describing it. This is achieved by ensuring ‘Cross-Caption Cycle Consistency’ between the multiple captions and the generated image(s). We report quantitative and qualitative results on the standard Caltech-UCSD Birds (CUB) and Oxford-102 Flowers datasets to validate the efficacy of the proposed approach.
Original language | English |
---|---|
Title of host publication | Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019 |
Editors | Michael Brown, Yanxi Liu, Peyman Milanfar |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 358-366 |
Number of pages | 9 |
ISBN (Electronic) | 9781728119755 |
ISBN (Print) | 9781728119762 |
Publication status | Published - 2019 |
Externally published | Yes |
Event | IEEE Winter Conference on Applications of Computer Vision 2019 - Waikoloa Village, United States of America; Duration: 7 Jan 2019 → 11 Jan 2019; Conference number: 19th; Website: https://wacv19.wacv.net/; Proceedings: https://ieeexplore.ieee.org/xpl/conhome/8642793/proceeding |
Conference
Conference | IEEE Winter Conference on Applications of Computer Vision 2019 |
---|---|
Abbreviated title | WACV 2019 |
Country/Territory | United States of America |
City | Waikoloa Village |
Period | 7/01/19 → 11/01/19 |