Abstract
To advance and better understand multilingual image search, we leverage visual object detection and propose a model with diverse multi-head attention that learns grounded multilingual multimodal representations. Specifically, our model attends to different types of textual semantics in two languages and to visual objects, yielding fine-grained alignments between sentences and images. We introduce a new objective function that explicitly encourages attention diversity in order to learn an improved visual-semantic embedding space. We evaluate our model on the German-Image and English-Image matching tasks on the Multi30K dataset, and on the Semantic Textual Similarity task with English descriptions of visual content. Results show that our model yields a significant performance gain over other methods in all three tasks.
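The idea of encouraging attention diversity can be illustrated with a minimal sketch. This is not the objective function from the paper, whose exact form is not given in the abstract. It assumes each attention head pools a set of feature vectors (for example, detected objects or word embeddings) into one unit-length embedding, and it penalises heads whose embeddings point in similar directions. The names `multi_head_embed`, `diversity_penalty`, and `head_queries` are hypothetical.

```python
import numpy as np

def multi_head_embed(features, head_queries):
    """Pool features (n_items, dim) into one embedding per head (n_heads, dim).

    Assumption: each head owns a query vector and uses scaled dot-product
    attention over the items; the paper's actual attention may differ.
    """
    scores = head_queries @ features.T / np.sqrt(features.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over items
    heads = weights @ features
    norms = np.linalg.norm(heads, axis=1, keepdims=True)
    return heads / np.maximum(norms, 1e-12)  # unit-length head embeddings

def diversity_penalty(heads):
    """Mean squared cosine similarity between distinct heads.

    Illustrative stand-in for a diversity term: it is 0 when the heads are
    mutually orthogonal and 1 when all heads are identical.
    """
    n_heads = heads.shape[0]
    gram = heads @ heads.T
    off_diagonal = gram - np.diag(np.diag(gram))
    return float((off_diagonal ** 2).sum() / (n_heads * (n_heads - 1)))
```

In a training setup of this kind, a penalty like this would typically be added, with a weighting factor, to the image-sentence matching loss so that different heads attend to different aspects of the input.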
Original language | English |
---|---|
Title of host publication | EMNLP-IJCNLP 2019, 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing |
Subtitle of host publication | Proceedings of the Conference |
Editors | Jing Jiang, Vincent Ng, Xiaojun Wan |
Place of Publication | Stroudsburg PA USA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 1461-1467 |
Number of pages | 7 |
ISBN (Electronic) | 9781950737901 |
Publication status | Published - 2019 |
Event | Joint Conference on Empirical Methods in Natural Language Processing and International Joint Conference on Natural Language Processing 2019, Hong Kong, China; Duration: 3 Nov 2019 → 7 Nov 2019; Conference number: 9th; Website: https://www.emnlp-ijcnlp2019.org; Proceedings: https://www.aclweb.org/anthology/volumes/D19-1/ |
Conference
Conference | Joint Conference on Empirical Methods in Natural Language Processing and International Joint Conference on Natural Language Processing 2019 |
---|---|
Abbreviated title | EMNLP-IJCNLP 2019 |
Country/Territory | China |
City | Hong Kong |
Period | 3 Nov 2019 → 7 Nov 2019 |