FACTUAL: A benchmark for faithful and consistent textual scene graph parsing

Zhuang Li, Yuyang Chai, Terry Zhuo, Lizhen Qu, Reza Haffari

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

Abstract

Textual scene graph parsing has become increasingly important in various vision-language applications, including image caption evaluation and image retrieval. However, existing scene graph parsers that convert image captions into scene graphs often suffer from two types of errors. First, the generated scene graphs fail to capture the true semantics of the captions or the corresponding images, resulting in a lack of faithfulness. Second, the generated scene graphs have high inconsistency, with the same semantics represented by different annotations. To address these challenges, we propose a novel dataset, which involves re-annotating the captions in Visual Genome (VG) using a new intermediate representation called FACTUAL-MR. FACTUAL-MR can be directly converted into faithful and consistent scene graph annotations. Our experimental results clearly demonstrate that the parser trained on our dataset outperforms existing approaches in terms of faithfulness and consistency. This improvement leads to a significant performance boost in both image caption evaluation and zero-shot image retrieval tasks. Furthermore, we introduce a novel metric for measuring scene graph similarity, which, when combined with the improved scene graph parser, achieves state-of-the-art (SOTA) results on multiple benchmark datasets for the aforementioned tasks.
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: ACL 2023
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Place of Publication: Stroudsburg PA USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 6377–6390
Number of pages: 14
ISBN (Electronic): 9781959429623
Publication status: Published - 2023
Event: Annual Meeting of the Association for Computational Linguistics 2023 - Toronto, Canada
Duration: 9 Jul 2023 – 14 Jul 2023
Conference number: 61st
https://aclanthology.org/volumes/2023.acl-long/ (Proceedings - 1)
https://aclanthology.org/volumes/2023.findings-acl/ (Proceedings - 2)
https://2023.aclweb.org/ (Website)

Conference

Conference: Annual Meeting of the Association for Computational Linguistics 2023
Abbreviated title: ACL 2023
Country/Territory: Canada
City: Toronto
Period: 9/07/23 – 14/07/23
