Language independent neuro-symbolic semantic parsing for form understanding

Bhanu Prakash Voutharoja, Lizhen Qu, Fatemeh Shiri

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

Abstract

Recent works on form understanding mostly employ multimodal transformers or large-scale pre-trained language models. These models need ample data for pre-training. In contrast, humans can usually identify key-value pairings from a form only by looking at layouts, even if they don’t comprehend the language used. No prior research has been conducted to investigate how helpful layout information alone is for form understanding. Hence, we propose a unique entity-relation graph parsing method for scanned forms called LaGNN, a language-independent Graph Neural Network model. Our model parses a form into a word-relation graph in order to identify entities and relations jointly and reduce the time complexity of inference. This graph is then transformed by deterministic rules into a fully connected entity-relation graph. Our model simply takes into account relative spacing between bounding boxes from layout information to facilitate easy transfer across languages. To further improve the performance of LaGNN, and achieve isomorphism between entity-relation graphs and word-relation graphs, we use integer linear programming (ILP) based inference. Code is publicly available at https://github.com/Bhanu068/LAGNN.
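The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the linked repository). It assumes boxes are given as `(x0, y0, x1, y1)` tuples and that word-level edges carry one of two hypothetical labels, `"same_entity"` and `"link"`. The function names `relative_spacing` and `words_to_entities` are invented for illustration, and the ILP-based inference step is omitted. The sketch shows a language-free spacing feature between two bounding boxes, and one possible deterministic rule for collapsing a word-relation graph into an entity-relation graph.

```python
from collections import defaultdict


def relative_spacing(box_a, box_b):
    """Signed horizontal and vertical gaps from box_a to box_b.

    Boxes are (x0, y0, x1, y1). A gap is 0 when the boxes overlap on that
    axis, positive when box_b lies right of / below box_a, negative otherwise.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    def gap(a_lo, a_hi, b_lo, b_hi):
        if b_lo >= a_hi:
            return b_lo - a_hi
        if a_lo >= b_hi:
            return -(a_lo - b_hi)
        return 0

    return gap(ax0, ax1, bx0, bx1), gap(ay0, ay1, by0, by1)


def words_to_entities(num_words, word_edges):
    """Collapse a word-relation graph into entities and entity relations.

    Assumed rule (hypothetical): words joined by "same_entity" edges form one
    entity; a "link" edge between words of different entities becomes a
    relation between those entities.
    """
    parent = list(range(num_words))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for u, v, label in word_edges:
        if label == "same_entity":
            parent[find(u)] = find(v)

    groups = defaultdict(list)
    for w in range(num_words):
        groups[find(w)].append(w)

    entities = sorted(groups.values())
    entity_of = {w: i for i, ws in enumerate(entities) for w in ws}
    relations = sorted(
        {
            (entity_of[u], entity_of[v])
            for u, v, label in word_edges
            if label == "link" and entity_of[u] != entity_of[v]
        }
    )
    return entities, relations


# Four words on a form: words 0-1 form a key, words 2-3 form its value.
edges = [(0, 1, "same_entity"), (2, 3, "same_entity"), (1, 2, "link")]
entities, relations = words_to_entities(4, edges)
spacing = relative_spacing((0, 0, 10, 5), (15, 0, 30, 5))
```

Because the features use only box geometry and the grouping uses only edge labels, nothing in the sketch depends on the text content of the words, which is the property the abstract relies on for cross-language transfer.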

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2023: 17th International Conference, San José, CA, USA, August 21–26, 2023, Proceedings, Part II
Editors: Gernot A. Fink, Rajiv Jain, Koichi Kise, Richard Zanibbi
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 130-146
Number of pages: 17
ISBN (Electronic): 9783031416798
ISBN (Print): 9783031416781
DOIs
Publication status: Published - 2023
Event: International Conference on Document Analysis and Recognition (ICDAR) 2023 - San Jose, United States of America
Duration: 21 Aug 2023 → 26 Aug 2023
Conference number: 17th
https://link.springer.com/book/10.1007/978-3-031-41679-8 (Proceedings - 2)
https://icdar2023.org/ (Website)

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 14188
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Document Analysis and Recognition (ICDAR) 2023
Abbreviated title: ICDAR 2023
Country/Territory: United States of America
City: San Jose
Period: 21/08/23 → 26/08/23
Internet address

Keywords

  • Deep Learning
  • Document Layout Analysis
  • Graph Neural Network
  • Language Independent
