Abstract
Recent work on form understanding mostly employs multimodal transformers or large-scale pre-trained language models, which need ample data for pre-training. In contrast, humans can usually identify key-value pairs in a form just by looking at its layout, even if they do not understand the language used. No prior research has investigated how helpful layout information alone is for form understanding. Hence, we propose a unique entity-relation graph parsing method for scanned forms called LaGNN, a language-independent Graph Neural Network model. Our model parses a form into a word-relation graph in order to identify entities and relations jointly and to reduce the time complexity of inference. This graph is then transformed by deterministic rules into a fully connected entity-relation graph. To facilitate easy transfer across languages, our model considers only the relative spacing between bounding boxes from the layout information. To further improve the performance of LaGNN and to achieve isomorphism between entity-relation graphs and word-relation graphs, we use integer linear programming (ILP) based inference. Code is publicly available at https://github.com/Bhanu068/LAGNN.
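The abstract does not spell out the deterministic rules that turn the word-relation graph into an entity-relation graph. As a rough illustration only, the sketch below assumes two hypothetical edge types between words: `same_entity_edges` (two words belong to one entity) and `link_edges` (for example, a key word linked to a value word). Words joined by same-entity edges are merged into entities, and each link edge is lifted to a relation between the corresponding entities. The function name, the edge types, and the merging rule are assumptions made for this example and are not taken from the LaGNN code.

```python
from collections import defaultdict


def words_to_entities(num_words, same_entity_edges, link_edges):
    """Collapse a word-level graph into an entity-level graph.

    Illustrative sketch only: the edge types and merging rule are
    assumptions, not the rules used by LaGNN.
    """
    parent = list(range(num_words))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Rule 1 (assumed): words joined by a same-entity edge form one entity.
    for a, b in same_entity_edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as root so entity order is stable.
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for w in range(num_words):
        groups[find(w)].append(w)

    roots = sorted(groups)
    entity_of_root = {r: k for k, r in enumerate(roots)}
    entities = [groups[r] for r in roots]

    # Rule 2 (assumed): a word-level link becomes an entity-level relation,
    # unless both words ended up in the same entity.
    relations = set()
    for a, b in link_edges:
        ea, eb = entity_of_root[find(a)], entity_of_root[find(b)]
        if ea != eb:
            relations.add((ea, eb))

    return entities, sorted(relations)
```

For a form fragment with the words "Name", ":", "Jane", "Doe" (indices 0 to 3), calling `words_to_entities(4, [(0, 1), (2, 3)], [(1, 2)])` groups the words into two entities and lifts the single word-level link to one relation between them. Because both steps are deterministic, the entity-level graph is fully determined by the word-level predictions.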
Original language | English |
---|---|
Title of host publication | Document Analysis and Recognition – ICDAR 2023 - 17th International Conference, San José, CA, USA, August 21–26, 2023, Proceedings, Part II |
Editors | Gernot A. Fink, Rajiv Jain, Koichi Kise, Richard Zanibbi |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 130-146 |
Number of pages | 17 |
ISBN (Electronic) | 9783031416798 |
ISBN (Print) | 9783031416781 |
Publication status | Published - 2023 |
Event | International Conference on Document Analysis and Recognition (ICDAR) 2023 - San Jose, United States of America; Duration: 21 Aug 2023 → 26 Aug 2023; Conference number: 17th; https://link.springer.com/book/10.1007/978-3-031-41679-8 (Proceedings - 2); https://icdar2023.org/ (Website) |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 14188 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | International Conference on Document Analysis and Recognition (ICDAR) 2023 |
---|---|
Abbreviated title | ICDAR 2023 |
Country/Territory | United States of America |
City | San Jose |
Period | 21/08/23 → 26/08/23 |
Keywords
- Deep Learning
- Document Layout Analysis
- Graph Neural Network
- Language Independent