Datasets for the evaluation of substitution-tolerant subgraph isomorphism

Pierre Héroux, Pierre Le Bodic, Sébastien Adam

Research output: Chapter in Book/Report/Conference proceeding, Chapter (Book), Research, peer-review

2 Citations (Scopus)


Due to their representative power, structural descriptions have gained great interest in the graphics recognition community. Indeed, graph-based representations have successfully been used for isolated symbol recognition. New challenges in this research field have focused on symbol recognition, symbol spotting, and symbol-based indexing of technical drawings. When they are based on structural descriptions, these tasks can be expressed as a subgraph isomorphism search, which consists in locating an instance of a pattern graph representing a symbol in a target graph representing the whole document image. However, there is a lack of publicly available datasets for evaluating the performance of subgraph isomorphism approaches in the presence of noisy data. In this paper, we present five datasets that can be used to evaluate the performance of algorithms on several tasks involving subgraph isomorphism. Four of these datasets have been synthetically generated and allow the evaluation of the search for a single instance of the pattern, with or without perturbed labels. The fifth dataset corresponds to the structural description of architectural plans and allows the evaluation of the search for multiple occurrences of the pattern. These datasets are made available for download. We also propose several measures to assess performance on each of the tasks.
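To make the task described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' method or the format of their datasets. It assumes directed graphs with one numeric label per node, uses absolute label difference as a hypothetical substitution cost, and ignores edge labels. The names `best_match`, `pattern_nodes`, `target_nodes`, and the toy graphs are all invented for illustration. The search is brute force and only practical for very small graphs.

```python
from itertools import permutations


def best_match(pattern_nodes, pattern_edges, target_nodes, target_edges):
    """Find the cheapest structure-preserving embedding of a pattern in a target.

    pattern_nodes / target_nodes: dict mapping node id -> numeric label.
    pattern_edges / target_edges: set of directed (source, destination) pairs.
    Returns (cost, mapping), or (inf, None) if no embedding exists.
    """
    best_cost, best_map = float("inf"), None
    pattern_ids = list(pattern_nodes)

    # Try every injective assignment of pattern nodes to target nodes.
    for candidate in permutations(target_nodes, len(pattern_ids)):
        mapping = dict(zip(pattern_ids, candidate))

        # Topology is strict: every pattern edge must exist in the target.
        if any((mapping[u], mapping[v]) not in target_edges
               for u, v in pattern_edges):
            continue

        # Labels are tolerant: substitutions are allowed but cost something.
        # (Assumed cost: absolute difference between numeric labels.)
        cost = sum(abs(pattern_nodes[p] - target_nodes[t])
                   for p, t in mapping.items())
        if cost < best_cost:
            best_cost, best_map = cost, mapping

    return best_cost, best_map


# Toy example: a two-node pattern searched in a four-node chain whose
# labels are noisy versions of the pattern labels.
pattern_nodes = {"a": 1.0, "b": 2.0}
pattern_edges = {("a", "b")}
target_nodes = {0: 5.0, 1: 1.1, 2: 2.2, 3: 9.0}
target_edges = {(0, 1), (1, 2), (2, 3)}

cost, mapping = best_match(pattern_nodes, pattern_edges,
                           target_nodes, target_edges)
print(mapping, round(cost, 3))
```

In this sketch the structural constraint is exact while label mismatches only add cost, which is one way to read "substitution-tolerant"; spotting multiple occurrences would require keeping every embedding below a cost threshold rather than only the cheapest one.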

Original language: English
Title of host publication: Graphics Recognition Current Trends and Challenges
Subtitle of host publication: 10th International Workshop, GREC 2013, Bethlehem, PA, USA, August 20-21, 2013, Revised Selected Papers
Editors: Bart Lamiroy, Jean-Marc Ogier
Place of Publication: Berlin, Germany
Number of pages: 12
ISBN (Electronic): 9783662448540
ISBN (Print): 9783662448533
Publication status: Published - 2014
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
