Splitting Vs. Merging: mining object regions with discrepancy and intersection loss for weakly supervised semantic segmentation

Tianyi Zhang, Guosheng Lin, Weide Liu, Jianfei Cai, Alex Kot

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

18 Citations (Scopus)


In this paper, we focus on weakly-supervised semantic segmentation with image-level labels. Since pixel-level annotations are not available during training, we rely on region mining models to estimate pseudo-masks from the image-level labels. Thus, to improve the final segmentation results, we aim to train a region mining model that accurately and completely highlights the target object regions, so as to generate high-quality pseudo-masks. However, region mining models tend to highlight only the most discriminative regions rather than entire objects. We tackle this problem from a novel perspective: the optimization process. We propose a Splitting vs. Merging optimization strategy, which is mainly composed of a Discrepancy loss and an Intersection loss. The Discrepancy loss aims at mining regions of different spatial patterns instead of only the most discriminative region, which leads to the splitting effect. The Intersection loss aims at mining the common regions of the different maps, which leads to the merging effect. Together, the Splitting vs. Merging strategy helps expand the output heatmap of the region mining model to the object scale. Finally, by training the segmentation model with the masks generated by our Splitting vs. Merging strategy, we achieve state-of-the-art weakly-supervised segmentation results on the Pascal VOC 2012 benchmark.

Original language: English
Title of host publication: Computer Vision – ECCV 2020
Subtitle of host publication: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXI
Editors: Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
Place of Publication: Cham, Switzerland
Number of pages: 17
ISBN (Electronic): 9783030585426
ISBN (Print): 9783030585419
Publication status: Published - 2020
Event: European Conference on Computer Vision 2020 - Glasgow, United Kingdom
Duration: 23 Aug 2020 → 28 Aug 2020
Conference number: 16th
https://link.springer.com/book/10.1007/978-3-030-58452-8 (Proceedings)
https://eccv2020.eu (Website)

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: European Conference on Computer Vision 2020
Abbreviated title: ECCV 2020
Country/Territory: United Kingdom


  • Deep Convolutional Neural Network (DCNN)
  • Semantic segmentation
  • Weakly-supervised learning
