Contextual ensemble network for semantic segmentation

Quan Zhou, Xiaofu Wu, Suofei Zhang, Bin Kang, Zongyuan Ge, Longin Jan Latecki

Research output: Contribution to journal › Article › Research › peer-review

25 Citations (Scopus)


Exploiting features from different layers of fully convolutional networks (FCNs) to capture context information has recently gained substantial attention in semantic segmentation. This paper presents a novel encoder-decoder architecture for semantic segmentation, called the contextual ensemble network (CENet), in which contextual cues are aggregated by densely upsampling the convolutional features of deep layers to the shallow deconvolutional layers. CENet is trained end-to-end to produce segmentations at the resolution of the input image, and it allows contextual features to be fully explored through an ensemble of dense deconvolutions. We evaluate CENet on two widely used semantic segmentation datasets, PASCAL VOC 2012 and Cityscapes. The experimental results demonstrate that CENet achieves superior performance compared with recent state-of-the-art methods. We also evaluate CENet on the MS COCO and ISBI 2012 datasets for instance segmentation and biological segmentation, respectively, where it obtains promising results.
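The dense aggregation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes single-channel feature maps, uses nearest-neighbour upsampling as a stand-in for learned deconvolution, and fuses maps by element-wise summation. The function names (`upsample_nearest`, `add_maps`, `dense_decoder`) and the toy three-level pyramid are hypothetical, chosen only to show every shallow stage receiving upsampled features from all deeper layers.

```python
# Illustrative sketch only: not the CENet implementation from the paper.
# Assumptions: single-channel feature maps, nearest-neighbour upsampling in
# place of learned deconvolution, element-wise summation as the fusion step.

def upsample_nearest(fmap, factor):
    """Upsample a 2-D feature map (list of rows) by an integer factor."""
    out = []
    for row in fmap:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2-D maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def dense_decoder(encoder_feats):
    """Fuse each level with upsampled copies of ALL deeper levels.

    encoder_feats is ordered shallow to deep; each level is assumed to have
    half the spatial size of the previous one.
    """
    decoder_feats = []
    for i, shallow in enumerate(encoder_feats):
        fused = shallow
        for j in range(i + 1, len(encoder_feats)):
            factor = 2 ** (j - i)  # resolution gap between level j and level i
            fused = add_maps(fused, upsample_nearest(encoder_feats[j], factor))
        decoder_feats.append(fused)
    return decoder_feats


# Hypothetical 3-level pyramid with spatial sizes 4x4, 2x2 and 1x1.
feats = [
    [[1] * 4 for _ in range(4)],
    [[10] * 2 for _ in range(2)],
    [[100]],
]
decoded = dense_decoder(feats)
```

In this toy setting the shallowest output combines all three levels at full resolution, which mirrors the goal of matching the input image resolution. A real network would replace the fixed upsampling and summation with learned deconvolutions and a learned fusion.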

Original language: English
Article number: 108290
Number of pages: 11
Journal: Pattern Recognition
Publication status: Published - Feb 2022


  • Context aggregation
  • Encoder-decoder networks
  • Ensemble deconvolution
  • FCNs
  • Semantic segmentation
