Keypoint based weakly supervised human parsing

Zhonghua Wu, Guosheng Lin, Jianfei Cai

Research output: Contribution to journal > Article > Research > peer-review

8 Citations (Scopus)


Fully convolutional networks (FCNs) have achieved great success in human parsing in recent years. Conventional human parsing requires pixel-level labels to guide training, which usually involves enormous human labeling effort. To ease this effort, we propose a novel weakly supervised human parsing method that requires only simple object keypoint annotations for learning. We develop an iterative learning method to generate pseudo part segmentation masks from keypoint labels. With these pseudo masks, we train an FCN to output pixel-level human parsing predictions. Furthermore, we develop a correlation network to jointly predict part and object segmentation masks and improve segmentation performance. The experimental results show that our weakly supervised method achieves very competitive human parsing results. Although our method uses only simple keypoint annotations for learning, it achieves performance comparable to fully supervised methods that use expensive pixel-level annotations.
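As a rough illustration of how keypoint labels could seed a pseudo part mask, the sketch below labels every pixel that lies within a fixed radius of a "limb" segment joining two keypoints. This is only an assumed starting point, not the authors' procedure: the abstract does not describe the iterative refinement or the correlation network in detail, and the function names, the `limbs` mapping, and the `radius` parameter are all hypothetical.

```python
def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to the segment from a to b."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: both keypoints coincide
    else:
        # Project the point onto the segment and clamp to its endpoints.
        t = ((px - ax) * dx + (py - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5


def skeleton_pseudo_mask(height, width, keypoints, limbs, radius=2.0):
    """Build a height x width label map from keypoints (hypothetical helper).

    keypoints: dict mapping a keypoint name to (x, y) pixel coordinates.
    limbs:     dict mapping a part id (>= 1) to a pair of keypoint names.
    Pixels within `radius` of a limb segment receive that part id;
    all other pixels stay 0 (background / unlabeled).
    """
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_label, best_dist = 0, radius
            for part_id, (name_a, name_b) in limbs.items():
                if name_a not in keypoints or name_b not in keypoints:
                    continue  # skip limbs with a missing keypoint
                ax, ay = keypoints[name_a]
                bx, by = keypoints[name_b]
                d = point_segment_distance(x, y, ax, ay, bx, by)
                if d <= best_dist:
                    best_label, best_dist = part_id, d
            mask[y][x] = best_label
    return mask
```

In a weakly supervised pipeline of the kind the abstract outlines, a coarse mask like this would presumably serve only as an initial training target, to be refined over later iterations by the network's own predictions.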

Original language: English
Article number: 0103801
Number of pages: 10
Journal: Image and Vision Computing
Publication status: Published - Nov 2019


  • Correlation network
  • Human parsing
  • Iterative refinement
  • Keypoint
  • Skeleton
  • Weakly supervised
