Abstract
In this paper, we investigate improving clothing parsing using a superpixel feature extractor network (SP-FEN). Clothing parsing with a fully convolutional network has two parts: an encoder and a decoder. The encoder lowers the dimensionality and produces a low-resolution prediction, while the decoder upscales the prediction back to the size of the input image. Typically, fine-grained details lost in the encoding part of the model are not recovered well in the decoder. To mitigate this, skip connections are commonly used to recover and add fine-grained details to the final prediction. We propose a new method that introduces superpixel features into the decoder through a side network (SP-FEN), which extracts features from a superpixel representation of the input image computed with the SLIC algorithm. SP-FEN produces meaningful superpixel features that are injected into the decoder, learning to select which features to feed in so as to boost the overall output quality. The proposed method improves MIoU on the refined Fashionista V1.0 dataset and the CFPD dataset. The results show that the proposed approach achieves superior performance on pixel-wise segmentation and clothing parsing.
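The abstract does not give implementation details, but the core idea of a superpixel feature representation can be sketched in a few lines. Below is a minimal, hypothetical NumPy illustration: given a superpixel label map (the kind SLIC produces) and a per-pixel feature map, each pixel's features are replaced by the mean over its superpixel, yielding the sort of region-level features a side network like SP-FEN could consume. The function name and toy data are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def superpixel_pool(features, labels):
    """Average feature vectors within each superpixel and broadcast the
    mean back to every pixel of that superpixel.

    Hypothetical illustration only; the paper's SP-FEN is a learned
    network, not a fixed mean-pooling step.
    """
    out = np.zeros_like(features)
    for sp in np.unique(labels):
        mask = labels == sp          # pixels belonging to superpixel `sp`
        out[mask] = features[mask].mean(axis=0)
    return out

# Toy 2x2 image with 2-channel features and two superpixels (rows).
feats = np.array([[[1., 0.], [3., 0.]],
                  [[0., 2.], [0., 4.]]])
labels = np.array([[0, 0],
                   [1, 1]])
pooled = superpixel_pool(feats, labels)
# Superpixel 0 mean is [2, 0]; superpixel 1 mean is [0, 3].
```

In practice the label map would come from `skimage.segmentation.slic`, and the pooled representation would be passed through learned layers before being injected into the decoder.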
Original language | English |
---|---|
Pages (from-to) | 2245-2263 |
Number of pages | 19 |
Journal | Neural Processing Letters |
Volume | 51 |
Issue number | 3 |
DOIs | |
Publication status | Published - Jun 2020 |
Externally published | Yes |
Keywords
- Clothing parsing
- Deep convolutional networks
- Scene parsing
- Semantic segmentation
- Superpixels