TY - JOUR
T1 - Modality-oriented graph learning toward outfit compatibility modeling
AU - Song, Xuemeng
AU - Fang, Shi-Ting
AU - Chen, Xiaolin
AU - Wei, Yinwei
AU - Zhao, Zhongzhou
AU - Nie, Liqiang
N1 - Funding Information:
This work was supported in part by the Shandong Provincial Natural Science Foundation under Grant ZR2019JQ23, in part by the Key Research and Development Program of Shandong (Major scientific and technological innovation projects) under Grant 2020CXGC010111, in part by the National Natural Science Foundation of China under Grant U1936203, in part by the Young Creative Team in Universities of Shandong Province under Grant 2020KJN012, and in part by Alibaba Group through Alibaba Innovative Research Program.
Publisher Copyright:
© 1999-2012 IEEE.
PY - 2023/12/9
Y1 - 2023/12/9
N2 - Outfit compatibility modeling, which aims to automatically evaluate the matching degree of an outfit, has drawn great research attention. Toward a comprehensive evaluation, several previous studies have attempted to solve the task of outfit compatibility modeling by integrating the multi-modal information of fashion items. However, these methods primarily focus on fusing the visual and textual modalities, and seldom consider the category modality, which is also essential. In addition, they mainly explore the intra-modal compatibility relation among fashion items in an outfit but ignore the importance of the inter-modal compatibility relation, i.e., the compatibility across different modalities between fashion items. Since each modality of an item can convey both characteristics shared with other modalities and certain features exclusive to itself, overlooking the inter-modal compatibility could yield sub-optimal performance. To address these issues, a multi-modal outfit compatibility modeling scheme with modality-oriented graph learning is proposed, dubbed MOCM-MGL, which takes the visual, textual, and category modalities as input and jointly propagates the intra-modal and inter-modal compatibilities among fashion items. Experimental results on the real-world Polyvore Outfits-ND and Polyvore Outfits-D datasets have demonstrated the superiority of our proposed model over existing methods.
AB - Outfit compatibility modeling, which aims to automatically evaluate the matching degree of an outfit, has drawn great research attention. Toward a comprehensive evaluation, several previous studies have attempted to solve the task of outfit compatibility modeling by integrating the multi-modal information of fashion items. However, these methods primarily focus on fusing the visual and textual modalities, and seldom consider the category modality, which is also essential. In addition, they mainly explore the intra-modal compatibility relation among fashion items in an outfit but ignore the importance of the inter-modal compatibility relation, i.e., the compatibility across different modalities between fashion items. Since each modality of an item can convey both characteristics shared with other modalities and certain features exclusive to itself, overlooking the inter-modal compatibility could yield sub-optimal performance. To address these issues, a multi-modal outfit compatibility modeling scheme with modality-oriented graph learning is proposed, dubbed MOCM-MGL, which takes the visual, textual, and category modalities as input and jointly propagates the intra-modal and inter-modal compatibilities among fashion items. Experimental results on the real-world Polyvore Outfits-ND and Polyvore Outfits-D datasets have demonstrated the superiority of our proposed model over existing methods.
KW - Graph convolutional network
KW - multi-modal recommendation
KW - outfit compatibility modeling
UR - http://www.scopus.com/inward/record.url?scp=85121366043&partnerID=8YFLogxK
U2 - 10.1109/TMM.2021.3134164
DO - 10.1109/TMM.2021.3134164
M3 - Article
AN - SCOPUS:85121366043
SN - 1520-9210
VL - 25
SP - 856
EP - 867
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -