Hierarchical User Intent Graph Network for multimedia recommendation

Yinwei Wei, Xiang Wang, Xiangnan He, Liqiang Nie, Yong Rui, Tat-Seng Chua

Research output: Contribution to journal › Article › Research › peer-review

35 Citations (Scopus)


Understanding user preference on item context is key to achieving high-quality multimedia recommendation. Typically, the pre-existing features of items are derived from pre-trained models (e.g., visual features of micro-videos extracted from neural networks) and then introduced into a recommendation framework (e.g., collaborative filtering) to capture user preference. However, we argue that this paradigm is insufficient to produce satisfactory user representations, since the resulting representations hardly profile personal interests well. The key reason is that existing works largely leave user intents untouched and therefore fail to encode this informative signal into the user representations. In this work, we aim to learn multi-level user intents from the co-interaction patterns of items, so as to obtain high-quality representations of users and items and further enhance recommendation performance. To this end, we develop a novel framework, the Hierarchical User Intent Graph Network, which organizes user intents in a hierarchical graph structure, from fine-grained to coarse-grained intents. In particular, we obtain the multi-level user intents by recursively performing two operations: 1) intra-level aggregation, which distills the signal pertinent to user intents from co-interacted item graphs; and 2) inter-level aggregation, which forms supernodes at higher levels to model coarser-grained user intents by gathering the representations of nodes at the lower levels. We then refine the user and item representations as distributions over the discovered intents, rather than relying on simple pre-existing features. To demonstrate the effectiveness of our model, we conducted extensive experiments on three public datasets. Our model achieves significant improvements over state-of-the-art methods, including MMGCN and DisenGCN. Furthermore, by visualizing the item representations, we illustrate the semantics of the user intents.
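The two operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes mean aggregation over neighbours for the intra-level step and a fixed softmax assignment of nodes to supernodes for the inter-level step, whereas a real model would presumably learn both. The function names, the toy features, the neighbour lists, and the affinity scores are all hypothetical.

```python
import math

def intra_level(features, neighbors):
    """Assumed intra-level step: average each item with its co-interacted neighbours."""
    refined = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in neighbors[i]]
        refined.append([sum(col) / len(group) for col in zip(*group)])
    return refined

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def inter_level(features, scores):
    """Assumed inter-level step: pool nodes into supernodes.

    scores[i][k] is a hypothetical affinity of node i to supernode k;
    each supernode is the assignment-weighted sum of node representations.
    """
    assign = [softmax(row) for row in scores]
    n_super, dim = len(scores[0]), len(features[0])
    supernodes = [[0.0] * dim for _ in range(n_super)]
    for i, feat in enumerate(features):
        for k in range(n_super):
            for d in range(dim):
                supernodes[k][d] += assign[i][k] * feat[d]
    return supernodes, assign

# Toy co-interaction graph: four items on a chain 0-1-2-3 (hypothetical data).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
neighbors = [[1], [0, 2], [1, 3], [2]]
# Hypothetical affinities: items 0 and 1 lean to supernode 0, items 2 and 3 to supernode 1.
scores = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]

refined = intra_level(features, neighbors)
supernodes, assign = inter_level(refined, scores)
print(supernodes)
```

Repeating the two steps on the supernodes, together with a correspondingly coarsened graph, would give progressively coarser levels in the spirit of the hierarchy described above; how the coarsened graph is built is left out of this sketch.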

Original language: English
Pages (from-to): 2701-2712
Number of pages: 12
Journal: IEEE Transactions on Multimedia
Publication status: Published - 11 Jun 2022
Externally published: Yes


  • Graph convolution network
  • hierarchical graph structure
  • multimedia recommendation
  • user intention modeling
