MetaGraph2Vec: complex semantic path augmented heterogeneous network embedding

Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

107 Citations (Scopus)

Abstract

Network embedding in heterogeneous information networks (HINs) is a challenging task because of the complications introduced by different node types and rich relationships between nodes. As a result, conventional network embedding techniques cannot work on such HINs. Recently, metapath-based approaches have been proposed to characterize relationships in HINs, but they are ineffective in capturing rich contexts and semantics between nodes for embedding learning, mainly because (1) a metapath is a rather strict single-path node-node relationship descriptor, which is unable to accommodate variance in relationships, and (2) only a small portion of paths can match the metapath, resulting in sparse context information for embedding learning. In this paper, we advocate a new metagraph concept to capture richer structural contexts and semantics between distant nodes. A metagraph contains multiple paths between nodes, each describing one type of relationship, so the augmentation of multiple metapaths provides an effective way to capture rich contexts and semantic relations between nodes. This greatly boosts the ability of metapath-based embedding techniques to handle very sparse HINs. We propose a new embedding learning algorithm, namely MetaGraph2Vec, which uses metagraphs to guide the generation of random walks and to learn latent embeddings of multi-typed HIN nodes. Experimental results show that MetaGraph2Vec is able to outperform the state-of-the-art baselines in various heterogeneous network mining tasks such as node classification, node clustering, and similarity search.
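
The idea of a metagraph-guided random walk can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy network, the node names, the layered metagraph A → P → {A, V} → P → A (author, paper, venue), and the uniform choice among type-compatible neighbors. The point is only that a metagraph allows more than one node type at some steps, so a walk can continue where a single metapath would find no match.

```python
import random

# Hypothetical toy HIN: node name -> node type (A = author, P = paper, V = venue).
NODE_TYPE = {
    "a1": "A", "a2": "A", "a3": "A",
    "p1": "P", "p2": "P", "p3": "P",
    "v1": "V",
}

# Hypothetical undirected edges: authorship (A-P) and publication (P-V).
EDGES = [
    ("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2"), ("a3", "p3"),
    ("p1", "v1"), ("p2", "v1"), ("p3", "v1"),
]

# Build adjacency lists from the edge list.
ADJ = {node: [] for node in NODE_TYPE}
for u, v in EDGES:
    ADJ[u].append(v)
    ADJ[v].append(u)

# Assumed metagraph, written as layers of allowed node types.
# The third layer admits two types, which is what distinguishes it from a metapath.
# After the last layer the pattern wraps around to the first one.
LAYERS = [{"A"}, {"P"}, {"A", "V"}, {"P"}]


def metagraph_walk(start, length, rng):
    """Generate one walk of at most `length` nodes that follows LAYERS.

    Simplification: the next node is drawn uniformly from all neighbors
    whose type is allowed by the next layer.
    """
    if NODE_TYPE[start] not in LAYERS[0]:
        raise ValueError("start node must match the first metagraph layer")
    walk = [start]
    layer = 0
    while len(walk) < length:
        next_layer = (layer + 1) % len(LAYERS)
        candidates = [n for n in ADJ[walk[-1]] if NODE_TYPE[n] in LAYERS[next_layer]]
        if not candidates:
            break  # no type-compatible neighbor: stop the walk early
        walk.append(rng.choice(candidates))
        layer = next_layer
    return walk
```

Calling `metagraph_walk("a1", 9, random.Random(0))` yields a sequence of node names whose types follow the layer pattern. Walks of this kind would then serve as the "sentences" for a skip-gram style embedding model, a step this sketch leaves out.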

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, VIC, Australia, June 3–6, 2018, Proceedings, Part II
Editors: Dinh Phung, Vincent S. Tseng, Geoffrey I. Webb, Bao Ho, Mohadeseh Ganji, Lida Rashidi
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 196-208
Number of pages: 13
ISBN (Electronic): 9783319930374
ISBN (Print): 9783319930367
Publication status: Published - 2018
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018 - Grand Hyatt, Melbourne, Australia
Duration: 3 Jun 2018 to 6 Jun 2018
Conference number: 22nd
http://pakdd2018.medmeeting.org/Content/92892
https://link.springer.com/book/10.1007/978-3-319-93034-3 (Proceedings)

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 10938
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018
Abbreviated title: PAKDD 2018
Country/Territory: Australia
City: Melbourne
Period: 3/06/18 to 6/06/18