TY - JOUR
T1 - Search efficient binary network embedding
AU - Zhang, Daokun
AU - Yin, Jie
AU - Zhu, Xingquan
AU - Zhang, Chengqi
N1 - Funding Information:
This work is supported by a joint CRP research fund between the University of Sydney and Data61, CSIRO, the US National Science Foundation (NSF) through grants IIS-1763452 and CNS-1828181, and the Australian Research Council (ARC) through grants LP160100630 and DP180100966. Authors’ addresses: D. Zhang and J. Yin, Discipline of Business Analytics, The University of Sydney, City Road, Camperdown, NSW 2006, Australia; emails: {daokun.zhang, jie.yin}@sydney.edu.au; X. Zhu, Department of Computer & Electrical Engineering and Computer Science, Florida Atlantic University, Glades Road, Boca Raton, FL 33431; email: [email protected]; C. Zhang, Australian Artificial Intelligence Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway, Ultimo, NSW 2007, Australia; email: [email protected].
Publisher Copyright:
© 2021 ACM.
PY - 2021/6
Y1 - 2021/6
N2 - Traditional network embedding primarily focuses on learning a continuous vector representation for each node, preserving network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily applied to the vector-format node representations for network analysis. However, the learned continuous vector representations are inefficient for large-scale similarity search, which often involves finding nearest neighbors measured by distance or similarity in a continuous vector space. In this article, we propose a search-efficient binary network embedding algorithm called BinaryNE to learn a binary code for each node, by simultaneously modeling node context relations and node attribute relations through a three-layer neural network. BinaryNE learns binary node representations using a stochastic gradient descent-based online learning algorithm. The learned binary encoding not only reduces the memory usage required to represent each node, but also allows fast bit-wise comparisons that support faster node similarity search than Euclidean or other distance measures. Extensive experiments and comparisons demonstrate that BinaryNE not only delivers more than 25 times faster search speed, but also provides comparable or better search quality than traditional continuous vector-based network embedding methods. The binary codes learned by BinaryNE also achieve competitive performance on node classification and node clustering tasks. The source code of the BinaryNE algorithm is available at https://github.com/daokunzhang/BinaryNE.
AB - Traditional network embedding primarily focuses on learning a continuous vector representation for each node, preserving network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily applied to the vector-format node representations for network analysis. However, the learned continuous vector representations are inefficient for large-scale similarity search, which often involves finding nearest neighbors measured by distance or similarity in a continuous vector space. In this article, we propose a search-efficient binary network embedding algorithm called BinaryNE to learn a binary code for each node, by simultaneously modeling node context relations and node attribute relations through a three-layer neural network. BinaryNE learns binary node representations using a stochastic gradient descent-based online learning algorithm. The learned binary encoding not only reduces the memory usage required to represent each node, but also allows fast bit-wise comparisons that support faster node similarity search than Euclidean or other distance measures. Extensive experiments and comparisons demonstrate that BinaryNE not only delivers more than 25 times faster search speed, but also provides comparable or better search quality than traditional continuous vector-based network embedding methods. The binary codes learned by BinaryNE also achieve competitive performance on node classification and node clustering tasks. The source code of the BinaryNE algorithm is available at https://github.com/daokunzhang/BinaryNE.
KW - binary coding
KW - efficiency
KW - Network embedding
KW - similarity search
UR - http://www.scopus.com/inward/record.url?scp=85108408379&partnerID=8YFLogxK
U2 - 10.1145/3436892
DO - 10.1145/3436892
M3 - Article
AN - SCOPUS:85108408379
SN - 1556-4681
VL - 15
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 4
M1 - 3436892
ER -