SIR-GN: A Fast Structural Iterative Representation Learning Approach for Graph Nodes

Document Type

Article

Publication Date

6-2021

Abstract

Graph representation learning methods have attracted an increasing amount of attention in recent years. These methods focus on learning a numerical representation of the nodes in a graph. Learning these representations is a powerful instrument for tasks such as graph mining, visualization, and hashing. They are of particular interest because they facilitate the direct use of standard machine learning models on graphs. Graph representation learning methods can be divided into two main categories: methods preserving the connectivity information of the nodes and methods preserving nodes’ structural information. Connectivity-based methods focus on encoding relationships between nodes, with connected nodes being closer together in the resulting latent space. While methods preserving structure generate a latent space where nodes serving a similar structural function in the network are encoded close to each other, independently of them being connected or even close to each other in the graph. While there are a lot of works that focus on preserving node connectivity, only a few works focus on preserving nodes’ structure. Properly encoding nodes’ structural information is fundamental for many real-world applications as it has been demonstrated that this information can be leveraged to successfully solve many tasks where connectivity-based methods usually fail. A typical example is the task of node classification, i.e., the assignment or prediction of a particular label for a node. Current limitations of structural representation methods are their scalability, representation meaning, and no formal proof that guaranteed the preservation of structural properties. We propose a new graph representation learning method, called Structural Iterative Representation learning approach for Graph Nodes (SIR-GN). In this work, we propose two variations (SIR-GN: GMM and SIR-GN: K-Means) and show how our best variation SIR-GN: K-Means: (1) theoretically guarantees the preservation of graph structural similarities, (2) provides a clear meaning about its representation and a way to interpret it with a specifically designed attribution procedure, and (3) is scalable and fast to compute. In addition, from our experiment, we show that SIR-GN: K-Means is often better or, in the worst-case comparable than the existing structural graph representation learning methods present in the literature. Also, we empirically show its superior scalability and computational performance when compared to other existing approaches.

Share

COinS