Inductive Representation Learning on Large Graphs

By William L. Hamilton et al.

Table of Contents

Abstract
1 Introduction
Low-dimensional vector embeddings of nodes in large graphs have proved extremely useful as feature inputs for a wide variety of prediction and graph analysis tasks.
2 Related work
Our algorithm is conceptually related to previous node embedding approaches, general supervised approaches to learning over graphs, and recent advancements in applying convolutional neural networks to graph-structured data.
3 Proposed method: GraphSAGE
The key idea behind our approach is that we learn how to aggregate feature information from a node’s local neighborhood (e.g., the degrees or text attributes of nearby nodes).
3.1 Embedding generation (i.e., forward propagation) algorithm
In this section, we describe the embedding generation, or forward propagation, algorithm (Algorithm 1), which assumes that the model has already been trained and that the parameters are fixed.
3.2 Learning the parameters of GraphSAGE
In order to learn useful, predictive representations in a fully unsupervised setting, we apply a graph-based loss function to the output representations, $\mathbf{z}_u, \forall u \in \mathcal{V}$, and tune the weight matrices, $\mathbf{W}^k, \forall k \in \{1, \ldots, K\}$, and parameters of the aggregator functions via stochastic gradient descent.
3.3 Aggregator Architectures
Unlike machine learning over N-D lattices (e.g., sentences, images, or 3-D volumes), a node’s neighbors have no natural ordering; thus, the aggregator functions in Algorithm 1 must operate over an unordered set of vectors.
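
The requirement that an aggregator operate over an unordered set can be illustrated with a small NumPy sketch. It assumes a plain element-wise mean and a simplified max-pooling aggregator with one shared ReLU layer; the names `mean_aggregate`, `max_pool_aggregate`, `W_pool`, and `b_pool` are hypothetical and are not taken from the paper's code. The point is only that both functions return the same vector when the neighbor order is shuffled.

```python
import numpy as np

def mean_aggregate(neighbor_feats):
    """Element-wise mean over the rows (one row per neighbor)."""
    return neighbor_feats.mean(axis=0)

def max_pool_aggregate(neighbor_feats, W_pool, b_pool):
    """Apply a shared one-layer ReLU transform to every neighbor,
    then take an element-wise max across neighbors."""
    transformed = np.maximum(neighbor_feats @ W_pool + b_pool, 0.0)
    return transformed.max(axis=0)

# Illustrative data: 5 neighbors with 4 features each (assumed sizes).
rng = np.random.default_rng(0)
neighbors = rng.normal(size=(5, 4))
W_pool = rng.normal(size=(4, 3))
b_pool = rng.normal(size=3)

# Reorder the neighbors; a symmetric aggregator should not care.
shuffled = neighbors[rng.permutation(5)]

mean_same = np.allclose(mean_aggregate(neighbors), mean_aggregate(shuffled))
pool_same = np.allclose(
    max_pool_aggregate(neighbors, W_pool, b_pool),
    max_pool_aggregate(shuffled, W_pool, b_pool),
)
print(mean_same, pool_same)
```

A sequence model such as an LSTM does not have this property by construction, since its output depends on the order in which the neighbors are fed to it.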

Summary

Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. GraphSAGE is a general inductive framework that leverages node feature information to efficiently generate node embeddings for previously unseen data. Instead of training a separate embedding for each node, it learns a function that generates embeddings by sampling and aggregating features from a node’s local neighborhood. The inductive setting is challenging because the framework must learn to recognize structural properties of a node’s neighborhood in order to generalize to unseen nodes. The approach is inspired by classic algorithms for testing graph isomorphism, and different aggregator architectures are explored, including a mean aggregator, an LSTM aggregator, and a pooling aggregator. The algorithm outperforms strong baselines on three inductive node-classification benchmarks.
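
The forward pass summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a mean aggregator, a ReLU nonlinearity, and the full neighborhood of each node rather than a fixed-size sample, and the function name `graphsage_forward`, the dictionary-based `adjacency` format, and the random weights are all hypothetical.

```python
import numpy as np

def graphsage_forward(features, adjacency, weights):
    """Sketch of GraphSAGE-style embedding generation with a mean aggregator.

    features : (num_nodes, d0) array of input node features
    adjacency: dict mapping node index -> list of neighbor indices (assumed format)
    weights  : list of K matrices; layer k maps 2 * d_k inputs to d_{k+1} outputs
    """
    h = features
    for W in weights:
        h_next = np.zeros((h.shape[0], W.shape[1]))
        for v in range(h.shape[0]):
            nbrs = adjacency[v]
            # Aggregate the previous-layer representations of v's neighbors
            # (zero vector if v has no neighbors).
            h_nbrs = h[nbrs].mean(axis=0) if nbrs else np.zeros(h.shape[1])
            # Concatenate v's own representation with the aggregate,
            # then apply the layer's weight matrix and a ReLU.
            combined = np.concatenate([h[v], h_nbrs])
            h_next[v] = np.maximum(combined @ W, 0.0)
        # Rescale each row to unit L2 norm, guarding against all-zero rows.
        norms = np.linalg.norm(h_next, axis=1, keepdims=True)
        h = h_next / np.maximum(norms, 1e-12)
    return h

# Toy example: a 4-node path graph, 3 input features, K = 2 layers.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
weights = [rng.normal(size=(6, 5)), rng.normal(size=(10, 2))]

z = graphsage_forward(features, adjacency, weights)
print(z.shape)
```

Because the weights are shared across nodes and depend only on feature dimensions, the same function can be called on a graph that was never seen when the weights were fitted, which is the sense in which the approach is inductive.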