Convolutional Layers
Many different types of graph convolutional layers have been proposed in the literature. Choosing the right layer for your application may require some exploration. Two of the most commonly used layers are GCNConv and GATv2Conv. Multiple graph convolutional layers are typically stacked together to create a graph neural network model (see GNNChain).
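For instance, a minimal sketch of a small model built with GNNChain (layer types and feature sizes chosen arbitrarily for illustration):
# a two-layer GCN followed by a dense readout
g = rand_graph(10, 40)                      # 10 nodes, 40 edges
x = randn(Float32, 3, g.num_nodes)          # 3 input features per node
model = GNNChain(GCNConv(3 => 8, relu),
                 GCNConv(8 => 8, relu),
                 Dense(8, 2))
y = model(g, x)                             # size: 2 × num_nodes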
The table below lists all graph convolutional layers implemented in GraphNeuralNetworks.jl. It also highlights the presence of some additional capabilities with respect to basic message passing:
- Sparse Ops: implements message passing as multiplication by a sparse adjacency matrix instead of the gather/scatter mechanism. This can lead to better CPU performance, but it is not yet supported on GPU.
- Edge Weight: supports scalar weights (or equivalently scalar features) on edges.
- Edge Features: supports feature vectors on edges.
- Heterograph: supports heterogeneous graphs (see GNNHeteroGraph).
- TemporalSnapshotsGNNGraphs: supports temporal graphs (see TemporalSnapshotsGNNGraph) by applying the convolution layers to each snapshot independently.
| Layer | Sparse Ops | Edge Weight | Edge Features | Heterograph | TemporalSnapshotsGNNGraphs |
|---|---|---|---|---|---|
| AGNNConv | | | ✓ | | |
| CGConv | | | ✓ | ✓ | ✓ |
| ChebConv | | | | | ✓ |
| EGNNConv | | | ✓ | | |
| EdgeConv | | | | | ✓ |
| GATConv | | | ✓ | ✓ | ✓ |
| GATv2Conv | | | ✓ | ✓ | ✓ |
| GatedGraphConv | ✓ | | | | ✓ |
| GCNConv | ✓ | ✓ | | | ✓ |
| GINConv | ✓ | | | ✓ | ✓ |
| GMMConv | | | ✓ | | |
| GraphConv | ✓ | | | ✓ | ✓ |
| MEGNetConv | | | ✓ | | |
| NNConv | | | ✓ | | |
| ResGatedGraphConv | | | | ✓ | ✓ |
| SAGEConv | ✓ | | | ✓ | ✓ |
| SGConv | ✓ | | | | ✓ |
| TransformerConv | | | ✓ | | |
Docs
GraphNeuralNetworks.AGNNConv
— Type
AGNNConv(; init_beta=1.0f0, trainable=true, add_self_loops=true)
Attention-based Graph Neural Network layer from paper Attention-based Graph Neural Network for Semi-Supervised Learning.
The forward pass is given by
\[\mathbf{x}_i' = \sum_{j \in N(i)} \alpha_{ij} \mathbf{x}_j\]
where the attention coefficients $\alpha_{ij}$ are given by
\[\alpha_{ij} =\frac{e^{\beta \cos(\mathbf{x}_i, \mathbf{x}_j)}} {\sum_{j'}e^{\beta \cos(\mathbf{x}_i, \mathbf{x}_{j'})}}\]
with the cosine distance defined by
\[\cos(\mathbf{x}_i, \mathbf{x}_j) = \frac{\mathbf{x}_i \cdot \mathbf{x}_j}{\lVert\mathbf{x}_i\rVert \lVert\mathbf{x}_j\rVert}\]
and $\beta$ a trainable parameter if trainable=true.
Arguments
- init_beta: The initial value of $\beta$. Default 1.0f0.
- trainable: If true, $\beta$ is trainable. Default true.
- add_self_loops: Add self loops to the graph before performing the convolution. Default true.
Examples:
# create data
s = [1,1,2,3]
t = [2,3,1,1]
g = GNNGraph(s, t)
x = randn(Float32, 4, g.num_nodes)  # node features (the layer preserves the feature dimension)
# create layer
l = AGNNConv(init_beta=2.0f0)
# forward pass
y = l(g, x)
GraphNeuralNetworks.CGConv
— Type
CGConv((in, ein) => out, act=identity; bias=true, init=glorot_uniform, residual=false)
CGConv(in => out, ...)
The crystal graph convolutional layer from the paper Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties. Performs the operation
\[\mathbf{x}_i' = \mathbf{x}_i + \sum_{j\in N(i)}\sigma(W_f \mathbf{z}_{ij} + \mathbf{b}_f)\, act(W_s \mathbf{z}_{ij} + \mathbf{b}_s)\]
where $\mathbf{z}_{ij}$ is the node and edge features concatenation $[\mathbf{x}_i; \mathbf{x}_j; \mathbf{e}_{j\to i}]$ and $\sigma$ is the sigmoid function. The residual $\mathbf{x}_i$ is added only if residual=true
and the output size is the same as the input size.
Arguments
- in: The dimension of input node features.
- ein: The dimension of input edge features. If ein is not given, assumes that no edge features are passed as input in the forward pass.
- out: The dimension of output node features.
- act: Activation function.
- bias: Add learnable bias.
- init: Weights' initializer.
- residual: Add a residual connection.
Examples
g = rand_graph(5, 6)
x = rand(Float32, 2, g.num_nodes)
e = rand(Float32, 3, g.num_edges)
l = CGConv((2, 3) => 4, tanh)
y = l(g, x, e) # size: (4, num_nodes)
# No edge features
l = CGConv(2 => 4, tanh)
y = l(g, x) # size: (4, num_nodes)
GraphNeuralNetworks.ChebConv
— Type
ChebConv(in => out, k; bias=true, init=glorot_uniform)
Chebyshev spectral graph convolutional layer from paper Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.
Implements
\[X' = \sum^{K-1}_{k=0} W^{(k)} Z^{(k)}\]
where $Z^{(k)}$ is the $k$-th term of Chebyshev polynomials, and can be calculated by the following recursive form:
\[\begin{aligned} Z^{(0)} &= X \\ Z^{(1)} &= \hat{L} X \\ Z^{(k)} &= 2 \hat{L} Z^{(k-1)} - Z^{(k-2)} \end{aligned}\]
with $\hat{L}$ the scaled_laplacian.
Arguments
- in: The dimension of input features.
- out: The dimension of output features.
- k: The order of the Chebyshev polynomial.
- bias: Add learnable bias.
- init: Weights' initializer.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)
# create layer
l = ChebConv(3 => 5, 5)
# forward pass
y = l(g, x) # size: 5 × num_nodes
GraphNeuralNetworks.DConv
— Type
DConv(ch::Pair{Int, Int}, k::Int; init = glorot_uniform, bias = true)
Diffusion convolution layer from the paper Diffusion Convolutional Recurrent Neural Networks: Data-Driven Traffic Forecasting.
Arguments
- ch: Pair of input and output dimensions.
- k: Number of diffusion steps.
- init: Weights' initializer. Default glorot_uniform.
- bias: Add learnable bias. Default true.
Examples
julia> g = GNNGraph(rand(10, 10), ndata = rand(Float32, 2, 10));
julia> dconv = DConv(2 => 4, 4)
DConv(2 => 4, 4)
julia> y = dconv(g, g.ndata.x);
julia> size(y)
(4, 10)
GraphNeuralNetworks.EGNNConv
— Type
EGNNConv((in, ein) => out; hidden_size=2in, residual=false)
EGNNConv(in => out; hidden_size=2in, residual=false)
Equivariant Graph Convolutional Layer from E(n) Equivariant Graph Neural Networks.
The layer performs the following operation:
\[\begin{aligned} \mathbf{m}_{j\to i} &=\phi_e(\mathbf{h}_i, \mathbf{h}_j, \lVert\mathbf{x}_i-\mathbf{x}_j\rVert^2, \mathbf{e}_{j\to i}),\\ \mathbf{x}_i' &= \mathbf{x}_i + C_i\sum_{j\in\mathcal{N}(i)}(\mathbf{x}_i-\mathbf{x}_j)\phi_x(\mathbf{m}_{j\to i}),\\ \mathbf{m}_i &= C_i\sum_{j\in\mathcal{N}(i)} \mathbf{m}_{j\to i},\\ \mathbf{h}_i' &= \mathbf{h}_i + \phi_h(\mathbf{h}_i, \mathbf{m}_i) \end{aligned}\]
where $\mathbf{h}_i$, $\mathbf{x}_i$, $\mathbf{e}_{j\to i}$ are invariant node features, equivariant node features, and edge features respectively. $\phi_e$, $\phi_h$, and $\phi_x$ are two-layer MLPs. $C_i$ is a normalization constant, computed as $1/|\mathcal{N}(i)|$.
Constructor Arguments
- in: Number of input features for h.
- out: Number of output features for h.
- ein: Number of input edge features.
- hidden_size: Hidden representation size.
- residual: If true, add a residual connection. Only possible if in == out. Default false.
Forward Pass
l(g, x, h, e=nothing)
Forward Pass Arguments:
- g: The graph.
- x: Matrix of equivariant node coordinates.
- h: Matrix of invariant node features.
- e: Matrix of invariant edge features. Default nothing.
Returns updated h and x.
Examples
g = rand_graph(10, 10)
h = randn(Float32, 5, g.num_nodes)
x = randn(Float32, 3, g.num_nodes)
egnn = EGNNConv(5 => 6, 10)
hnew, xnew = egnn(g, h, x)
GraphNeuralNetworks.EdgeConv
— Type
EdgeConv(nn; aggr=max)
Edge convolutional layer from paper Dynamic Graph CNN for Learning on Point Clouds.
Performs the operation
\[\mathbf{x}_i' = \square_{j \in N(i)}\, nn([\mathbf{x}_i; \mathbf{x}_j - \mathbf{x}_i])\]
where nn
generally denotes a learnable function, e.g. a linear layer or a multi-layer perceptron.
Arguments
- nn: A (possibly learnable) function.
- aggr: Aggregation operator for the incoming messages (e.g. +, *, max, min, and mean).
Examples:
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
g = GNNGraph(s, t)
x = randn(Float32, in_channel, g.num_nodes)
# create layer
l = EdgeConv(Dense(2 * in_channel, out_channel), aggr = +)
# forward pass
y = l(g, x)
GraphNeuralNetworks.GATConv
— Type
GATConv(in => out, [σ; heads, concat, init, bias, negative_slope, add_self_loops])
GATConv((in, ein) => out, ...)
Graph attentional layer from the paper Graph Attention Networks.
Implements the operation
\[\mathbf{x}_i' = \sum_{j \in N(i) \cup \{i\}} \alpha_{ij} W \mathbf{x}_j\]
where the attention coefficients $\alpha_{ij}$ are given by
\[\alpha_{ij} = \frac{1}{z_i} \exp(LeakyReLU(\mathbf{a}^T [W \mathbf{x}_i; W \mathbf{x}_j]))\]
with $z_i$ a normalization factor.
In case ein > 0
is given, edge features of dimension ein
will be expected in the forward pass and the attention coefficients will be calculated as
\[\alpha_{ij} = \frac{1}{z_i} \exp(LeakyReLU(\mathbf{a}^T [W_e \mathbf{e}_{j\to i}; W \mathbf{x}_i; W \mathbf{x}_j]))\]
Arguments
- in: The dimension of input node features.
- ein: The dimension of input edge features. Default 0 (i.e. no edge features passed in the forward).
- out: The dimension of output node features.
- σ: Activation function. Default identity.
- bias: Learn the additive bias if true. Default true.
- heads: Number of attention heads. Default 1.
- concat: Concatenate layer output or not. If not, layer output is averaged over the heads. Default true.
- negative_slope: The parameter of LeakyReLU. Default 0.2.
- add_self_loops: Add self loops to the graph before performing the convolution. Default true.
- dropout: Dropout probability on the normalized attention coefficient. Default 0.0.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)
# create layer
l = GATConv(in_channel => out_channel, add_self_loops = false, bias = false; heads=2, concat=true)
# forward pass
y = l(g, x)
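If edge features are present, the layer can be constructed with the (in, ein) => out form, analogously to GATv2Conv below. A minimal sketch reusing the data above (edge feature size chosen arbitrarily):
# with edge features
ein = 3
l = GATConv((in_channel, ein) => out_channel, add_self_loops = false)
e = randn(Float32, ein, g.num_edges)
y = l(g, x, e)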
GraphNeuralNetworks.GATv2Conv
— Type
GATv2Conv(in => out, [σ; heads, concat, init, bias, negative_slope, add_self_loops])
GATv2Conv((in, ein) => out, ...)
GATv2 attentional layer from the paper How Attentive are Graph Attention Networks?.
Implements the operation
\[\mathbf{x}_i' = \sum_{j \in N(i) \cup \{i\}} \alpha_{ij} W_1 \mathbf{x}_j\]
where the attention coefficients $\alpha_{ij}$ are given by
\[\alpha_{ij} = \frac{1}{z_i} \exp(\mathbf{a}^T LeakyReLU(W_2 \mathbf{x}_i + W_1 \mathbf{x}_j))\]
with $z_i$ a normalization factor.
In case ein > 0
is given, edge features of dimension ein
will be expected in the forward pass and the attention coefficients will be calculated as
\[\alpha_{ij} = \frac{1}{z_i} \exp(\mathbf{a}^T LeakyReLU(W_3 \mathbf{e}_{j\to i} + W_2 \mathbf{x}_i + W_1 \mathbf{x}_j)).\]
Arguments
- in: The dimension of input node features.
- ein: The dimension of input edge features. Default 0 (i.e. no edge features passed in the forward).
- out: The dimension of output node features.
- σ: Activation function. Default identity.
- bias: Learn the additive bias if true. Default true.
- heads: Number of attention heads. Default 1.
- concat: Concatenate layer output or not. If not, layer output is averaged over the heads. Default true.
- negative_slope: The parameter of LeakyReLU. Default 0.2.
- add_self_loops: Add self loops to the graph before performing the convolution. Default true.
- dropout: Dropout probability on the normalized attention coefficient. Default 0.0.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
ein = 3
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)
# create layer
l = GATv2Conv((in_channel, ein) => out_channel, add_self_loops = false)
# edge features
e = randn(Float32, ein, length(s))
# forward pass
y = l(g, x, e)
GraphNeuralNetworks.GCNConv
— Type
GCNConv(in => out, σ=identity; [bias, init, add_self_loops, use_edge_weight])
Graph convolutional layer from paper Semi-supervised Classification with Graph Convolutional Networks.
Performs the operation
\[\mathbf{x}'_i = \sum_{j\in N(i)} a_{ij} W \mathbf{x}_j\]
where $a_{ij} = 1 / \sqrt{|N(i)||N(j)|}$ is a normalization factor computed from the node degrees.
If the input graph has weighted edges and use_edge_weight=true, then $a_{ij}$ will be computed as
\[a_{ij} = \frac{e_{j\to i}}{\sqrt{\sum_{j \in N(i)} e_{j\to i}} \sqrt{\sum_{i \in N(j)} e_{i\to j}}}\]
The input to the layer is a node feature array X
of size (num_features, num_nodes)
and optionally an edge weight vector.
Arguments
- in: Number of input features.
- out: Number of output features.
- σ: Activation function. Default identity.
- bias: Add learnable bias. Default true.
- init: Weights' initializer. Default glorot_uniform.
- add_self_loops: Add self loops to the graph before performing the convolution. Default false.
- use_edge_weight: If true, consider the edge weights in the input graph (if available). If add_self_loops=true, the new weights will be set to 1. This option is ignored if the edge_weight is explicitly provided in the forward pass. Default false.
Forward
(::GCNConv)(g::GNNGraph, x, edge_weight = nothing; norm_fn = d -> 1 ./ sqrt.(d), conv_weight = nothing) -> AbstractMatrix
Takes as input a graph g, a node feature matrix x of size [in, num_nodes], and optionally an edge weight vector. Returns a node feature matrix of size [out, num_nodes].
The norm_fn parameter allows for custom normalization of the graph convolution operation by passing a function as argument. By default, it computes $\frac{1}{\sqrt{d}}$, i.e. the inverse square root of the degree (d) of each node in the graph. If conv_weight is an AbstractMatrix of size [out, in], then the convolution is performed using that weight matrix instead of the weights stored in the model.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)
# create layer
l = GCNConv(3 => 5)
# forward pass
y = l(g, x) # size: 5 × num_nodes
# convolution with edge weights and custom normalization function
w = [1.1, 0.1, 2.3, 0.5]
custom_norm_fn(d) = 1 ./ sqrt.(d .+ 1) # Custom normalization function
y = l(g, x, w; norm_fn = custom_norm_fn)
# Edge weights can also be embedded in the graph.
g = GNNGraph(s, t, w)
l = GCNConv(3 => 5, use_edge_weight=true)
y = l(g, x) # same as l(g, x, w)
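The conv_weight option described in the Forward section above can be illustrated with a short sketch: an externally supplied matrix of size (out, in) is used in place of the layer's stored weights (the values below are arbitrary).
# convolution with an externally provided weight matrix
W = randn(Float32, 5, 3)          # size (out, in)
y = l(g, x; conv_weight = W)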
GraphNeuralNetworks.GINConv
— Type
GINConv(f, ϵ; aggr=+)
Graph Isomorphism convolutional layer from paper How Powerful are Graph Neural Networks?.
Implements the graph convolution
\[\mathbf{x}_i' = f_\Theta\left((1 + \epsilon) \mathbf{x}_i + \sum_{j \in N(i)} \mathbf{x}_j \right)\]
where $f_\Theta$ typically denotes a learnable function, e.g. a linear layer or a multi-layer perceptron.
Arguments
- f: A (possibly learnable) function acting on node features.
- ϵ: Weighting factor.
Examples:
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
g = GNNGraph(s, t)
x = randn(Float32, in_channel, g.num_nodes)
# create dense layer
nn = Dense(in_channel, out_channel)
# create layer
l = GINConv(nn, 0.01f0, aggr = mean)
# forward pass
y = l(g, x)
GraphNeuralNetworks.GMMConv
— Type
GMMConv((in, ein) => out, σ=identity; K=1, bias=true, init=glorot_uniform, residual=false)
Graph mixture model convolution layer from the paper Geometric deep learning on graphs and manifolds using mixture model CNNs. Performs the operation
\[\mathbf{x}_i' = \mathbf{x}_i + \frac{1}{|N(i)|} \sum_{j\in N(i)}\frac{1}{K}\sum_{k=1}^K \mathbf{w}_k(\mathbf{e}_{j\to i}) \odot \Theta_k \mathbf{x}_j\]
where $w^a_{k}(e^a)$ for feature a
and kernel k
is given by
\[w^a_{k}(e^a) = \exp(-\frac{1}{2}(e^a - \mu^a_k)^T (\Sigma^{-1})^a_k(e^a - \mu^a_k))\]
$\Theta_k, \mu^a_k, (\Sigma^{-1})^a_k$ are learnable parameters.
The input to the layer is a node feature array x of size (num_features, num_nodes) and an edge pseudo-coordinate array e of size (num_features, num_edges). The residual $\mathbf{x}_i$ is added only if residual=true and the output size is the same as the input size.
Arguments
- in: Number of input node features.
- ein: Number of input edge features.
- out: Number of output features.
- σ: Activation function. Default identity.
- K: Number of kernels. Default 1.
- bias: Add learnable bias. Default true.
- init: Weights' initializer. Default glorot_uniform.
- residual: Residual connection. Default false.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
g = GNNGraph(s,t)
nin, ein, out, K = 4, 10, 7, 8
x = randn(Float32, nin, g.num_nodes)
e = randn(Float32, ein, g.num_edges)
# create layer
l = GMMConv((nin, ein) => out, K=K)
# forward pass
l(g, x, e)
GraphNeuralNetworks.GatedGraphConv
— Type
GatedGraphConv(out, num_layers; aggr=+, init=glorot_uniform)
Gated graph convolution layer from Gated Graph Sequence Neural Networks.
Implements the recursion
\[\begin{aligned} \mathbf{h}^{(0)}_i &= [\mathbf{x}_i; \mathbf{0}] \\ \mathbf{h}^{(l)}_i &= GRU(\mathbf{h}^{(l-1)}_i, \square_{j \in N(i)} W \mathbf{h}^{(l-1)}_j) \end{aligned}\]
where $\mathbf{h}^{(l)}_i$ denotes the $l$-th hidden variables passing through GRU. The dimension of input $\mathbf{x}_i$ needs to be less than or equal to out.
Arguments
- out: The dimension of output features.
- num_layers: The number of recursion steps.
- aggr: Aggregation operator for the incoming messages (e.g. +, *, max, min, and mean).
- init: Weight initialization function.
Examples:
# create data
s = [1,1,2,3]
t = [2,3,1,1]
out_channel = 5
num_layers = 3
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)  # input dimension must be <= out_channel
# create layer
l = GatedGraphConv(out_channel, num_layers)
# forward pass
y = l(g, x)
GraphNeuralNetworks.GraphConv
— Type
GraphConv(in => out, σ=identity; aggr=+, bias=true, init=glorot_uniform)
Graph convolution layer from the paper Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks.
Performs:
\[\mathbf{x}_i' = W_1 \mathbf{x}_i + \square_{j \in \mathcal{N}(i)} W_2 \mathbf{x}_j\]
where the aggregation type is selected by aggr
.
Arguments
- in: The dimension of input features.
- out: The dimension of output features.
- σ: Activation function.
- aggr: Aggregation operator for the incoming messages (e.g. +, *, max, min, and mean).
- bias: Add learnable bias.
- init: Weights' initializer.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)
# create layer
l = GraphConv(in_channel => out_channel, relu, bias = false, aggr = mean)
# forward pass
y = l(g, x)
GraphNeuralNetworks.MEGNetConv
— Type
MEGNetConv(ϕe, ϕv; aggr=mean)
MEGNetConv(in => out; aggr=mean)
Convolution from the Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals paper. In the forward pass, takes as input node features x
and edge features e
and returns updated features x'
and e'
according to
\[\begin{aligned} \mathbf{e}_{i\to j}' = \phi_e([\mathbf{x}_i;\, \mathbf{x}_j;\, \mathbf{e}_{i\to j}]),\\ \mathbf{x}_{i}' = \phi_v([\mathbf{x}_i;\, \square_{j\in \mathcal{N}(i)}\,\mathbf{e}_{j\to i}']). \end{aligned}\]
aggr defines the aggregation to be performed.
If the neural networks ϕe and ϕv are not provided, they will be constructed from the in and out arguments instead as multi-layer perceptrons with one hidden layer and relu activations.
Examples
g = rand_graph(10, 30)
x = randn(Float32, 3, 10)
e = randn(Float32, 3, 30)
m = MEGNetConv(3 => 3)
x′, e′ = m(g, x, e)
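The networks ϕe and ϕv can also be supplied explicitly. A minimal sketch, assuming (as in the formula above) that ϕe acts on the concatenation [xi; xj; e] and ϕv on the concatenation of xi with the aggregated edge features (hidden size chosen arbitrarily):
ϕe = Chain(Dense(3 + 3 + 3 => 16, relu), Dense(16 => 3))  # input: [xi; xj; e]
ϕv = Chain(Dense(3 + 3 => 16, relu), Dense(16 => 3))      # input: [xi; aggregated e′]
m = MEGNetConv(ϕe, ϕv; aggr = mean)
x′, e′ = m(g, x, e)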
GraphNeuralNetworks.NNConv
— Type
NNConv(in => out, f, σ=identity; aggr=+, bias=true, init=glorot_uniform)
The continuous kernel-based convolutional operator from the Neural Message Passing for Quantum Chemistry paper. This convolution is also known as the edge-conditioned convolution from the Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs paper.
Performs the operation
\[\mathbf{x}_i' = W \mathbf{x}_i + \square_{j \in N(i)} f_\Theta(\mathbf{e}_{j\to i})\,\mathbf{x}_j\]
where $f_\Theta$ denotes a learnable function (e.g. a linear layer or a multi-layer perceptron). Given an input of batched edge features e of size (num_edge_features, num_edges), the function f will return a batched array of matrices of size (out, in, num_edges). For convenience, functions returning a single (out*in, num_edges) matrix are also allowed.
Arguments
- in: The dimension of input node features.
- out: The dimension of output node features.
- f: A (possibly learnable) function acting on edge features.
- aggr: Aggregation operator for the incoming messages (e.g. +, *, max, min, and mean).
- σ: Activation function.
- bias: Add learnable bias.
- init: Weights' initializer.
Examples:
n_in = 3
n_in_edge = 10
n_out = 5
# create data
s = [1,1,2,3]
t = [2,3,1,1]
g = GNNGraph(s, t)
# create dense layer
nn = Dense(n_in_edge => n_out * n_in)
# create layer
l = NNConv(n_in => n_out, nn, tanh, bias = true, aggr = +)
x = randn(Float32, n_in, g.num_nodes)
e = randn(Float32, n_in_edge, g.num_edges)
# forward pass
y = l(g, x, e)
GraphNeuralNetworks.ResGatedGraphConv
— Type
ResGatedGraphConv(in => out, act=identity; init=glorot_uniform, bias=true)
The residual gated graph convolutional operator from the Residual Gated Graph ConvNets paper.
The layer's forward pass is given by
\[\mathbf{x}_i' = act\big(U\mathbf{x}_i + \sum_{j \in N(i)} \eta_{ij} V \mathbf{x}_j\big),\]
where the edge gates $\eta_{ij}$ are given by
\[\eta_{ij} = sigmoid(A\mathbf{x}_i + B\mathbf{x}_j).\]
Arguments
- in: The dimension of input features.
- out: The dimension of output features.
- act: Activation function.
- init: Weight matrices' initializing function.
- bias: Learn an additive bias if true.
Examples:
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
g = GNNGraph(s, t)
x = randn(Float32, in_channel, g.num_nodes)
# create layer
l = ResGatedGraphConv(in_channel => out_channel, tanh, bias = true)
# forward pass
y = l(g, x)
GraphNeuralNetworks.SAGEConv
— Type
SAGEConv(in => out, σ=identity; aggr=mean, bias=true, init=glorot_uniform)
GraphSAGE convolution layer from paper Inductive Representation Learning on Large Graphs.
Performs:
\[\mathbf{x}_i' = W \cdot [\mathbf{x}_i; \square_{j \in \mathcal{N}(i)} \mathbf{x}_j]\]
where the aggregation type is selected by aggr
.
Arguments
- in: The dimension of input features.
- out: The dimension of output features.
- σ: Activation function.
- aggr: Aggregation operator for the incoming messages (e.g. +, *, max, min, and mean).
- bias: Add learnable bias.
- init: Weights' initializer.
Examples:
# create data
s = [1,1,2,3]
t = [2,3,1,1]
in_channel = 3
out_channel = 5
g = GNNGraph(s, t)
x = randn(Float32, in_channel, g.num_nodes)
# create layer
l = SAGEConv(in_channel => out_channel, tanh, bias = false, aggr = +)
# forward pass
y = l(g, x)
GraphNeuralNetworks.SGConv
— Type
SGConv(in => out, k=1; [bias, init, add_self_loops, use_edge_weight])
SGC layer from the paper Simplifying Graph Convolutional Networks. Performs the operation
\[H^{K} = (\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2})^K X \Theta\]
where $\tilde{A}$ is $A + I$.
Arguments
- in: Number of input features.
- out: Number of output features.
- k: Number of hops. Default 1.
- bias: Add learnable bias. Default true.
- init: Weights' initializer. Default glorot_uniform.
- add_self_loops: Add self loops to the graph before performing the convolution. Default false.
- use_edge_weight: If true, consider the edge weights in the input graph (if available). If add_self_loops=true, the new weights will be set to 1. Default false.
Examples
# create data
s = [1,1,2,3]
t = [2,3,1,1]
g = GNNGraph(s, t)
x = randn(Float32, 3, g.num_nodes)
# create layer
l = SGConv(3 => 5; add_self_loops = true)
# forward pass
y = l(g, x) # size: 5 × num_nodes
# convolution with edge weights
w = [1.1, 0.1, 2.3, 0.5]
y = l(g, x, w)
# Edge weights can also be embedded in the graph.
g = GNNGraph(s, t, w)
l = SGConv(3 => 5, add_self_loops = true, use_edge_weight=true)
y = l(g, x) # same as l(g, x, w)
GraphNeuralNetworks.TAGConv
— Type
TAGConv(in => out, k=3; bias=true, init=glorot_uniform, add_self_loops=true, use_edge_weight=false)
TAGConv layer from Topology Adaptive Graph Convolutional Networks. This layer extends the idea of graph convolutions by applying filters that adapt to the topology of the data. It performs the operation:
\[H^{K} = {\sum}_{k=0}^K (D^{-1/2} A D^{-1/2})^{k} X {\Theta}_{k}\]
where A
is the adjacency matrix of the graph, D
is the degree matrix, X
is the input feature matrix, and ${\Theta}_{k}$ is a unique weight matrix for each hop k
.
Arguments
- in: Number of input features.
- out: Number of output features.
- k: Maximum number of hops to consider. Default is 3.
- bias: Whether to include a learnable bias term. Default is true.
- init: Initialization function for the weights. Default is glorot_uniform.
- add_self_loops: Whether to add self-loops to the adjacency matrix. Default is true.
- use_edge_weight: If true, edge weights are considered in the computation (if available). Default is false.
Examples
# Example graph data
s = [1, 1, 2, 3]
t = [2, 3, 1, 1]
g = GNNGraph(s, t) # Create a graph
x = randn(Float32, 3, g.num_nodes) # Random features for each node
# Create a TAGConv layer
l = TAGConv(3 => 5, k=3; add_self_loops=true)
# Apply the TAGConv layer
y = l(g, x) # Output size: 5 × num_nodes
GraphNeuralNetworks.TransformerConv
— Type
TransformerConv((in, ein) => out; [heads, concat, init, add_self_loops, bias_qkv,
                bias_root, root_weight, gating, skip_connection, batch_norm, ff_channels])
The transformer-like multi head attention convolutional operator from the Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification paper, which also considers edge features. It further contains options to also be configured as the transformer-like convolutional operator from the Attention, Learn to Solve Routing Problems! paper, including a successive feed-forward network as well as skip layers and batch normalization.
The layer's basic forward pass is given by
\[x_i' = W_1x_i + \sum_{j\in N(i)} \alpha_{ij} (W_2 x_j + W_6e_{ij})\]
where the attention scores are
\[\alpha_{ij} = \mathrm{softmax}\left(\frac{(W_3x_i)^T(W_4x_j+ W_6e_{ij})}{\sqrt{d}}\right).\]
Optionally, the aggregated value can be combined with the transformed root node features through a gating mechanism:
\[x'_i = \beta_i W_1 x_i + (1 - \beta_i) \underbrace{\left(\sum_{j \in \mathcal{N}(i)} \alpha_{i,j} W_2 x_j \right)}_{=m_i}\]
with
\[\beta_i = \textrm{sigmoid}(W_5^{\top} [ W_1 x_i, m_i, W_1 x_i - m_i ]).\]
Arguments
- in: Dimension of input features, which also corresponds to the dimension of the output features.
- ein: Dimension of the edge features; if 0, no edge features will be used.
- out: Dimension of the output.
- heads: Number of heads in output. Default 1.
- concat: Concatenate layer output or not. If not, layer output is averaged over the heads. Default true.
- init: Weight matrices' initializing function. Default glorot_uniform.
- add_self_loops: Add self loops to the input graph. Default false.
- bias_qkv: If set, bias is used in the key, query and value transformations for nodes. Default true.
- bias_root: If set, the layer will also learn an additive bias for the root when root weight is used. Default true.
- root_weight: If set, the layer will add the transformed root node features to the output. Default true.
- gating: If set, will combine aggregation and transformed root node features by a gating mechanism. Default false.
- skip_connection: If set, a skip connection will be made from the input and added to the output. Default false.
- batch_norm: If set, a batch normalization will be applied to the output. Default false.
- ff_channels: If positive, a feed-forward NN is appended, with the first layer having the given number of hidden nodes; this NN also gets a skip connection and batch normalization if the respective parameters are set. Default: 0.
Examples
N, in_channel, out_channel = 4, 3, 5
ein, heads = 2, 3
g = GNNGraph([1,1,2,4], [2,3,1,1])
l = TransformerConv((in_channel, ein) => in_channel; heads, gating = true, bias_qkv = true)
x = rand(Float32, in_channel, N)
e = rand(Float32, ein, g.num_edges)
l(g, x, e)
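As a sketch of the Attention, Learn to Solve Routing Problems! style configuration mentioned above, the skip connection, batch normalization and feed-forward block can be enabled through the corresponding keyword arguments (ff_channels value chosen arbitrarily; the output size is kept equal to the input size so that the skip connection is well defined):
l2 = TransformerConv((in_channel, 0) => in_channel; skip_connection = true,
                     batch_norm = true, ff_channels = 16)
y = l2(g, x)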