Model | Paper |
---|---|
HGT(WWW 2019) | Heterogeneous Graph Transformer |
SimpleHGN(KDD 2021) | Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks |
HetSANN(AAAI 2020) | An Attention-Based Graph Neural Network for Heterogeneous Structural Learning |
ieHGCN(TKDE 2021) | Interpretable and Efficient Heterogeneous Graph Convolutional Network |
In this part, we give the definition of the attention mechanism based on GAT and the Transformer.
$$
e_{ij} = a(Wh_i, Wh_j)
$$
$$
\alpha_{ij} = softmax_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k\in \mathcal{N}_i} \exp(e_{ik})}
$$
$$
Attention(Q, K, V) = softmax(\frac{QK^T}{\sqrt{d_k}})V
$$
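As a concrete illustration of the softmax normalization above, here is a toy Python sketch; the scores and neighborhood are made-up values, not OpenHGNN code:

```python
import math

def softmax(scores):
    # numerically stable softmax over one node's neighborhood scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# unnormalized GAT scores e_ij for neighbors j of node i (toy values)
e_i = [0.5, 1.5, -0.2]
alpha_i = softmax(e_i)   # attention coefficients alpha_ij, sum to 1
```

The Transformer's scaled dot-product attention applies the same normalization row-wise to $QK^T/\sqrt{d_k}$.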
In this part, we describe the DGL APIs we used. Since DGL released version 0.8.0, more APIs support heterogeneous graphs, such as TypedLinear and HeteroLinear, so we give some details of these APIs below.
class TypedLinear(in_size, out_size, num_types, regularizer=None, num_bases=None)
Apply linear transformation according to types.
Parameters:
in_size(int): Input feature size.
out_size(int): Output feature size.
num_types(int): Number of types(node or edge).
regularizer(str, optional): Which weight regularizer to use, "basis" or "bdd". Default: None.
num_bases(int, optional): Number of bases. Needed when regularizer is specified. Typically smaller than num_types. Default: None.
forward(x, x_type, sorted_by_type=False)
Parameters:
x(Tensor): The input features.
x_type(Tensor): 1-D tensor of the type of each input row.
sorted_by_type(bool, optional): Whether the input features are sorted by type, which enables a faster computation path. Default: False.
So this API can be used after dgl.to_homogeneous converts a heterogeneous graph to a homogeneous one.
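A minimal pure-Python sketch of the idea behind TypedLinear — one weight per type, selected by each row's type id. The scalar "weight matrices" and values below are hypothetical, not DGL's implementation:

```python
# one weight per type (scalars standing in for weight matrices; toy values)
weights = {0: 2.0, 1: -1.0}

def typed_linear(x, x_type):
    # apply the weight that matches each row's type id
    return [weights[t] * v for v, t in zip(x, x_type)]

out = typed_linear([1.0, 3.0, 2.0], [0, 1, 0])
# out == [2.0, -3.0, 4.0]
```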
class HeteroLinear(in_size, out_size, bias=True)
Apply linear transformations on heterogeneous inputs.
Parameters:
in_size(dict[key, int]): Input feature size for each type.
out_size(int): Output feature size.
bias(bool, optional): Whether to add a bias. Default: True.
forward(feat)
Parameters:
feat(dict[key, Tensor]): Input features, keyed by type.
So this API can be used if we want to apply different linear transformations to different types.
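The idea can be sketched in a few lines of pure Python — one transform per feature type, applied to a feature dict. The scalar weights and type names are hypothetical, not DGL's implementation:

```python
# one transform per feature type (scalars standing in for linear layers)
layers = {"paper": 0.5, "author": 3.0}

def hetero_linear(feat):
    # each feature type is transformed by its own layer
    return {k: [layers[k] * v for v in vs] for k, vs in feat.items()}

out = hetero_linear({"paper": [2.0, 4.0], "author": [1.0]})
# out == {"paper": [1.0, 2.0], "author": [3.0]}
```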
class HeteroGraphConv(mods, aggregate='sum')
The heterograph convolution applies sub-modules on their associating relation graphs, which reads the features from source nodes and writes the updated ones to destination nodes. If multiple relations have the same destination node types, their results are aggregated by the specified method. If the relation graph has no edge, the corresponding module will not be called.
Parameters:
mods(dict[str, nn.Module]): Modules associated with each relation, keyed by relation name.
aggregate(str or callable): Method to aggregate the results from different relations. Default: 'sum'.
forward(g, inputs, mod_args=None, mod_kwargs=None)
Parameters:
g(DGLGraph): The heterogeneous graph.
inputs(dict[str, Tensor]): Input node features, keyed by node type.
mod_args(dict, optional): Extra positional arguments passed to the sub-modules.
mod_kwargs(dict, optional): Extra keyword arguments passed to the sub-modules.
So this API can be used when we need to get relation subgraphs and apply nn.Module to each subgraph.
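To make the dispatch concrete, here is a minimal pure-Python sketch of the behavior described above: one module per relation subgraph, empty relation graphs skipped, and 'sum' aggregation over results that share a destination type. The graph, modules, and values are hypothetical toy data, not DGL's implementation:

```python
def hetero_graph_conv(mods, rel_edges, inputs):
    # rel_edges: {(src_type, rel_name, dst_type): list of (src, dst) pairs}
    out = {}
    for (st, rel, dt), edges in rel_edges.items():
        if not edges:                 # empty relation graph: module not called
            continue
        res = mods[rel](inputs[st], edges)   # per-destination-node results
        agg = out.setdefault(dt, {})
        for node, val in res.items():        # 'sum' aggregation across relations
            agg[node] = agg.get(node, 0.0) + val
    return out

# toy module: mean of source features over each node's incoming edges
def mean_conv(src_feat, edges):
    acc = {}
    for s, d in edges:
        acc.setdefault(d, []).append(src_feat[s])
    return {d: sum(v) / len(v) for d, v in acc.items()}

mods = {"writes": mean_conv, "cites": mean_conv}
inputs = {"author": [1.0, 3.0], "paper": [2.0]}
rel_edges = {
    ("author", "writes", "paper"): [(0, 0), (1, 0)],  # both authors -> paper 0
    ("paper", "cites", "paper"): [(0, 0)],            # toy self-citation
}
out = hetero_graph_conv(mods, rel_edges, inputs)
# paper 0 receives mean(1, 3) = 2 from "writes" plus mean(2) = 2 from "cites"
```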
Based on HeteroGraphConv, we divide the attention model into two categories: Direct-Aggregation models and Dual-Aggregation models.
Model | Attention coefficient |
---|---|
HGT | $W_{Q_{\phi(s)}}h_s W^{ATT}_{\psi(r)}(W_{K_{\phi(t)}}h_t)^T$ |
SimpleHGN | $LeakyReLU(a^T[Wh_s \parallel Wh_t \parallel W_r r_{\psi(<s,t>)}])$ |
HetSANN | $LeakyReLU([W_{\phi(t),\phi(s)} h_s\parallel W_{\phi(t),\phi(s)} h_t]a_r)$ |
These models have only one aggregation process and do not distinguish between edge types when aggregating, so they are not suitable for HeteroGraphConv.
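To make one of these coefficients concrete, here is a toy scalar evaluation of the SimpleHGN score from the table above, $LeakyReLU(a^T[Wh_s \parallel Wh_t \parallel W_r r])$. All weights and features are made-up scalars standing in for matrices and vectors:

```python
def leaky_relu(x, slope=0.2):
    # LeakyReLU with the common 0.2 negative slope (assumed value)
    return x if x > 0 else slope * x

a = [0.5, -0.3, 1.0]          # attention vector over the concatenation
W, W_r = 2.0, 0.5             # node and edge-type projections (toy scalars)
h_s, h_t, r = 1.0, 2.0, -1.0  # source feature, target feature, edge-type embedding

concat = [W * h_s, W * h_t, W_r * r]           # [2.0, 4.0, -0.5]
score = leaky_relu(sum(ai * ci for ai, ci in zip(a, concat)))
# a . concat = 1.0 - 1.2 - 0.5 = -0.7, then LeakyReLU gives about -0.14
```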
Model | Attention coefficient |
---|---|
ieHGCN | $ELU(a^T_{\phi(s)}[W_{Q_{\phi(s)}}h_s\parallel W_{K_{\phi(t)}}h_t])$ |
This model has two aggregation processes and distinguishes between edge types when aggregating, so it is suitable for HeteroGraphConv.
We first implement the convolution layers of the models SimpleHGN and HetSANN. For HGT, we use DGL's hgtconv as the convolution layer. The __init__ parameters differ because the models need different parameters.
The parameters of the forward part are the same:
- g: the homogeneous graph.
- h: the node features.
- ntype: the type of each node.
- etype: the type of each edge.
- presorted: whether ntype and etype are presorted, so that TypedLinear in dgl.nn can be used conveniently. If we use dgl.to_homogeneous to get the features, they are presorted.
Then we use the convolution layers to implement the corresponding models. We need dgl.to_homogeneous to get a homogeneous graph because when we use edge_softmax, all edges are put together to calculate the attention coefficients instead of distinguishing edge types.
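The effect of edge softmax on a homogeneous graph can be sketched as follows: scores are normalized over all incoming edges of each destination node, regardless of edge type. The edges and scores below are toy values, not DGL's implementation:

```python
import math

def edge_softmax(edges, scores):
    # group raw scores by destination node
    by_dst = {}
    for (s, d), e in zip(edges, scores):
        by_dst.setdefault(d, []).append(e)
    # normalize each edge's score within its destination group (stable softmax)
    alphas = []
    for (s, d), e in zip(edges, scores):
        grp = by_dst[d]
        m = max(grp)
        denom = sum(math.exp(x - m) for x in grp)
        alphas.append(math.exp(e - m) / denom)
    return alphas

edges = [(0, 2), (1, 2), (0, 1)]   # (src, dst) pairs
alpha = edge_softmax(edges, [1.0, 1.0, 0.3])
# both edges into node 2 get weight 0.5; the lone edge into node 1 gets 1.0
```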
After passing through the convolution layers, we need to convert the output features back into the feature dictionary of a heterogeneous graph. We designed a tool named to_hetero_feat in openhgnn.utils.utils.py, because we do not have a better way to get a feature dictionary with DGL alone: we could only use dgl.to_heterogeneous, but it performs many additional operations that slow the program down. Once we have the feature dictionary, the model is complete.
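The conversion can be sketched as follows; the helper below is a hypothetical stand-in for to_hetero_feat, not the OpenHGNN source. It splits the stacked homogeneous feature matrix back into a per-type dict using each node's type id:

```python
def to_hetero_feat(h, ntype, type_names):
    # h: stacked features; ntype: type id per row; type_names: id -> name
    feat_dict = {name: [] for name in type_names}
    for feat, t in zip(h, ntype):
        feat_dict[type_names[t]].append(feat)
    return feat_dict

h = [[0.1], [0.2], [0.3]]
out = to_hetero_feat(h, [0, 1, 0], ["paper", "author"])
# out == {"paper": [[0.1], [0.3]], "author": [[0.2]]}
```

Because dgl.to_homogeneous stores features sorted by type, a real implementation can slice contiguous blocks instead of iterating row by row.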
Clone the Openhgnn-DGL repository.
# For node classification task
# You may select model HGT, SimpleHGN, HetSANN
python main.py -m HGT -t node_classification -d imdb4MAGNN -g 0 --use_best_config
If you do not have a GPU, set -g -1.
Evaluation metric: Micro/Macro-F1
Model | HGBn-ACM Micro-F1 | HGBn-ACM Macro-F1 | acm4GTN Micro-F1 | acm4GTN Macro-F1 | imdb4MAGNN Micro-F1 | imdb4MAGNN Macro-F1 | dblp4MAGNN Micro-F1 | dblp4MAGNN Macro-F1 |
---|---|---|---|---|---|---|---|---|
HGT | 88.95 | 89.18 | 90.21 | 90.24 | 49.37 | 49.18 | 87.23 | 86.46 |
SimpleHGN | 92.27 | 92.36 | 89.27 | 89.28 | 52.25 | 48.78 | 87.72 | 87.08 |
HetSANN | 88.4 | 88.7 | 92.24 | 92.31 | 52.88 | 47.44 | 89.54 | 90.24 |
ie-HGCN | 91.71 | 91.99 | 92.47 | 92.56 | 55.03 | 52.18 | 88.36 | 87.37 |
You can modify the parameters in the [HGT], [SimpleHGN], [HetSANN], and [ieHGCN] sections of openhgnn/config.ini.
Yaoqi Liu[GAMMA LAB]
Submit an issue or email to YaoqiLiu@bupt.edu.cn.
OpenHGNN is an open-source heterogeneous graph neural network toolkit developed by the GAMMA Lab at BUPT, based on PyTorch and DGL.