GRAPHORMER

-- A GENERAL-PURPOSE BACKBONE FOR GRAPH LEARNING

Shuxin Zheng, Microsoft Research Asia


shuz@microsoft.com

https://github.com/microsoft/Graphormer
TRANSFORMER BECOMES DOMINANT ON SEQUENCE DATA

Sequence Data (1D): Speech, Language, Protein

Grid Data (2D): Image, Video

Today's Transformer Models:

Becoming Bigger, Deeper, Wider
GNN IS STILL THE FIRST CHOICE FOR GRAPH DATA

Graph Data: Molecule, Social Network, Combinatorial Optimization

Today's Graph Neural Networks:

Kept Slim, Shallow, with Simple Operations

GIN: 3-5 layers
Operations: Sum + 2-layer FFN
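
To make "Sum + 2-layer FFN" concrete, here is a minimal sketch of one GIN-style layer in plain Python. The function names, the toy graph, and the identity weights are illustrative assumptions, and bias terms are omitted for brevity.

```python
# Sketch of one GIN-style layer: sum aggregation followed by a 2-layer FFN.
# All names and the toy weights below are illustrative assumptions.

def relu(x):
    return [max(0.0, v) for v in x]

def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def gin_layer(features, neighbors, W1, W2, eps=0.0):
    """features: list of node feature vectors; neighbors: adjacency lists."""
    out = []
    for i, h_i in enumerate(features):
        # Sum aggregation: (1 + eps) * h_i plus the sum of neighbor features.
        agg = [(1.0 + eps) * v for v in h_i]
        for j in neighbors[i]:
            agg = [a + b for a, b in zip(agg, features[j])]
        # 2-layer FFN without biases: W2 @ relu(W1 @ agg).
        out.append(matvec(W2, relu(matvec(W1, agg))))
    return out

# Toy path graph 0 - 1 - 2 with 2-dimensional node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
identity = [[1.0, 0.0], [0.0, 1.0]]
gin_out = gin_layer(features, neighbors, identity, identity)
```

With identity weights the layer reduces to the plain neighborhood sum, so each output row is the node's own feature plus its neighbors' features.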
EXPRESSIVENESS VS. CAPABILITY OF MODELING GRAPH

1. Graft Existing Modules to GNN

2. Transform Graph to Sequence

3. Modify Transformer by Heuristic on Graph


TRANSFORMER ON GRAPH
Set:

Self-Attention: Calculate Correlation Between Tokens/Patches…

Sequence Data: Relative Positional Encoding [1]

Grid Data: 2D Relative Positional Encoding

[1] Ke, Guolin, Di He, and Tie-Yan Liu. "Rethinking the Positional Encoding in Language Pre-training." ICLR (2021).
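
The "Set" view of self-attention can be checked directly: without any positional signal, permuting the inputs only permutes the outputs. The sketch below assumes a single head with identity projections (W_Q = W_K = W_V = I); the function names and toy tokens are hypothetical.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Single-head attention with identity projections (an assumption for brevity)."""
    d = len(X[0])
    out = []
    for q in X:
        # Scaled dot-product scores between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)
        # Output is the attention-weighted average of the values.
        out.append([sum(w * x[c] for w, x in zip(weights, X)) for c in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
perm = [2, 0, 1]
shuffled = [tokens[p] for p in perm]

out_original = self_attention(tokens)
out_shuffled = self_attention(shuffled)
# out_shuffled[i] matches out_original[perm[i]] up to floating-point rounding.
```

Because the output carries no trace of input order, sequence and grid models add positional encodings; a relative positional encoding does so by adding a position-dependent term to the attention scores.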
KEY INSIGHT: STRUCTURAL ENCODINGS
Set:
Self-Attention: Calculate Correlation Between Nodes…

What Affects the Correlation Between Nodes:

Spatial Position, Centrality, Edge Feature


GRAPHORMER

= Pure Transformer

+ Spatial Encoding

+ Centrality Encoding

+ Edge Encoding
GRAPHORMER: SPATIAL ENCODING

$A_{ij} = \frac{(h_i W_Q)(h_j W_K)^T}{\sqrt{d}} + b_{\phi(v_i, v_j)}$
Spatial Position

$\phi(v_i, v_j)$: Any Metric that Measures the Distance Between $v_i$ and $v_j$.

3D Euclidean Distance
Unweighted Shortest Path
Weighted Shortest Path
Max Flow
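
As one possible reading of the spatial encoding, the sketch below takes the unweighted shortest path as the distance metric, computes it with breadth-first search, and looks up one scalar bias per node pair. The function names, the toy graph, and the bias values are assumptions for illustration; in a real model the scalars would be learned.

```python
from collections import deque

def shortest_path_matrix(neighbors):
    """Unweighted shortest-path distance for every node pair via BFS.
    Unreachable pairs are marked with -1."""
    n = len(neighbors)
    dist = [[-1] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] == -1:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist

def spatial_bias(dist, bias_table, unreachable_bias):
    """Look up one scalar per node pair, indexed by distance.
    bias_table must cover every distance that occurs in dist."""
    return [[bias_table[d] if d >= 0 else unreachable_bias for d in row]
            for row in dist]

# Toy path graph 0 - 1 - 2 - 3.
neighbors = [[1], [0, 2], [1, 3], [2]]
phi = shortest_path_matrix(neighbors)

# Hypothetical scalars for distances 0, 1, 2, 3 (learned in practice).
bias_table = [0.0, 0.5, -0.2, -1.0]
bias = spatial_bias(phi, bias_table, unreachable_bias=-1.0)
```

The resulting matrix is added to the attention scores before the softmax, so the same scalar is shared by all node pairs at the same distance.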


GRAPHORMER: CENTRALITY & EDGE ENCODINGS
Node Centrality: Degree
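
A minimal sketch of degree-based centrality encoding, assuming an undirected graph and a lookup table with one vector per degree that is added to the input node features. The function name and the embedding values are hypothetical; in practice the table would be learned.

```python
def add_centrality_encoding(features, neighbors, degree_embedding):
    """Add the embedding of each node's degree to its input feature vector."""
    out = []
    for x_i, nbrs in zip(features, neighbors):
        z = degree_embedding[len(nbrs)]  # row selected by node degree
        out.append([a + b for a, b in zip(x_i, z)])
    return out

# Toy path graph 0 - 1 - 2 with 2-dimensional node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]

# Hypothetical embedding rows for degrees 0, 1, and 2.
degree_embedding = [[0.0, 0.0], [0.1, 0.1], [0.5, -0.5]]
h0 = add_centrality_encoding(features, neighbors, degree_embedding)
```

Nodes 0 and 2 both have degree 1 and therefore receive the same offset, while the middle node receives the degree-2 offset.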

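Edge Encoding (path $v_i$, $v_k$, $v_j$):

$A_{ij} = \frac{(h_i W_Q)(h_j W_K)^T}{\sqrt{d}} + b_{\phi(v_i, v_j)} + c_{ij}$

$c_{ij} = \frac{1}{2}\left(e_{ik} w_1^T + e_{kj} w_2^T\right)$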
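
The edge term for the two-edge path in this slide can be sketched as an average of dot products, one per edge on the path, each with its own position-specific weight vector. The function name, the edge features, and the weights below are made up for illustration, and returning 0 for an empty path is an assumption.

```python
def edge_encoding(path_edge_features, position_weights):
    """Average, over the edges on one path, of dot(edge feature, weight for that position)."""
    n = len(path_edge_features)
    if n == 0:
        return 0.0  # assumption: no edge term when the path has no edges
    total = 0.0
    for e, w in zip(path_edge_features, position_weights):
        total += sum(a * b for a, b in zip(e, w))
    return total / n

# Path v_i -> v_k -> v_j with hypothetical 2-dimensional edge features.
e_ik = [1.0, 0.0]
e_kj = [0.0, 1.0]
w1 = [0.4, 0.1]  # weight for the first edge on the path
w2 = [0.3, 0.6]  # weight for the second edge on the path
c_ij = edge_encoding([e_ik, e_kj], [w1, w2])  # (0.4 + 0.6) / 2
```

The same function covers longer paths by passing more edge features and weights, in which case the divisor becomes the path length rather than 2.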
HOW POWERFUL IS GRAPHORMER?
Why Is Graphormer Better? Theoretical Facts:

GIN, GCN, GraphSAGE … are Special Cases of Graphormer


Example: Mean Aggregation
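In $A_{ij} = \frac{(h_i W_Q)(h_j W_K)^T}{\sqrt{d}} + b_{\phi(v_i, v_j)} + c_{ij}$, let:

$W_Q = W_K = 0$, $W_V = I$,

$b_{\phi(v_i, v_j)} = 0$ if $v_i$ and $v_j$ are neighbors, else $b_{\phi(v_i, v_j)} = -\infty$,

$c_{ij} = 0$.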
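
The mean-aggregation construction can be verified numerically. With zero query and key projections, the attention scores reduce to the spatial bias alone; a bias of 0 for neighbors and negative infinity otherwise makes the softmax uniform over neighbors, and an identity value projection returns their average. The sketch below uses a toy graph and hypothetical function names, and assumes every node has at least one neighbor.

```python
import math

NEG_INF = float("-inf")

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

def attention_as_mean_aggregation(features, neighbors):
    n = len(features)
    d = len(features[0])
    out = []
    for i in range(n):
        # W_Q = W_K = 0 zeroes the query-key term, and c_ij = 0,
        # so the score is just the bias: 0 for neighbors, -inf otherwise.
        scores = [0.0 if j in neighbors[i] else NEG_INF for j in range(n)]
        weights = softmax(scores)
        # W_V = I: the values are the raw node features.
        out.append([sum(weights[j] * features[j][c] for j in range(n))
                    for c in range(d)])
    return out

# Toy path graph 0 - 1 - 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
mean_out = attention_as_mean_aggregation(features, neighbors)
```

For the middle node the result is the average of the features of nodes 0 and 2, which is exactly what a mean-aggregation GNN layer would compute.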
HOW POWERFUL IS GRAPHORMER?
Why Is Graphormer Better? Theoretical Facts:

Expressiveness: Graphormer > 1-WL Test ≥ Graph Neural Networks


Spatial Encoding

[Figure: panels (a) and (b)]
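
To illustrate how distance information can go beyond the 1-WL test, the sketch below compares a 6-cycle with two disjoint triangles. This graph pair is chosen here as an assumption and is not necessarily the pair shown in panels (a) and (b). Both graphs are 2-regular, so 1-WL color refinement cannot separate them, while their shortest-path distance multisets differ. The code demonstrates the idea only; it is not a proof of the expressiveness claim.

```python
from collections import Counter, deque

def wl_histogram(neighbors, rounds=3):
    """1-WL color refinement; colors are nested tuples so they compare across graphs."""
    colors = [0] * len(neighbors)
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = [
            (colors[v], tuple(sorted(colors[u] for u in neighbors[v])))
            for v in range(len(neighbors))
        ]
    return Counter(colors)

def spd_multiset(neighbors):
    """Multiset of shortest-path distances over all ordered node pairs (-1 = unreachable)."""
    n = len(neighbors)
    counts = Counter()
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for dst in range(n):
            counts[dist.get(dst, -1)] += 1
    return counts

hexagon = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
two_triangles = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]

same_under_wl = wl_histogram(hexagon) == wl_histogram(two_triangles)
same_under_spd = spd_multiset(hexagon) == spd_multiset(two_triangles)
```

Here `same_under_wl` is true and `same_under_spd` is false: a model whose attention sees shortest-path distances has access to information that 1-WL refinement discards.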
KDD CUP 2021 – 1ST PLACE AWARD
BIOASSAY
https://github.com/microsoft/Graphormer

FUTURE APPLICATIONS… AND THANKS!
