Mathematics 2023, 11, 3541
Article
Link Prediction for Temporal Heterogeneous Networks Based
on the Information Lifecycle
Jiaping Cao , Jichao Li * and Jiang Jiang
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
* Correspondence: lijichao09@nudt.edu.cn
Abstract: Link prediction for temporal heterogeneous networks is an important task in the field
of network science, and it has a wide range of real-world applications. Traditional link prediction
methods are mainly based on static homogeneous networks, which do not distinguish between
different types of nodes in the real world and do not account for network structure evolution over
time. To address these issues, in this paper, we study the link prediction problem in temporal
heterogeneous networks and propose a link prediction method for temporal heterogeneous networks
(LP-THN) based on the information lifecycle, which is an end-to-end encoder–decoder structure. The
information lifecycle accounts for the active, decay and stable states of edges. Specifically, we first introduce a meta-path augmented residual information matrix to preserve the structure evolution mechanism and semantics of heterogeneous information networks (HINs), using it as input to the encoder to obtain low-dimensional embedding representations of the nodes. The link prediction problem is then treated as a binary classification problem, and the decoder is utilized for link prediction. Our prediction process accounts
for both network structure and semantic changes using meta-path augmented residual information
matrix perturbations. Our experiments demonstrate that LP-THN outperforms other baselines in
both prediction effectiveness and prediction efficiency.
MSC: 68T07
2. Related Work
Link prediction has become a popular research topic for many scholars in network science. Linyuan Lü summarizes the existing heuristic link
prediction methods and classifies them as local information-based heuristics, such as AA [6]
and RA [7]; path-based heuristics, such as Katz [8] and LHN-II [9]; and random-walk-based
heuristics, such as Cos+ [10] and SimRank [11]. The above heuristics are mostly from
the perspective of network structure and have high interpretability, but they have low
prediction accuracy and are not applicable to large-scale networks. With the rapid develop-
ment of artificial intelligence technology, link prediction methods represented by network
embedding techniques have received wide attention. DeepWalk [12] is a classical shallow
neural network embedding method that obtains node sequences by random walks and
inputs those node sequences into Word2vec to learn the low-dimensional embedding representations of the nodes. Node2Vec [13] is based on DeepWalk and defines a biased random
walk that combines both breadth-first and depth-first approaches to obtain node sequences.
M-NMF [14] preserves the topology and the community structure of the network and com-
bines the two through a nonnegative matrix decomposition to obtain node low-dimensional
embedding representations. Matrix decomposition and shallow neural network embedding
methods have high computational complexity when dealing with large-scale networks.
LINE [15], in contrast, maximizes the conditional probability and co-occurrence probability
of node neighbours to learn the low-dimensional embedding representations of nodes,
allowing it to efficiently handle large-scale networks. The above methods mainly focus
on the local structural information in the network, ignoring the rich node attributes as
well as the global structural information of the network. In recent years, many scholars
have proposed methods based on deep neural networks that can combine the rich attribute
information of nodes to learn their low-dimensional embedding representations, e.g., graph
neural networks (GNNs) [16], GraphSAGE [17], graph convolutional neural networks
(GCNs) [18] and graph attention networks (GATs) [19]. However, the above methods are
performed on homogeneous networks that do not account for the rich semantic information
contained in different nodes and edge types in heterogeneous networks.
To address the above problems, the link prediction problem in heterogeneous networks
has received extensive attention [20]. Many scholars extract important semantic information
in the network based on meta-paths and embed this information into neural networks
to obtain low-dimensional embedding representations of nodes containing rich semantic
information. Node representation methods on heterogeneous networks can be classified
into three categories: shallow neural networks, such as metapath2vec [21] and HIN2Vec [22],
matrix decomposition methods, such as HNE [23], and deep neural networks, such as
HetGNN [24] and HERec [25]. However, these methods are based on link prediction for
static networks and do not account for the evolution of the network structure over time.
Numerous link prediction methods for temporal networks have been proposed. TLPSS [26]
accounts for the existence of higher-order structural information in the network through
a simplicial complex but does not consider the heterogeneity of nodes and edges. A model based on Lasso regression and random forests [27] finds that whether a node pair connects depends on the past connections of many other node pairs over a period of time, which makes the model structure interpretable. DynamicTriad [28] learns low-dimensional embedding representations of nodes
through triadic closure. In this method, the dynamics refer to adding edges or changing the
weight of an edge that represents closeness between two nodes, and the number of nodes
does not change over time. The dynamics in CTDN [29] refer to continuous time variation.
This method considers the temporal attributes of the edges to obtain a sequence of nodes
by random walk. DyLink2Vec [30] directly obtains the embedding of the edges to perform
link prediction. TempNodeEmbed [31] splices the same nodes in different time slices by
computing orthogonal bases between node matrices in different time slices. However,
the above link prediction methods for temporal networks are based on homogeneous
networks and do not account for the rich semantic information contained in heterogeneous
networks. DHNE [32] performs a random walk by constructing a historical–current graph
and combining it with meta-paths, but constructing the historical–current graph incurs high computational complexity. DyHNE [33] captures network structures and
semantics by retaining first-order and second-order neighbours based on meta-paths, but
this neighbourhood structure cannot capture global structural information.
Let $G^t = (V^t, E^t)$ denote the heterogeneous network under time slice $t$, where $V^t$ denotes the set of nodes of the heterogeneous network under time slice $t$, and $E^t$ denotes the set of edges of the heterogeneous network under time slice $t$. In addition, $\varphi$ and $\phi$ are two mapping functions, $\varphi: V^t \to A$ and $\phi: E^t \to R$, where $A$ and $R$ denote the sets of node types and edge types, respectively. For a heterogeneous network, $|A| + |R| > 2$.
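To make the notation concrete, the following minimal Python sketch (ours, not the authors' code) stores a temporal heterogeneous network as per-slice edge sets together with the type mappings $\varphi$ and $\phi$; the class and field names are illustrative.

```python
from dataclasses import dataclass, field

# Minimal representation of a temporal heterogeneous network: per-slice edge
# sets plus the type mappings (phi for nodes; the edge type stored per edge
# plays the role of varphi). Class and field names are illustrative.
@dataclass
class TemporalHeteroNetwork:
    node_type: dict = field(default_factory=dict)   # phi: node id -> node type in A
    snapshots: dict = field(default_factory=dict)   # t -> set of (u, v, edge type in R)

    def add_edge(self, t, u, u_type, v, v_type, edge_type):
        self.node_type[u] = u_type
        self.node_type[v] = v_type
        self.snapshots.setdefault(t, set()).add((u, v, edge_type))

    def is_heterogeneous(self):
        # |A| + |R| > 2: more than one node type or more than one edge type overall.
        edge_types = {r for edges in self.snapshots.values() for (_, _, r) in edges}
        return len(set(self.node_type.values())) + len(edge_types) > 2

g = TemporalHeteroNetwork()
g.add_edge(t=1, u="a1", u_type="author", v="p1", v_type="paper", edge_type="writes")
g.add_edge(t=1, u="p1", u_type="paper", v="c1", v_type="conference", edge_type="published_in")
print(g.is_heterogeneous())  # True
```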
[Figure 1A shows authors $a_1$–$a_3$, papers $p_1$–$p_3$ and conferences $c_1$, $c_2$; Figure 1B shows the co-authorship of author A with author B (2000–2005) and with author C (2005).]
Figure 1. Example of (A) an academic co-authorship network and (B) co-authorship instances.
Meta-path-based first-order similarity denotes the instances of meta-path $M$ between a connected node pair $(v_i, v_j)$, which can measure the local structural similarity between nodes in a heterogeneous network. Meta-path-based second-order similarity denotes the instances of meta-path $M$ between a neighbour node $N(v_i)^M$ of node $v_i$ and a neighbour node $N(v_j)^M$ of node $v_j$, where the neighbour set $N(v_i)^M$ of node $v_i$ contains the nodes connected to node $v_i$ via meta-path $M$.
The adjacency matrix describes only the first-order adjacencies present in the network,
and the meta-paths alone cannot characterize the network structure over time, so the
residual information on the edges in the network is fused with the meta-paths. The meta-
path augmented residual information matrix is proposed to portray the temporal structural
information in the network, as given in Equation (1).
In temporal network link prediction, the network structure continuously evolves over
time, and information about the historical structural changes in the network is used to
predict the edges in the future network. Temporal network link prediction is typically
defined as follows:
Definition 4. Temporal network link prediction. Temporal network link prediction means predicting the set of edges in the network at time $T+1$, i.e., $E^{T+1} = \{e^t(u, v) \mid u, v \in V^t, t = T+1\}$, based on the network over the given time slices $[1, T]$. Given times $t, s \in [1, T]$ with $t < s$, if the nodes and edges in $G^t$ do not disappear during the period $[t, s]$, then the nodes and edges in $G^t$ are all represented in $G^s$, and the information on $G^t$ is said to be the history of $G^s$, as shown in Figure 2. Since temporal network link prediction focuses on how the network structure evolves over time and ignores the heterogeneity of node and edge types, this paper only gives this more general definition; for temporal heterogeneous network link prediction, it suffices to require that the number of node or edge types in this definition is greater than one.
[Figure 2: timeline with time slices $1, 2, \ldots, t, \ldots, s, \ldots, T, T+1$.]
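The following short sketch (with hypothetical variable names, not the paper's code) illustrates Definition 4: snapshots $1, \ldots, T$ serve as the history, and the edge set of snapshot $T+1$ is the prediction target.

```python
# Illustrative split for Definition 4: snapshots 1..T are the history, and the
# edge set of snapshot T+1 is the prediction target. Names are ours.
def history_and_target(snapshots, T):
    """snapshots: dict mapping time slice t -> set of (u, v) edges."""
    history = {t: snapshots[t] for t in range(1, T + 1) if t in snapshots}
    target_edges = snapshots.get(T + 1, set())   # E^{T+1}, the edges to predict
    return history, target_edges
```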
$t_1 = t_{z-m}, \ldots, t_{z-1}$, $t_2 = t_z, t_{z+1}, \ldots, t_n$. The weight of edge $e_{ij}$ in the aggregated network is the residual information $nASF_{e_{ij}}$, which portrays the lifecycle of the information: the residual information decreases from the moment edge $e_{ij}$ is generated to the current moment. This decrease is not linear; the residual information first decreases gently, then decreases sharply, and finally levels off. This is formalized in Equation (2):

$$nASF_{e_{ij}}\left(a_{ij}, t_{ij}\right) = t_{ij} \cdot \frac{\dfrac{1}{1+\exp\left(a_{ij}/p - a\right)} + q}{q + 1} \qquad (2)$$
where $a_{ij}$ denotes the time span from the time slice $t_z$ in which the edge was generated to the current slice $t_n$, i.e., $a_{ij} = n - z + 1$. In particular, an edge $e_{ij}$ generated at time $t_s$ has $a_{ij} = n - s + 1$; if the edge reappears at time $t_z$ ($s < z$), $a_{ij}$ is computed from the latest appearance, and $t_{ij}$ is the number of times edge $e_{ij}$ occurs. This parameter is set to distinguish between the two cases shown in Figure 1B. When author A and author B cooperate on five consecutive time slices $[t_1, t_5]$ and author A and author C first cooperate at time slice $t_5$, the value of $a_{ij}$ between author A and author C and between author A and author B is 1 in both cases. Therefore, if only the parameter $a_{ij}$ is considered, the two cases shown in Figure 1B cannot be distinguished, whereas the parameter $t_{ij}$ can distinguish between them.
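The lifecycle function of Equation (2) can be transcribed directly; in the sketch below the hyperparameters $p$, $q$ and $a$ are placeholders, since their concrete values are fixed later by the parameter sensitivity analysis.

```python
import math

# Direct transcription of Equation (2). The hyperparameter defaults below are
# placeholders; the paper fixes p, q and a via parameter sensitivity analysis.
def residual_information(a_ij: int, t_ij: int, p: float = 1.0,
                         q: float = 0.1, a: float = 3.0) -> float:
    """nASF for edge e_ij: t_ij scaled by a shifted sigmoid of the age a_ij."""
    decay = 1.0 / (1.0 + math.exp(a_ij / p - a))   # gentle -> sharp -> gentle decrease
    return t_ij * (decay + q) / (q + 1.0)

# Figure 1B example: both author pairs have a_ij = 1 at t_5, but t_ij differs,
# so their residual information differs as well.
print(residual_information(a_ij=1, t_ij=5) > residual_information(a_ij=1, t_ij=1))  # True
```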
After obtaining the aggregated network G, the meta-path instances are extracted
according to the determined meta-paths. Taking the academic collaboration network in
Figure 1A as an example, the instances of meta-path APA and meta-path APCPA are
extracted. The meta-path APA indicates that the two authors coauthored a paper, such as $a_1 p_1 a_2$ in Figure 1A, and the meta-path APCPA indicates that papers authored by the two authors were published in the same conference, such as $a_1 p_1 c_2 p_2 a_3$ in Figure 1A. The meta-path APA and meta-path APCPA augmented residual information matrices $W^{APA}$ and $W^{APCPA}$ are calculated by Equations (1) and (2). Inputting $W^{APA}$ and $W^{APCPA}$ into the modified self-attention model, which calculates the correlation between any two nodes in the graph, yields the relationship-enhanced meta-path $M$ augmented residual information matrices $\widehat{W}^{APA}$ and $\widehat{W}^{APCPA}$. These are input into the modified GCN model. Guided by the modified self-attention model, the modified GCN can change the degree of aggregation between different nodes to obtain a meta-path augmented residual information matrix $H^M$ after aggregating the local information based on the meta-path.
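As an illustration of how a meta-path augmented residual information matrix might be assembled, the sketch below builds $W^{APA}$ by accumulating residual information over shared papers. Equation (1) is not reproduced in this excerpt, so the accumulation rule used here (the product of the two edge weights, summed over APA instances) is an assumption and should be replaced by the paper's Equation (1) if it differs.

```python
import numpy as np

# Hypothetical construction of W^{APA}: each APA instance a_i - p - a_j
# contributes the product of the residual information on its two author-paper
# edges, summed over shared papers. Replace with Equation (1) if it differs.
def metapath_apa_matrix(author_paper_nasf: np.ndarray) -> np.ndarray:
    """author_paper_nasf: |A| x |P| matrix of nASF weights on author-paper edges."""
    W_apa = author_paper_nasf @ author_paper_nasf.T   # aggregate over shared papers
    np.fill_diagonal(W_apa, 0.0)                      # a node forms no APA instance with itself
    return W_apa
```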
Next, the information in $H^{APA}$ and $H^{APCPA}$ is aggregated through the attention mechanism. The adaptive weights $w_{M_1}$ and $w_{M_2}$ are randomly initialized, and the augmented residual information matrix $H$ is calculated according to Equation (3) to obtain the embedding representation of each node. This constitutes the encoder of LP-THN. The decoder inputs the augmented residual information matrix $H$ into the modified GAT model to obtain the matrix $\widehat{H}$, which fuses semantic and temporal information; the GAT model accounts for the varying influence of different neighbourhood nodes on the core node. Finally, a multilayer perceptron is used for link prediction.
$$H = w_{M_1} H^{M_1} + w_{M_2} H^{M_2} + \cdots + w_{M_n} H^{M_n} \qquad (3)$$
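A minimal PyTorch sketch of Equation (3), with the adaptive weights $w_{M_i}$ held as randomly initialized learnable parameters; the class name is ours, not the paper's.

```python
import torch
import torch.nn as nn

# Sketch of Equation (3): fuse the per-meta-path matrices H^{M_1}, ..., H^{M_n}
# with adaptive weights w_{M_i} that are randomly initialized and learned
# end-to-end together with the rest of the model.
class MetaPathFusion(nn.Module):
    def __init__(self, num_metapaths: int):
        super().__init__()
        self.w = nn.Parameter(torch.rand(num_metapaths))  # w_{M_1}, ..., w_{M_n}

    def forward(self, H_list):
        # H_list: list of tensors of identical shape, one per meta-path
        return sum(w_i * H_i for w_i, H_i in zip(self.w, H_list))
```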
Figure 3. Diagram of the proposed LP-THN model. This model contains network aggregation,
meta-path selection, meta-path augmented information matrix, encoder and decoder steps. In the
first step, the temporal heterogeneous network is aggregated into a single-layer aggregated network,
which sets the age and times as the weights of edges. In the second step, meta-paths are selected
according to the semantics within the temporal dataset. In the third step, the meta-path augmented
information matrix W M is calculated, preserving structure evolution and semantics. In the fourth
step, an encoder is used to obtain the low-dimensional embedding representation of nodes. Finally, a
decoder is applied to perform link prediction.
4.2. Encoder
4.2.1. Modified Self-Attention Mechanism
Self-attention mechanisms are widely used in the field of natural language processing [34]. We exploit their ability to capture the internal correlation of data and apply them to graphs, converting the influence of other nodes in the graph on the core node into the degree of attention that the core node pays to other nodes. The self-attention mechanism based on
meta-paths and temporal structural features is implemented in three main steps. In the
first step, the query, key and value of each node are calculated based on the meta-path M,
as shown in Equations (4)–(6); in the second step, the magnitude of the relevance of each
node to the current node is calculated and normalized, and the result is used as the weight
of each node to the core node, as shown in Equation (7); and in the third step, the value
of each node is weighted to the core node by summation to obtain the representation of
each node considering the global node to core node importance information, as shown in
Equation (7).
$$K^M = W_K^M \cdot W^M \qquad (4)$$
$$Q^M = W_Q^M \cdot W^M \qquad (5)$$
$$V^M = W_V^M \cdot W^M \qquad (6)$$
where the meta-path $M$ augmented information matrix $W^M \in \mathbb{R}^{n \times n}$ and $n = |V_{R_i}|$, with $V_{R_i}$ denoting the set of nodes in the aggregated network $G$ with node type $R_i$. $W_K^M \in \mathbb{R}^{d \times n}$, $W_Q^M \in \mathbb{R}^{d \times n}$ and $W_V^M \in \mathbb{R}^{d \times n}$ are generated by random initialization, and $d$ is the node embedding dimension. To improve computational speed, matrix operations are used. $Q^M \in \mathbb{R}^{d \times n}$, $K^M \in \mathbb{R}^{d \times n}$ and $V^M \in \mathbb{R}^{d \times n}$ denote the query, key and value matrices of graph $G$ based on meta-path $M$.
$$\widehat{W}^M = \mathrm{softmax}\!\left(\frac{Q^M \cdot \left(K^M\right)^{T}}{\sqrt{d}}\right) V^M \qquad (7)$$
where $Q^M \cdot \left(K^M\right)^T = \left[\mathrm{score}_{ij}\right]$, $\mathrm{score}_{ij}$ denotes node $n_i$'s attention to node $n_j$, $d$ is the dimension of the node embedding representation, and the division by $\sqrt{d}$ stabilizes the gradient. To calculate the degree of influence of each node on the core node, the $\mathrm{softmax}(\cdot)$ activation function is used for normalization, and the normalized result is used as the weight applied to each node's value with respect to the core node.
The meta-path M augmented information matrix in this paper is based on the meta-
path set to reconstruct the network structure, which accounts for the rich semantic informa-
tion in the heterogeneous network. Simultaneously, the elements in the matrix account for
the rich temporal information contained in each edge. Therefore, the matrix is used as an
input to the self-attention mechanism that considers the amount of information. In contrast
to the original self-attention mechanism, the modified self-attention model not only enables
the learned results to consider the magnitude of the relevance of different nodes to that
node but also accounts for the temporal attributes of the dynamic heterogeneous graph.
Since the relationships between nodes are generated based on defined meta-paths, this
self-attention mechanism can enhance the main structural features in the heterogeneous
graph and weaken the unimportant structural features, ensuring that the learned node
embedding representations contain specific semantic information.
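To make Equations (4)–(7) concrete, here is a minimal PyTorch sketch of the modified self-attention step; the projection matrices are randomly initialized as stated above, shapes are arranged so the products line up (the paper stores the projections as $d \times n$ matrices), and all function and variable names are ours rather than the authors' implementation.

```python
import torch

# Sketch of the modified self-attention (Equations (4)-(7)). W_M is the n x n
# meta-path M augmented residual information matrix; the output plays the role
# of the relationship-enhanced matrix passed on to the modified GCN.
def modified_self_attention(W_M: torch.Tensor, d: int) -> torch.Tensor:
    n = W_M.shape[0]
    W_K, W_Q, W_V = (torch.randn(d, n) for _ in range(3))  # random initialization
    K = W_K @ W_M                       # Equation (4)
    Q = W_Q @ W_M                       # Equation (5)
    V = W_V @ W_M                       # Equation (6)
    scores = Q.T @ K / d ** 0.5         # score_ij: attention of node i to node j
    return torch.softmax(scores, dim=-1) @ V.T   # Equation (7), n x d output
```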
adjacency matrix. If node $n_i$ and node $n_j$ are linked based on meta-path $M$, then $a_{ij} = 1$; otherwise, $a_{ij} = 0$. Because a node does not form a meta-path $M$ instance with itself, $a_{ii} = 0$, and the node's own information would not be aggregated when its features are updated. To avoid this, the identity matrix $I \in \mathbb{R}^{n \times n}$ is added so that $\hat{a}_{ii} = 1$; the feature update of a node can therefore also consider its own features. $D = [d_{ii}]$, where $d_{ii}$ is the degree of node $n_i$; the feature vectors of the neighbours of node $n_i$ are combined as a degree-weighted average, under the assumption that low-degree neighbours have a greater impact. $\sigma(\cdot)$ represents an arbitrary activation function; in this paper, ReLU is selected. $l$ represents the number of layers of the modified GCN. If only one layer of the modified GCN is used, each node feature update considers only the features of its immediate neighbours; if too many layers are stacked, the receptive field of each node feature update becomes too large, which degrades performance. The output of the last layer $l_N$ is $X^{(l_N)} = H^M$, where $H^M$ is the meta-path augmented information matrix after aggregating the local information based on the meta-path, and $W^{(l-1)}$ is a randomly initialized weight matrix.
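A hedged sketch of one propagation layer of the modified GCN described above, assuming the standard symmetric degree normalization of Kipf and Welling [18]; if the paper uses a different normalization it should be substituted here, and in LP-THN the relationship-enhanced matrix $\widehat{W}^M$ would take the place of the plain meta-path adjacency as the aggregation guide. All names are illustrative.

```python
import numpy as np

# One propagation layer of the modified GCN described above: self-loops are
# added so that a_hat_ii = 1, features are degree-normalized, and ReLU is the
# activation. Symmetric normalization (Kipf and Welling [18]) is assumed here.
def modified_gcn_layer(A_M: np.ndarray, X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """A_M: n x n meta-path adjacency, X: n x d node features, W: d x d' weights."""
    A_hat = A_M + np.eye(A_M.shape[0])            # add self-loops
    d_hat = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt      # degree-weighted averaging
    return np.maximum(A_norm @ X @ W, 0.0)        # ReLU activation

# Stacking l_N such layers yields X^{(l_N)} = H^M.
```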
The structural information contained in the relationship-enhanced meta-path $M$ augmented residual information matrix $\widehat{W}^M$ accounts for the neighbourhood information generated based on meta-path $M$, the relevance of each node to the core node, and the decreasing amount of information carried by edges over time. Because of this, it is used as the input to the graph convolutional neural network based on meta-paths and temporal attributes, which guides the nodes to rely on the meta-path-based neighbourhood information when they aggregate neighbourhood information and accounts for the influence of each node on the core node. Meanwhile, the GCN based on meta-paths and temporal attributes accounts for the change in the information carried by each edge over time when updating the node features, so that the node features include temporal attributes.
4.3. Decoder
In this paper, the link prediction problem is considered a binary classification problem.
The information in the first $n$ snapshots is used to predict whether there is an edge between two nodes in the $(n+1)$-th snapshot. Two possible labels are given to the edges. If there is a link between two nodes $n_i$, $n_j$, the pair is considered a positive sample and its label is 1; otherwise, the label is 0. The edges existing in the $(n+1)$-th time slice are taken as positive samples, and random sampling is used to obtain the same number of negative samples, where the negative samples are edges that do not exist in the $(n+1)$-th snapshot. The multilayer perceptron $\mathrm{MLP}(\cdot)$ is used to predict the existence of edges between two nodes in the future network, as shown in Equation (12).
$$\hat{y} = \mathrm{MLP}\left(\widehat{H}\right) \qquad (12)$$
The information loss during model iteration is calculated using the binary cross-
entropy loss function, as shown in Equation (13).
$$\mathrm{loss} = -\frac{1}{n}\sum \left[ y \ln \hat{y} + (1-y)\ln(1-\hat{y}) \right] \qquad (13)$$
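The decoder can be sketched as an MLP scorer trained with binary cross-entropy (Equations (12) and (13)); the pairing of node embeddings by concatenation below is a common choice and an assumption on our part, as are all names in the sketch.

```python
import torch
import torch.nn as nn

# Sketch of the decoder: the GAT-refined embedding matrix H_hat is scored by
# an MLP (Equation (12)); positive and sampled negative edges are trained with
# binary cross-entropy (Equation (13)). Concatenating the two node embeddings
# is our assumed pairing scheme, not necessarily the paper's.
class LinkDecoder(nn.Module):
    def __init__(self, d: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, H_hat, pairs):
        # pairs: LongTensor of shape (m, 2) holding candidate (n_i, n_j) indices
        z = torch.cat([H_hat[pairs[:, 0]], H_hat[pairs[:, 1]]], dim=1)
        return torch.sigmoid(self.mlp(z)).squeeze(-1)    # y_hat in Equation (12)

loss_fn = nn.BCELoss()   # binary cross-entropy, Equation (13)
```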
5. Experiments
In this section, we describe comprehensive experiments that illustrate the superiority of the LP-THN model over other models in terms of prediction results. First, LP-THN is compared with other low-dimensional embedding representation methods. Second, the optimal values of the parameters of LP-THN are determined through a parameter sensitivity analysis.
embedding representation based on a deep neural network (i.e., Deeplink). Except for metapath2vec, the other six methods are applied to static homogeneous networks, while metapath2vec is applied to static heterogeneous networks. XGBoost is used as the classifier for link prediction with the baselines; a minimal sketch of this baseline pipeline follows the list below.
• DeepWalk [12]: node sequences are obtained by a random walk, and the node se-
quences are then input into Word2vec to obtain low-dimensional embedded represen-
tations of the nodes.
• Node2Vec [13]: node sequences are obtained by biased random walk, and the node
sequences are used as input to Word2vec to obtain low-dimensional embedded repre-
sentations of the nodes.
• LINE [15]: node embeddings are learned by maximizing the similarity between a node
and its first-order and second-order neighbours.
• M-NMF [14]: based on nonnegative matrix partitioning, which captures the commu-
nity structure in the graph as well as the similarity between nodes.
• Deeplink [38]: a deep learning method for node embedding representation that uses a
deep convolutional neural network to learn node embedding representation.
• Struc2Vec [39]: the context sequence of a node is constructed by traversing the depth-
first search path of each node, and the sequence is then fed into Word2vec to obtain
the node’s low-dimensional embedding representation.
• Metapath2vec [21]: node sequences are obtained by a random meta-path-based walk,
and the sequences are fed into Word2vec to obtain a representation of node embed-
dings in heterogeneous networks.
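As referenced above, the baseline pipeline can be sketched as follows: baseline node embeddings are combined per candidate edge (here with the Hadamard product, one common edge operator; the paper does not state which operator it uses) and classified with XGBoost. Parameter values are placeholders.

```python
import numpy as np
from xgboost import XGBClassifier

# Baseline pipeline sketch: per-edge features are formed from the two node
# embeddings (Hadamard product here, an assumed choice) and fed to XGBoost.
def baseline_link_classifier(embeddings, train_pairs, train_labels):
    """embeddings: dict node -> vector; train_pairs: list of (u, v); labels: 0/1."""
    X = np.array([embeddings[u] * embeddings[v] for u, v in train_pairs])
    clf = XGBClassifier(n_estimators=200, max_depth=6, eval_metric="logloss")
    clf.fit(X, np.array(train_labels))
    return clf
```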
In this paper, three common evaluation metrics, AUC, F1 and ACC, are chosen to
evaluate the performance of these methods. The AUC can be interpreted as the probability
that the similarity value of an existing randomly selected edge is larger than the similarity
value of a non-existing randomly selected edge. F1 can be regarded as the harmonic mean of the model's precision and recall. The ACC is the ratio of the number of correctly predicted
samples to the total number of predicted samples. The larger the values of the AUC, F1
and ACC are, the better the model performs.
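These three metrics can be computed with scikit-learn as in the sketch below; y_true holds the 0/1 edge labels, y_score the predicted link probabilities, and the 0.5 decision threshold is an assumption for turning probabilities into hard predictions.

```python
from sklearn.metrics import roc_auc_score, f1_score, accuracy_score

# AUC, F1 and ACC as used in the experiments.
def evaluate(y_true, y_score, threshold=0.5):
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    return {"AUC": roc_auc_score(y_true, y_score),
            "F1": f1_score(y_true, y_pred),
            "ACC": accuracy_score(y_true, y_pred)}
```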
Metapath2vec accounts for the semantic information contained in the network. However, it
does not perform as well as the LP-THN model because it does not account for the temporal
information contained in the network. (c) The results of LP-THN-1st are better than the results of LP-THN-2nd because LP-THN-1st considers only the first-order similarity and LP-THN-2nd considers only the second-order similarity. The node relationships extracted from first-order-similarity meta-paths are closer than those extracted from second-order-similarity meta-paths and are therefore more informative for link prediction.
Datasets Metric DeepWalk Node2Vec LINE M-NMF Deeplink Struc2Vec Metapath2vec LP-THN-1st LP-THN-2nd LP-THN
AUC 0.9380 0.9625 0.9482 0.9415 0.9398 0.9332 0.9623 0.9912 0.9903 0.9934
AMiner F1 0.9403 0.9632 0.9493 0.9418 0.942 0.9358 0.9631 0.9627 0.9355 0.9686
ACC 0.9380 0.9625 0.9482 0.9415 0.9398 0.9332 0.9624 0.9728 0.9386 0.969
AUC 0.8539 0.9675 0.969 0.9223 0.9559 0.9594 0.9667 0.9844 0.9778 0.9848
Last.FM F1 0.8819 0.9707 0.9717 0.9318 0.9608 0.9638 0.9701 0.9768 0.9721 0.9768
ACC 0.8604 0.9687 0.9699 0.9249 0.9576 0.9610 0.9680 0.9752 0.9726 0.9761
AUC 0.8832 0.8576 0.9744 0.8967 0.9068 0.9345 0.9872 0.9913 0.9935 0.9945
MovieLens F1 0.8861 0.8642 0.9737 0.9000 0.9014 0.9333 0.9867 0.9927 0.9921 0.9981
ACC 0.8816 0.8553 0.9737 0.8947 0.9079 0.9342 0.9868 0.9933 0.9872 0.9860
The bold represents the best AUC/F1/ACC score within a network.
[Figure: AUC of LP-THN on AMiner, Last.FM and MovieLens; y-axis AUC ranging from approximately 0.6 to 1.0.]
of instances of selected meta-paths. The time complexity of the modified GAT module is $O(|V^M| d^2) + O(|E^M| d)$. The summarized time complexity of LP-THN is $O(|V^M| d^2) + O(|E^M|) + O(|V^M| d^2) + O(|E^M| d)$.
6. Conclusions
In this paper, we study the link prediction problem on temporal heterogeneous net-
works and propose an encoder–decoder framework, LP-THN. The semantic information
contained in meta-paths in temporal networks is captured by a proposed meta-path aug-
mented residual information matrix that portrays the dynamic process of network structure
changes over time. Considering the lifecycle of the information carried by meta-paths
enables LP-THN to account for both the rich semantic information in heterogeneous networks and the evolution of the network structure over time. Experiments demonstrate that the LP-THN model proposed in this paper outperforms the existing baseline frameworks. In future work, we plan to extend our model to large-scale networks, to aggregate node attributes into the embedding process, and to process data in temporal networks in real time.
Author Contributions: Conceptualization, J.C.; methodology, J.C.; software, J.C.; validation, J.C.;
formal analysis, J.C.; investigation, J.C.; resources, J.C.; data curation, J.C.; writing—original draft
preparation, J.C.; writing—review and editing, J.C.; visualization, J.C.; supervision, J.L.; project
administration, J.J.; funding acquisition, J.L. and J.J. All authors have read and agreed to the published
version of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (NNSFC)
under Grant 72001209, 72231011 and 72071206, and the Science Foundation for Outstanding Youth
Scholars of Hunan Province under Grant 2022JJ20047.
Institutional Review Board Statement: This article does not involve ethical research and does not
require ethical approval.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: The common dataset used in this study can be obtained from links https:
//www.aminer.cn/, http://www.lastfm.com and https://movielens.org/ (accessed on 14 June 2023).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Stanfield, Z.; Coşkun, M.; Koyutürk, M. Drug response prediction as a link prediction problem. Sci. Rep. 2017, 7, 40321. [CrossRef]
[PubMed]
2. Nasiri, E.; Berahmand, K.; Rostami, M.; Dabiri, M. A novel link prediction algorithm for protein-protein interaction networks by
attributed graph embedding. Comput. Biol. Med. 2021, 137, 104772. [CrossRef] [PubMed]
3. Liben-Nowell, D.; Kleinberg, J. The link prediction problem for social networks. In Proceedings of the Twelfth International
Conference on Information and Knowledge Management, New Orleans, LA, USA, 9–11 November 2003; pp. 556–559.
4. Cho, H.; Yu, Y. Link prediction for interdisciplinary collaboration via co-authorship network. Soc. Netw. Anal. Min. 2018, 8, 1–12.
[CrossRef]
5. Liu, G. An ecommerce recommendation algorithm based on link prediction. Alex. Eng. J. 2022, 61, 905–910. [CrossRef]
6. Adamic, L.A.; Adar, E. Friends and neighbors on the web. Soc. Net. 2003, 25, 211–230. [CrossRef]
7. Zhou, T.; Lü, L.; Zhang, Y.C. Predicting missing links via local information. Eur. Phys. J. B 2009, 71, 623–630. [CrossRef]
8. Katz, L. A new status index derived from sociometric analysis. Psychometrika 1953, 18, 39–49. [CrossRef]
9. Leicht, E.A.; Holme, P.; Newman, M.E. Vertex similarity in networks. Phys. Rev. E 2006, 73, 026120. [CrossRef]
10. Fouss, F.; Pirotte, A.; Renders, J.M.; Saerens, M. Random-walk computation of similarities between nodes of a graph with
application to collaborative recommendation. IEEE Trans. Knowl. Data Eng. 2007, 19, 355–369. [CrossRef]
11. Jeh, G.; Widom, J. Simrank: A measure of structural-context similarity. In Proceedings of the Eighth ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, 23–26 July 2002; pp. 538–543.
12. Perozzi, B.; Al-Rfou, R.; Skiena, S. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
13. Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864.
14. Wang, X.; Cui, P.; Wang, J.; Pei, J.; Zhu, W.; Yang, S. Community preserving network embedding. In Proceedings of the AAAI
Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
15. Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q. Line: Large-scale information network embedding. In Proceedings of the
24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077.
16. Scarselli, F.; Gori, M.; Tsoi, A.C. The graph neural network model. IEEE Trans. Neural Net. 2008, 20, 61–80. [CrossRef]
17. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. NIPS 2017, 30, 1024–1034.
18. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2017, arXiv:1609.02907.
19. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
20. Shi, C.; Wang, R.; Wang, X. Survey on Heterogeneous Information Networks Analysis and Applications. J. Softw. 2022, 33, 598–621.
21. Dong, Y.; Chawla, N.V.; Swami, A. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of
the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August
2017; pp. 135–144.
22. Fu, T.Y.; Lee, W.C.; Lei, Z. Hin2vec: Explore meta-paths in heterogeneous information networks for representation learning. In
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017;
pp. 1797–1806.
23. Chang, S.; Han, W.; Tang, J.; Qi, G.J.; Aggarwal, C.C.; Huang, T.S. Heterogeneous network embedding via deep architectures. In
Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia,
10–13 August 2015; pp. 119–128.
24. Zhang, C.; Song, D.; Huang, C.; Swami, A.; Chawla, N.V. Heterogeneous graph neural network. In Proceedings of the 25th ACM
SIGKDD International Conference on Knowledge Discovery Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 793–803.
25. Shi, C.; Hu, B.; Zhao, W.X.; Philip, S.Y. Heterogeneous information network embedding for recommendation. IEEE Trans. Knowl.
Data Eng. 2018, 31, 357–370. [CrossRef]
26. Zhang, R.; Wang, Q.; Yang, Q.; Wei, W. Temporal link prediction via adjusted sigmoid function and 2-simplex structure. Sci. Rep.
2022, 12, 16585. [CrossRef]
27. Zou, L.; Zhan, X.X.; Sun, J.; Hanjalic, A.; Wang, H. Temporal network prediction and interpretation. IEEE Trans. Netw. Sci. Eng.
2021, 9, 1215–1224. [CrossRef]
28. Zhou, L.; Yang, Y.; Ren, X.; Wu, F.; Zhuang, Y. Dynamic network embedding by modeling triadic closure process. In Proceedings
of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
29. Nguyen, G.H.; Lee, J.B.; Rossi, R.A.; Ahmed, N.K.; Koh, E.; Kim, S. Continuous-time dynamic network embeddings. In
Companion Proceedings of the Web Conference 2018, Lyon, France, 23–27 April 2018; pp. 969–976.
30. Rahman, M.; Saha, T.K.; Hasan, M.A.; Xu, K.S.; Reddy, C.K. Dylink2vec: Effective feature representation for link prediction in
dynamic networks. arXiv 2018, arXiv:1804.05755.
31. Abbas, K.; Abbasi, A.; Dong, S.; Niu, L.; Chen, L.; Chen, B. A Novel Temporal Network-Embedding Algorithm for Link Prediction
in Dynamic Networks. Entropy 2023, 25, 257. [CrossRef]
32. Yin, Y.; Ji, L.X.; Zhang, J.P.; Pei, Y.L. DHNE: Network representation learning method for dynamic heterogeneous networks. IEEE
Access 2019, 7, 134782–134792. [CrossRef]
33. Wang, X.; Lu, Y.; Shi, C.; Wang, R.; Cui, P.; Mou, S. Dynamic heterogeneous information network embedding with meta-path
based proximity. IEEE Trans. Knowl. Data Eng. 2020, 34, 1117–1132. [CrossRef]
34. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need.
NIPS 2017, 30, 5998–6008.
35. Tang, J.; Zhang, J.; Yao, L.; Li, J.; Zhang, L.; Su, Z. Arnetminer: Extraction and mining of academic social networks. In Proceedings
of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27
August 2008; pp. 990–998.
36. Cantador, I.; Brusilovsky, P.; Kuflik, T. Second workshop on information heterogeneity and fusion in recommender systems
(HetRec2011). In Proceedings of the Fifth ACM Conference on Recommender Systems, Chicago, IL, USA, 23–27 October 2011;
pp. 387–388.
37. Harper, F.M.; Konstan, J.A. The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. 2015, 5, 1–19. [CrossRef]
38. Zhou, F.; Liu, L.; Zhang, K.; Trajcevski, G.; Wu, J.; Zhong, T. Deeplink: A deep learning approach for user identity linkage. In
Proceedings of the IEEE INFOCOM 2018-IEEE Conference on Computer Communications, Honolulu, HI, USA, 15–19 April 2018;
pp. 1313–1321.
39. Ribeiro, L.F.; Saverese, P.H.; Figueiredo, D.R. struc2vec: Learning node representations from structural identity. In Proceedings of
the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August
2017; pp. 385–394.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.