
Neural Networks 174 (2024) 106219


Full Length Article

An Inductive Reasoning Model based on Interpretable Logical Rules over Temporal Knowledge Graph
Xin Mei a,1, Libin Yang a,1, Zuowei Jiang a, Xiaoyan Cai a,∗, Dehong Gao a,∗, Junwei Han a, Shirui Pan b

a Northwestern Polytechnical University, China
b Griffith University, Australia

ARTICLE INFO

Keywords:
Temporal knowledge graph
Temporal logical rules
Inductive reasoning
Zero-shot reasoning

ABSTRACT

Extrapolating future events based on historical information in temporal knowledge graphs (TKGs) holds significant research value and has practical applications. In this field, current methods can be classified as either embedding-based or logical rule-based. Embedding-based methods depend on learned entity and relation embeddings for prediction, but they suffer from a lack of interpretability due to their opaque reasoning process. Logical rule-based methods, on the other hand, face scalability challenges as they heavily rely on predefined logical rules. To overcome these limitations, we propose a hybrid model that combines embedding-based and logical rule-based methods to capture deep causal logic. Our model, called the Inductive Reasoning Model based on Interpretable Logical Rules (ILR-IR), aims to provide interpretable insights while effectively predicting future events in TKGs. ILR-IR delves into historical information, extracting valuable insights from logical rules embedded within relations and from interaction preferences between entities. By considering both logical rules and interaction preferences, ILR-IR offers a comprehensive perspective for predicting future events. In addition, we propose incorporating a one-class augmented matching loss during optimization, which enhances the performance of the model during training. We evaluate ILR-IR on multiple datasets, including ICEWS14, ICEWS05-15, and ICEWS18. Experimental results demonstrate that ILR-IR outperforms state-of-the-art baselines, showcasing its superior performance in TKG extrapolation reasoning. Moreover, ILR-IR demonstrates remarkable generalization capabilities, even when applied to related datasets that share a common relation vocabulary. This suggests that our proposed model exhibits robust zero-shot reasoning abilities. Our code is publicly available at https://github.com/mxadorable/ILR-IR.

1. Introduction

As structured representations of human knowledge, knowledge graphs (KGs) utilize triples (subject, relation, object) to depict events, with subjects and objects representing entities. Entities encompass various real-world objects and abstract concepts, while relations indicate the connections between entities. KGs, such as DBPedia (Lehmann et al., 2015) and ConceptNet (Speer, Chin, & Havasi, 2017), have garnered significant attention in both academic and industrial research (Nickel, Murphy, Tresp, & Gabrilovich, 2015; Wang, Mao, Wang, & Guo, 2017), and have found extensive utilization across numerous real-world applications such as entity linking (Francis-Landau, Durrett, & Klein, 2016; Hua, Zheng, & Zhou, 2015), relation extraction (Min, Grishman, Wan, Wang, & Gondek, 2013; Zeng, Liu, Chen, & Zhao, 2015) and question answering (Luo, Lin, Luo, & Zhu, 2018; Yih, Chang, He, & Gao, 2015). However, KGs are often incomplete. To address this issue, reasoning techniques have been developed to draw new conclusions from existing data and predict missing events. While traditional knowledge graphs primarily consist of static events, there exists a vast amount of event data that exhibits temporal correlations, indicating how entities interact and change over time. This has led to the emergence of temporal knowledge graphs (TKGs) (Boschee et al., 2015; Gottschalk & Demidova, 2018; Kejriwal et al., 2019), which integrate temporal attributes into the graph structure. TKGs build upon the concept of static triples by introducing timestamps, thereby

∗ Corresponding authors.
E-mail addresses: meixin@mail.nwpu.edu.cn (X. Mei), libiny@nwpu.edu.cn (L. Yang), jiangzw@mail.nwpu.edu.cn (Z. Jiang), xiaoyanc@nwpu.edu.cn
(X. Cai), dehong.gdh@nwpu.edu.cn (D. Gao), jhan@nwpu.edu.cn (J. Han), s.pan@griffith.edu.au (S. Pan).
1 These authors have contributed equally to this work.

https://doi.org/10.1016/j.neunet.2024.106219
Received 15 July 2023; Received in revised form 22 February 2024; Accepted 27 February 2024
Available online 29 February 2024
0893-6080/© 2024 Elsevier Ltd. All rights reserved.

transforming the structure into quadruples (subject, relation, object, timestamp). In this format, the timestamp denotes the valid time period of the corresponding static triple. When it comes to reasoning over TKGs, their complex temporal dynamics bring about greater complexity compared to static KGs.

Reasoning on a temporal knowledge graph (TKG) is typically conducted in two primary scenarios: interpolation (Gardner, 1993; Lunardi et al., 2009) and extrapolation (Brezinski, 1982; Brezinski & Redivo-Zaglia, 1991). For interpolation scenarios, given events that occurred within a specific time interval [𝑡0, 𝑡𝑇], the objective is to infer any missing events that took place during that interval. On the other hand, extrapolation reasoning involves predicting future missing events for timestamps beyond 𝑡𝑇. To accomplish this, extrapolation reasoning leverages observed historical KGs to learn hidden connections between events and make predictions for future timestamps. This capability finds practical applications in various scenarios such as disaster relief (Signorini, Segre, & Polgreen, 2011), financial analysis (Bollen, Mao, & Zeng, 2011), social media analysis and prediction (Muthiah et al., 2015; Phillips, Dowling, Shaffer, Hodas, & Volkova, 2017), and crisis warning (Korkmaz et al., 2015). This paper specifically concentrates on the task of extrapolation reasoning.

In recent years, there has been significant research focused on extrapolation reasoning over TKGs, resulting in impressive prediction performance. These methods typically fall into two categories: embedding-based methods and logical rule-based methods. Embedding-based methods such as TIE (Wu et al., 2021), RE-GAT (Li, Feng et al., 2023), and RPC (Liang et al., 2023) utilize neural networks to learn efficient embeddings of entities and relations. However, the black-box property of embeddings results in a lack of interpretability. To enhance credibility and utility, some researchers propose the use of logical rules for reasoning. Some methods (Bai, Chen, Zhu, Meng, 2023; Bai, Yu, Chai, Zhao and Chen, 2023) predefine types of logical rules, such as the symmetric rule and the transitivity rule. Other methods (Liu, Ma, Hildebrandt, Joblin, & Tresp, 2022; Omran, Wang, & Wang, 2019) acquire an initial set of rules through random walks, then filter out high-scoring rules based on statistical measures and apply the learned rules for predictions during testing. Existing approaches focus on selecting effective logical rules, but the number of rules that can be selected from limited training data is finite. These methods cannot adaptively discover new rules and heavily rely on the chosen rule evaluation measure.

To address the aforementioned challenges, we present an inductive reasoning model based on interpretable logical rules (ILR-IR) over temporal knowledge graphs. It deeply mines historical information, explores logical rules contained in relations as well as interaction preferences between entities, and predicts future events from the perspectives of both logical rules and interaction preferences. From the perspective of logical rules, we design a rule-based relation path reasoning module. ILR-IR is designed to effectively capture the underlying structure of TKGs and discover latent logical rules. Our model regards sequences of relations as logical rules, while entities serve as tools for extracting relation paths. The mining process begins by identifying historical subgraphs in the TKG that are related to specific entities. From these subgraphs, relation paths are extracted, and embeddings of relation paths, which encode historical semantics, are learned. Next, we match these relation paths with the event to generate rules and evaluate the confidence of these rules according to causal logic. Finally, the confidence of the rules is used to score the quadruple, providing a measure of its reliability. This module infers entirely based on logical rules, so it is not limited by entities and can be easily transferred to datasets with the same relation vocabulary. However, it cannot distinguish quadruples that share the same historical relation path, because it ignores entity information. To this end, we design an entity correlation reasoning module from the perspective of interaction preference, which fully considers the interaction characteristics of entities. This module predicts interaction preferences between entities based on their interaction relations in historical events, and the quadruple is scored according to the similarity between the interaction preference at the current moment and the relation in the query quadruple. To train our model, we incorporate two training tasks that encompass both a coarse-grained quadruple perspective and a fine-grained rule perspective. We optimize our proposed adaptive logical rule embedding model using a one-class augmented matching loss. Throughout inference, our model can dynamically extract features related to relation paths by utilizing historical information. It assesses the confidence of the extracted rules, calculates the interaction preference at the current moment, and makes predictions. Additionally, our trained model can be applied to other datasets that possess a shared relation vocabulary, thereby facilitating zero-shot reasoning.

In conclusion, this research paper presents four notable contributions:

(1) A novel inductive temporal knowledge graph reasoning model is proposed, efficiently capturing intricate structures within TKGs and extracting potential logical rules to enhance reasoning performance.

(2) Both the logical rules implied in events and the interaction preferences among entities are considered simultaneously for event extrapolation reasoning.

(3) The trained model exhibits adaptability to other datasets with a shared relation vocabulary, enabling zero-shot reasoning.

(4) Experiments conducted on three benchmark datasets demonstrate superior performance compared to state-of-the-art baselines.

The rest of this paper is organized as follows. Section 2 provides an extensive review of related literature in the field. Section 3 presents the preliminaries. Section 4 introduces our proposed ILR-IR model. Section 5 conducts experiments and evaluates the results. Conclusion and discussion are presented in Section 6.

2. Related work

2.1. Static Knowledge Graph (KG) reasoning

The primary focus of most static knowledge graph reasoning models is the representation learning of entities and relations. These models construct scoring functions based on learned embeddings to assess the confidence of triples. Scoring functions can be categorized into distance-based and semantic matching-based methods. Distance-based methods, such as TransE (Bordes, Usunier, Garcia-Duran, Weston, & Yakhnenko, 2013), TransH (Wang, Zhang, Feng, & Chen, 2014), and TransR (Lin, Liu, Sun, Liu, & Zhu, 2015), utilize principles of translation to compute the score of triples, while semantic matching-based methods, exemplified by DistMult (Yang, Yih, He, Gao, & Deng, 2015), measure the semantic similarity between entity–relation pairs to determine the score. Furthermore, embedding models play a crucial role in capturing the interactions between entities and relations. For instance, ComplEx (Trouillon, Welbl, Riedel, Gaussier, & Bouchard, 2016) and DistMult (Yang et al., 2015) utilize linear operations to encode these interactions. Additionally, RotatE (Sun, Deng, Nie, & Tang, 2019) adopts a unique approach by treating relations as rotations in a complex space, effectively mapping the subject entity to the object entity. Moreover, the recent advent of neural network-based models has brought significant advancements in KG embeddings. RSN (Guo, Sun, & Hu, 2019), for example, introduces a cycle skipping mechanism based on recurrent neural networks (RNNs) to enhance semantic representation, distinguishing entities and relations in a novel way. Similarly, ConvE (Dettmers, Minervini, Stenetorp, & Riedel, 2018) and ConvKB (Nguyen, Nguyen, Nguyen, & Phung, 2018) leverage convolutional neural networks (CNNs) to model interaction features. In a different vein, models like R-GCN (Schlichtkrull et al., 2018), A2N (Bansal, Juan, Ravi, & McCallum, 2019), and RGHAT (Zhang et al., 2020) employ graph neural networks, aiming to capture structural features in KGs.
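The two scoring-function families discussed in Section 2.1 can be illustrated with a minimal numpy sketch; the embeddings below are random toys, not the implementations from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension
s, r, o = rng.normal(size=d), rng.normal(size=d), rng.normal(size=d)

def transe_score(s, r, o):
    """Distance-based scoring: plausible triples satisfy s + r ≈ o, so a
    smaller translation distance yields a higher (less negative) score."""
    return float(-np.linalg.norm(s + r - o))

def distmult_score(s, r, o):
    """Semantic matching-based scoring: a bilinear product with a diagonal
    relation matrix, i.e. sum_i s_i * r_i * o_i."""
    return float(np.sum(s * r * o))

print(transe_score(s, r, o), distmult_score(s, r, o))
```

Both functions map a triple to a real-valued confidence, which is the common interface the reviewed embedding models share.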


Fig. 1. Rule extraction based on relation paths.
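The rule extraction of Fig. 1 can be sketched as a path enumeration over a toy history; the quadruples, entity names, and relation names below are invented for illustration and only mirror the figure's spirit:

```python
from collections import defaultdict

# Toy history before timestamp t: events are (subject, relation, object,
# timestamp) quadruples, forming the historical subgraph of Fig. 1(a).
history = [
    ("e1", "r1", "e2", 0), ("e2", "r2", "e5", 1),
    ("e1", "r3", "e3", 1), ("e3", "r4", "e5", 2),
    ("e1", "r5", "e5", 2),
]

adj = defaultdict(list)
for s, r, o, ts in history:
    adj[s].append((r, o, ts))

def relation_paths(src, dst, max_len=3):
    """Enumerate relation sequences along paths from src to dst; each
    sequence is a candidate rule body (the "historical cause") for a
    query relation holding between src and dst at the current time."""
    out, stack = [], [(src, [])]
    while stack:
        node, path = stack.pop()
        if node == dst and path:
            out.append(tuple(path))
            continue  # stop extending once the object entity is reached
        if len(path) < max_len:
            for rel, nxt, ts in adj[node]:
                stack.append((nxt, path + [(rel, ts)]))
    return out

# Each extracted path p yields a candidate rule R_t(p, r8) for a query
# event (e1, r8, e5, t), as in Fig. 1(b)-(c).
for p in relation_paths("e1", "e5"):
    print(p)
```

Each printed tuple pairs the relations on a path with the timestamps of the underlying events, which is exactly the information the model later feeds into the temporal encoding and path embedding.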

However, embedding-based methods often face challenges in interpretability and scalability in large-scale reasoning tasks. To address these limitations, researchers have pivoted towards logical rule-based approaches for knowledge reasoning. These methods, including AMIE (Galárraga, Teflioudi, Hose, & Suchanek, 2013), IterE (Zhang et al., 2019), pLogicNet (Qu & Tang, 2019), and RuleGuider (Lei et al., 2020), focus on learning symbolic logical rules. They offer transparency and interpretability by composing relation paths in the reasoning process. In addition, some research has explored the application of reinforcement learning techniques to identify reasoning paths between entities (Das et al., 2018; Lin, Socher, & Xiong, 2018; Wang, Li, Pan, & Mao, 2019; Xiong, Hoang, & Wang, 2017). However, it is important to note that these methods disregard temporal information, thus overlooking the temporal dynamics inherent in the data.

2.2. Temporal Knowledge Graph (TKG) reasoning

TKG reasoning encompasses two distinct settings: extrapolation reasoning and interpolation reasoning. For interpolation reasoning, researchers have advanced static KG representation learning methods by incorporating temporal information. This approach aims to fill in missing events from previous timestamps, thereby enhancing the overall understanding of the KG. For instance, TTransE (Leblay & Chekol, 2018) extends the TransE model by integrating temporal information into its scoring function. Similarly, HyTE (Dasgupta, Ray, & Talukdar, 2018) improves upon the TransH model by including a timestamp feature in the hyperplane projection's normal vector. Another method, TA-DistMult (Garcia-Duran, Dumančić, & Niepert, 2018), leverages the sequential nature of recurrent networks to effectively capture temporal dependencies within KGs. Furthermore, TASTER (Wang, Lyu, Wang, Wu, & Chen, 2023) and BiQCap (Zhang, Liang et al., 2023) investigate the temporal evolution of entities, with the former integrating both global and local timestamp information and the latter representing entities as translations and relations as combinations of Euclidean and hyperbolic rotations in biquaternion space. Additionally, TBDRI (Yu et al., 2023) and CTRIEJ (Li, Wang et al., 2023) introduce novel mechanisms for learning inverse relations and generating negative samples, respectively. TAL-TKGC (Nie et al., 2023), by introducing a temporal attention module, captures inherent correlations between timestamps and entities, and incorporates a weighted GCN for exploring global structural information within TKGs.

Extrapolation reasoning, in contrast to interpolation reasoning, involves predicting future events based on historical facts. Several methods approach a TKG as a static graph sequence and learn entity and relation representations on each subgraph. Examples of such methods include RE-NET (Jin, Qu, Jin, & Ren, 2020), CyGNet (Zhu, Chen, Fan, Cheng, & Zhang, 2021), xERTE (Han, Chen, Ma, & Tresp, 2021), RE-GCN (Li, Jin, Li et al., 2021), CEN (Li et al., 2022) and HSAE (Ren, Bai, Xiao, & Meng, 2023). RE-NET (Jin et al., 2020) utilizes an RNN to learn embeddings of relations and entities. CyGNet (Zhu et al., 2021) utilizes a copy module to capture repeating patterns in historical events. xERTE (Han et al., 2021) samples an inference subgraph based on the query. RE-GCN (Li, Jin, Li et al., 2021) captures evolutional representations of entities and relations within fixed-length timestamps. CEN (Li et al., 2022) utilizes a CNN to handle evolutional patterns of varying lengths. Notably, HSAE (Ren et al., 2023) utilizes a self-attention mechanism to capture structural information in TKGs and diachronic embedding functions for investigating evolutionary representations. Moreover, recent developments like TFSC (Zhang & Bai, 2023) and MetaTKG (Xia, Zhang, Liu, Wu, & Zhang, 2022) leverage meta-learning, with TFSC introducing a time-aware matching processor and MetaTKG developing a temporal meta-learner. These methods demonstrate proficiency in capturing complex features. However, they predominantly depend on pre-trained embeddings. This reliance constrains their inductive predictive capacity, especially for events with novel relations, entities, or timestamps. To address these limitations, several methods have integrated temporal path reasoning with reinforcement learning, such as TiTer (Sun, Zhong, Ma, Han, & He, 2021), CluSTeR (Li, Jin, Guan et al., 2021), and MUESEV (Guo et al., 2022). These models learn effective policies through reinforcement learning to identify appropriate reasoning paths but are constrained by their reliance on manually crafted rewards. RLAT (Bai, Chai and Zhu, 2023) and DREAM (Zheng et al., 2023), for instance, combine attention mechanisms with reinforcement learning to focus on frequently occurring relations for multi-hop reasoning paths.

In the realm of interpretability, logical rule-based methods like AnyBURL (Meilicke, Chekol, Fink, & Stuckenschmidt, 2020), StreamLearner (Omran et al., 2019), and TLogic (Liu et al., 2022) have emerged. These methods employ random walks to mine candidate logical rules, enhancing transparency in the reasoning process. In contrast, TPRG (Bai, Chai, 2023) and TLmod (Bai, Yu et al., 2023) manually design logical rules. TPRG (Bai, Chai, 2023) manually defines fourteen temporal logical rules, which capture various types of logical relationships between entities. TLmod (Bai, Yu et al., 2023) introduces five categories of temporal logical rules and proposes a pruning strategy to acquire rules and calculate rule confidence scores. While these methods are advantageous for their applicability across datasets with the same relation vocabulary, the quality of the learned rules heavily depends on the choice of evaluation measures. Additionally, these methods, though adept at utilizing learned rules for reasoning, often lack adaptability to new logical rule patterns. Our previous work (Mei, Yang, Cai, & Jiang, 2022) proposes an adaptive logical rule embedding model, but it ignores temporal information when mining logic rules and cannot capture long-term dependencies between events. Aiming at this problem, our proposed ILR-IR introduces relative temporal encoding and combines temporal features to capture more effective logic rules. When mining temporal logic rules, all relevant historical events are fully considered without limitation by time intervals, so as to help ILR-IR explore the deep dependencies between events with large time intervals. Furthermore, this work proposes to capture interaction preferences between entities, combining temporal logic rules and interaction preferences for more accurate reasoning.


Fig. 2. Architecture of the proposed model.

3. Preliminaries

Temporal Knowledge Graph (TKG). In a TKG, dynamic events are represented as quadruples (𝑠, 𝑟, 𝑜, 𝑡), where each event consists of a subject entity 𝑠 ∈ 𝐸, a relation 𝑟 ∈ 𝛶, an object entity 𝑜 ∈ 𝐸 and a timestamp 𝑡 ∈ 𝛤. Here, 𝐸 represents the entity set, 𝛶 represents the relation set, and 𝛤 represents the set of timestamps.

Entity Prediction. Entity prediction involves inferring the missing component of a temporal quadruple (event), such as predicting the object entity given (𝑠, 𝑟, ?, 𝑡) or predicting the subject entity given (?, 𝑟, 𝑜, 𝑡). The goal is to ensure that the correct quadruple achieves a higher score than incorrect quadruples. Typically, this objective function is defined as 𝑔(𝑠, 𝑟, 𝑜, 𝑡) ∈ 𝑅.

Temporal logical rules. We explore and identify temporal logic rules by mining causal relationships between events. In Fig. 1, we illustrate this process using an example event (𝑒1, 𝑟8, 𝑒5, 𝑡). By examining the historical events that have occurred, we extract the temporal logical rules contained within this event. Fig. 1(a) depicts the subgraph that encompasses events happening within the time interval [0, 𝑡 − 1]. According to this subgraph, we explore and identify all possible rules that could potentially contribute to the occurrence of the current event. In Fig. 1(b), we observe all the paths from 𝑒1 to 𝑒5 and extract their relations, resulting in the four potential logic rules shown in Fig. 1(c). Each rule, denoted as 𝑅𝑡(𝑝, 𝑟), consists of a path 𝑝 ∈ 𝑃^𝑡_(𝑠,𝑜) representing the historical cause and a relation 𝑟 ∈ 𝛶 indicating the current result at the given timestamp.

4. Interpretable temporal knowledge graph reasoning method

Among existing TKG reasoning models, logical rule-based models stand out for their interpretability and their ability to be transferred to unseen datasets with a shared relation vocabulary. However, the logical rules they learn depend on selected statistics-based measures and cannot adapt to new rule patterns. In view of this, we combine embedding vectors that can reflect deep features with interpretable logical rules, and propose an inductive reasoning model based on deep logical rule mining. We develop the model from two perspectives: mining logic rules among relations and predicting interaction preferences between entities. From the perspective of logic rules, we design a relation path reasoning module, which digs out deep causal relationships according to relation paths. From the perspective of interaction preference, we design an entity correlation reasoning module to infer the relationship between entities based on their historical interaction information.

Our inductive reasoning model comprises three main components: Rule-based relation path reasoning, which mines temporal logic rules in historical events and makes predictions based on causality. Entity correlation reasoning, which predicts the interaction tendency between entities according to their historical relations. Training, which optimizes the model using a one-class augmented matching loss, enabling it to adaptively learn logical rules that are considered reasonable. The overall flow of our proposed model is illustrated in Fig. 2. In the following, we elaborate each part of the model in detail.

4.1. Rule based relation path reasoning

To learn the complex logic rules contained in TKGs, we propose a logical rule embedding module. By utilizing relation paths, the model represents logical rules that are implicitly present in TKGs. Additionally, it learns embeddings of relation paths to capture the intricate semantic features of TKGs. The model possesses the ability to extract potential rules in an adaptive manner, acquire embeddings for these rules, and evaluate the rules using interpretable causal logic. Finally, it utilizes the relative confidence of the rules to make predictions. In this part, we extract relation logical rules from historical subgraphs, acquire embeddings of relation paths, and utilize the learned rules to make predictions.

4.1.1. Relation paths extraction

To construct a relation graph based on events occurring before time 𝑡, we consider entities as nodes and relations as edges. For each event (𝑠, 𝑟, 𝑜, 𝑡), we extract two subgraphs by considering the 𝑘-hop neighbors around node 𝑠 and node 𝑜, respectively. These subgraphs are then


intersected to obtain a merged subgraph, i.e., retaining common nodes and removing all others. By doing this, independent nodes and nodes that are more than 𝑘 hops away from either node 𝑠 or node 𝑜 are removed. This process ensures that the resulting subgraph contains all paths between node 𝑠 and node 𝑜 with a maximum length of 2𝑘. Finally, we perform random walks on the subgraph to get all candidate paths between node 𝑠 and node 𝑜 with length no greater than 2𝑘. Constructing such subgraphs helps reduce the cost of random walks.

4.1.2. Relative temporal encoding

Traditional embedding-based methods divide a temporal knowledge graph into multiple subgraphs according to the time of events, and learn evolutional embeddings of entities and relations over time to capture temporal information. These methods ignore dependencies between events within different timestamps. We capture temporal correlation by encoding the relative time interval between events, aiming to obtain efficient temporal logical rules.

From the perspective of logical rules, the result of the current event may appear after 𝛥𝑡 timestamps. Therefore, instead of encoding absolute temporal information, we encode the time interval 𝛥𝑡 between two events that may be causally related. By encoding time interval information, the model can effectively capture dynamic dependencies between different events. We adopt sinusoidal encoding, which is used in the Transformer (Vaswani et al., 2017) to capture dependencies between words at different positions. For the event (𝑠, 𝑟, 𝑜, 𝑡), the relation path (𝑟1, 𝑡𝑖)|(𝑟2, 𝑡𝑗) is extracted, which consists of two events that occur at times 𝑡𝑖 and 𝑡𝑗, and the two corresponding time intervals at time 𝑡 are 𝛥𝑡𝑖 = 𝑡 − 𝑡𝑖 and 𝛥𝑡𝑗 = 𝑡 − 𝑡𝑗, respectively. Taking 𝛥𝑡𝑖 = 𝑡 − 𝑡𝑖 as an example, the time interval is encoded as follows:

𝑅𝑇𝐸(𝛥𝑡, 2𝑖) = sin(𝛥𝑡 ∕ 10000^(2𝑖∕𝑑)) (1)

𝑅𝑇𝐸(𝛥𝑡, 2𝑖 + 1) = cos(𝛥𝑡 ∕ 10000^(2𝑖∕𝑑)) (2)

where 𝑑 is the dimension of the embedding. This function can characterize the relative time difference of different time intervals at any moment. 𝑅𝑇𝐸(𝛥𝑡 + 𝑘) can be linearly represented by 𝑅𝑇𝐸(𝛥𝑡), so an efficient vector representation of any unseen time interval can be obtained.

4.1.3. Relation paths embedding

To capture the logical semantics expressed in the relation paths, we train and learn embeddings of relation path sequences. These embeddings effectively capture the temporal and spatial logical correlations between two entities.

A relation path consists of a set of interactive relations attached with temporal information. We first fuse temporal features with the semantic features of relations to obtain the input of the relation path embedding module:

𝐫 = 𝐫0 + 𝑅𝑇𝐸(𝛥𝑡) (3)

where 𝐫 ∈ 𝑅^𝑑 is the embedding of relation 𝑟, with 𝐫0 generated by random initialization.

To capture the features of relation paths, we utilize the Gated Recurrent Unit (GRU), which is a widely used variant of Recurrent Neural Networks (RNNs). Gated RNNs have demonstrated effectiveness in handling data with sequential characteristics, such as temporal and logical orderings. RNNs excel at capturing relationships between sequential data and extracting both the sequential and semantic information present in the data. The strength of RNNs lies in their ability to remember information at each time step. The hidden state at any given time step is determined not only by the current input, but also by the output of the hidden state from the previous time step. A simplified representation of a recurrent unit is as follows:

𝐡𝑡 = 𝑓(𝐖 ⋅ 𝐱𝑡 + 𝐔 ⋅ 𝐡𝑡−1 + 𝐛) (4)

where the input vector at timestamp 𝑡 is denoted as 𝐱𝑡, while the hidden state at timestamp 𝑡 is represented by 𝐡𝑡. 𝐖 and 𝐔 are trainable weight parameters, and 𝑓(⋅) represents an activation function.

For relation paths of variable lengths, the GRU architecture is well suited to process these sequences and generate embeddings for each path. The GRU accomplishes this by regulating the flow of information through two learnable gates: the update gate and the reset gate. The update gate determines which historical memory information should be preserved, while the reset gate determines which information should be forgotten. The specific formulas of the GRU model are as follows:

𝐳𝑡 = 𝜎(𝐖𝑧 ⋅ 𝐱𝑡 + 𝐔𝑧 ⋅ 𝐡𝑡−1 + 𝐛𝑧) (5)

𝐫𝑡 = 𝜎(𝐖𝑟 ⋅ 𝐱𝑡 + 𝐔𝑟 ⋅ 𝐡𝑡−1 + 𝐛𝑟) (6)

𝐡̃𝑡 = tanh(𝐖 ⋅ 𝐱𝑡 + 𝐔 ⋅ (𝐫𝑡 ⊙ 𝐡𝑡−1) + 𝐛) (7)

𝐡𝑡 = 𝐳𝑡 ⊙ 𝐡̃𝑡 + (1 − 𝐳𝑡) ⊙ 𝐡𝑡−1 (8)

where 𝐳𝑡 represents the update gate and 𝐫𝑡 represents the reset gate. The hidden state at timestamp 𝑡 − 1, denoted as 𝐡𝑡−1, serves as the neural network memory and holds information from the previous input. The sigmoid function is denoted as 𝜎.

4.1.4. Confidence estimation of rules

The relation paths extracted for an event (𝑠, 𝑟, 𝑜, 𝑡) represent potential ‘‘reasons’’ for that event. Based on the resulting relation 𝑟, we search for plausible ones among these reasons, essentially identifying possible matching rules.

For an event (𝑠, 𝑟, 𝑜, 𝑡), we retrieve all relation paths between the subject entity 𝑠 and the object entity 𝑜, along with their corresponding embeddings 𝐏^𝑡_(𝑠,𝑜). Considering a specific path 𝑝𝑖 ∈ 𝑃^𝑡_(𝑠,𝑜) as the ‘‘reason’’ and relation 𝑟 as the ‘‘result’’, we obtain a rule 𝑅𝑡(𝑝, 𝑟). To estimate the confidence of the rule, we capture the interaction between path 𝑝𝑖 and relation 𝑟. We define a confidence estimation function based on similarity matching as follows:

𝑓(𝑝𝑖, 𝑟) = cos(𝐩𝑖, 𝐫) (9)

where cos denotes the cosine similarity. This function calculates the interaction score between the path and the relation by evaluating their cosine similarity.

4.2. Entity correlation reasoning

The rule-based relation path reasoning module mines relation paths with entities as the medium and learns reasonable rules according to the relation paths. Since the characteristics of entities are ignored, the trained model can be transferred to any unseen dataset with a shared relation vocabulary, regardless of unseen entities. However, if the same optimal relation paths are mined from two quadruples, they will get equal prediction scores and will be indistinguishable. Therefore, we design an entity correlation reasoning module that captures entity features to improve the prediction performance of the model.

Existing temporal knowledge graph reasoning methods capture the properties of entities by learning unique feature representations for each entity, and make predictions based on their embeddings. This results in a poorly transferable model that cannot handle quadruples containing entities that do not exist in the training set. In order to retain the good transferability of our model while improving reasoning performance with the help of entity features, we mine interaction preference features between entities according to their interactive relations.

Each entity has some interactions with other entities in historical events, and these interactions can reflect the interaction preferences of entities. We measure interaction preference in terms of semantic preference and interaction frequency, that is, predicting current events based on the historical interaction relationship between two entities and their recent interaction frequency. For example, two countries that have frequently conducted economic and trade cooperation recently are more likely to continue interactive relations with such preferences.
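To make the rule-based relation path reasoning of Section 4.1 concrete, the following numpy sketch combines the relative temporal encoding (Eqs. (1)-(2)), the GRU path encoder (Eqs. (3)-(8)), and the cosine confidence (Eq. (9)); the dimensions, random weights, and relation names are illustrative stand-ins, not the authors' implementation:

```python
import numpy as np

d = 8  # embedding dimension (illustrative)
rng = np.random.default_rng(1)

def rte(dt, dim=d):
    """Relative temporal encoding, Eqs. (1)-(2): sinusoidal features of
    the interval dt between a historical event and the query time."""
    i = np.arange(dim // 2)
    angle = dt / (10000 ** (2 * i / dim))
    enc = np.empty(dim)
    enc[0::2] = np.sin(angle)
    enc[1::2] = np.cos(angle)
    return enc

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Eqs. (5)-(8); the weights here are random stand-ins, not trained."""
    def __init__(self, dim):
        def m():
            return rng.normal(scale=0.1, size=(dim, dim))
        self.Wz, self.Uz, self.Wr, self.Ur, self.W, self.U = (m() for _ in range(6))
        self.bz = self.br = self.b = np.zeros(dim)

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h + self.bz)           # update gate, Eq. (5)
        r = sigmoid(self.Wr @ x + self.Ur @ h + self.br)           # reset gate, Eq. (6)
        h_tilde = np.tanh(self.W @ x + self.U @ (r * h) + self.b)  # candidate state, Eq. (7)
        return z * h_tilde + (1 - z) * h                           # new hidden state, Eq. (8)

def path_embedding(path, rel_emb, gru):
    """Run the GRU over (relation, Δt) pairs; each input fuses relation
    semantics and time as r = r0 + RTE(Δt) (Eq. (3))."""
    h = np.zeros(d)
    for rel, dt in path:
        h = gru.step(rel_emb[rel] + rte(dt), h)
    return h

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rule confidence is cos(p_i, r) (Eq. (9)); the quadruple score takes the
# maximum confidence over all extracted paths (Eq. (11)).
rel_emb = {name: rng.normal(size=d) for name in ("r1", "r2", "r5", "r8")}
gru = GRUCell(d)
paths = [[("r5", 1)], [("r1", 3), ("r2", 2)]]
g_r = max(cosine(path_embedding(p, rel_emb, gru), rel_emb["r8"]) for p in paths)
print(round(g_r, 4))
```

Since the sketch only touches relation embeddings and time intervals, never entity embeddings, it mirrors why the module transfers to datasets with a shared relation vocabulary.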


For a quadruple (𝑠, 𝑟, 𝑜, 𝑡), we mine the events containing entity 𝑠 and entity 𝑜 within the previous 𝑚 timestamps, extracting the relations among these events. The sum of these relation embeddings is the semantic preference embedding between 𝑠 and 𝑜 at timestamp 𝑡. Regarding the interaction frequency, we first count the frequency 𝐹 corresponding to the candidate entity that interacts with subject entity 𝑠 most frequently within the previous 𝑚 timestamps. The interaction preference can then be obtained by dividing the semantic preference embedding by the frequency 𝐹:

$\tilde{\mathbf{r}}^{t}_{(s,o)} = \frac{1}{F} \sum_{r_i \in R^{t-m:t-1}_{(s,o)}} \mathbf{r}_i \qquad (10)$

where $R^{t-m:t-1}_{(s,o)}$ is the set of all interactive relations between 𝑠 and 𝑜 from 𝑡 − 𝑚 to 𝑡 − 1.
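Eq. (10) can be sketched as follows. The (subject, relation, object, time) quadruple format and the dictionary of relation embeddings are illustrative assumptions, not the paper's data structures.

```python
def interaction_preference(events, s, o, t, m, rel_emb, dim):
    # Eq. (10), sketched: sum the embeddings of relations observed between
    # s and o in the window [t-m, t-1], then divide by F, the interaction
    # frequency of the candidate entity that interacts with s most often.
    window = [e for e in events if t - m <= e[3] <= t - 1]
    # Semantic preference: relations linking s and o (in either direction).
    rels = [r for (a, r, b, _) in window if {a, b} == {s, o}]
    # F: how often the most frequent partner interacted with s in the window.
    freq = {}
    for (a, r, b, _) in window:
        if a == s:
            freq[b] = freq.get(b, 0) + 1
        elif b == s:
            freq[a] = freq.get(a, 0) + 1
    F = max(freq.values(), default=1)
    pref = [0.0] * dim
    for r in rels:
        for i in range(dim):
            pref[i] += rel_emb[r][i]
    return [v / F for v in pref]
```

Dividing by 𝐹 normalizes the summed semantic preference by recent interaction frequency, so entity pairs that interact often do not dominate purely by volume.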

4.3. Score function

In this section, we combine the mined logical rules and the interaction preferences of entities to score the quadruple.

From a logical rules perspective, we aim to predict the missing quadruple (𝑠, 𝑟, ?, 𝑡) by leveraging the learned path embeddings. By extracting all relational paths between the subject entity 𝑠 and the object entity 𝑜, we identify potential causal factors for the current event. These paths, when combined with relation 𝑟, contribute to the generation of a set of rules. The rule with the highest confidence serves as an indicator of the most plausible ‘‘reason’’ for the current quadruple. The confidence of this rule is then utilized as the score for the quadruple. For a candidate target entity 𝑜, the corresponding quadruple is (𝑠, 𝑟, 𝑜, 𝑡). The score function in this case is defined as follows:

$g_r(s, r, o, t) = \max_{\mathbf{p}_i \in \mathbf{P}^{t}_{(s,o)}} f(\mathbf{p}_i, r) \qquad (11)$

Fig. 3. Training from quadruple perspective.
Fig. 4. Training from rule perspective.
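The max-over-paths score of Eq. (11) can be sketched as below. Cosine similarity stands in here for the learned matching function 𝑓, purely for illustration; the paper's 𝑓 is learned jointly with the path encoder.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors; 0.0 for a zero vector.
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def rule_score(path_embeddings, r_emb, f=cosine):
    # Eq. (11): g_r(s, r, o, t) = max over candidate paths p_i of f(p_i, r).
    # An entity with no extracted path receives the neutral score 0.0.
    return max((f(p, r_emb) for p in path_embeddings), default=0.0)
```

Taking the maximum means the single best-matching path, i.e. the highest-confidence rule, determines the candidate's rule-based score.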

From the perspective of interaction preference, we score the quadruple according to the similarity of the interaction preference $\tilde{\mathbf{r}}^{t}_{(s,o)}$ with relation 𝑟:

$g_e(s, r, o, t) = \cos(\tilde{\mathbf{r}}^{t}_{(s,o)}, \mathbf{r}) \qquad (12)$

The overall score function is defined as:

$h(s, r, o, t) = g_e + g_r \qquad (13)$

4.4. Training

In this section, we present the optimization objective for training the model. Our goal is to identify matching paths in historical events (correct rules) for a given relation 𝑟. However, the main challenge is that we do not have prior knowledge about which path-relation pairs are matched, meaning there is no definitive correct rule available for training. To address this challenge, we design training tasks from both a coarse-grained quadruple perspective and a fine-grained rule perspective. We introduce a one-class augmented matching loss to guide the training process.

4.4.1. Training from quadruple perspective
From the quadruple perspective, similar to embedding-based methods, we define a main loss function based on the quadruple score. The objective of this loss function is to encourage higher scores for correct quadruples and lower scores for incorrect ones. The specific form of the loss function is a soft-margin loss, given by:

$L_1 = \sum_{(s,r,o,t) \in Q \cup Q'} \log\!\left(1 + \exp(-l \cdot h(s, r, o, t))\right) \qquad (14)$

$l = \begin{cases} 1, & (s, r, o, t) \in Q \\ -1, & (s, r, o, t) \in Q' \end{cases} \qquad (15)$

where 𝑄 is the set of valid quadruples, and 𝑄′ denotes the set of invalid quadruples, 𝑄′ = {(𝑠, 𝑟, 𝑜′, 𝑡) | 𝑜′ ∈ 𝐸 − 𝑜}.

The score of a quadruple is determined by the rule with the highest confidence it contains. It has been observed that all the rules derived from an incorrect quadruple are also incorrect. However, determining which rules derived from a correct quadruple are actually correct is difficult. As illustrated in Fig. 3(a), the training task aims to increase the score of the correct quadruple. In other words, the soft positive rule with the highest score is considered a positive example, leading to higher confidence. Similarly, for the incorrect quadruple, the hard negative rule with the highest score is treated as a negative example, resulting in lower confidence. From the perspective of matching the reason path with the result relation 𝑟, this task focuses on aligning the path in the positive example closely with the relation 𝑟, while moving the path in the negative example away from the relation 𝑟, as depicted in Fig. 3(b). In summary, this training task is designed to enhance the confidence of soft positive rules, reduce the confidence of hard negative rules that are prone to misjudgment, and disregard other negative rules and uncertain rules.

4.4.2. Training from rule perspective
From the perspective of fine-grained rules, we introduce an additional auxiliary training task to address the rules that are overlooked in the main task. Inspired by the concept of one-class problems, we focus on training the model solely with negative samples. Through negative sampling, we generate an ample number of negative quadruples, each consisting of multiple negative rules that can be definitively identified as negative. As depicted in Fig. 4, by reducing the confidence of negative rules, we enhance the relative confidence of potential positive rules. This approach ultimately improves the predictive accuracy of the model.
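The soft-margin objective of Eqs. (14)-(15) can be sketched directly. The list of (score, label) pairs is an illustrative input format; in practice the scores come from Eq. (13) and the labels from negative sampling.

```python
import math

def soft_margin_loss(scored_quadruples):
    # Eqs. (14)-(15): L1 = sum of log(1 + exp(-l * h(s, r, o, t))), where
    # l = +1 for valid quadruples in Q and l = -1 for corrupted ones in Q'.
    return sum(math.log(1.0 + math.exp(-l * h)) for h, l in scored_quadruples)
```

The loss shrinks when valid quadruples score high and corrupted quadruples score low, which is exactly the coarse-grained training signal described above.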


Table 1
Statistics of the datasets.

Data       Entities  Relations  Training  Validation  Test    Time granules
ICEWS14    7128      230        63,685    13,823      13,222  365
ICEWS18    23,033    256        539,286   67,538      63,110  304
ICEWS0515  10,488    251        272,115   17,535      20,466  4017

The loss function is defined as the cosine loss:

$L_2 = \sum_{(s,r,o,t) \in Q \cup Q'} \mathrm{cosloss}(\mathbf{p}, \mathbf{r}) \qquad (16)$

$\mathrm{cosloss}(\mathbf{p}, \mathbf{r}) = \begin{cases} 1 - \cos(\mathbf{p}, \mathbf{r}), & y = 1 \\ \max(0, \cos(\mathbf{p}, \mathbf{r})), & y = -1 \end{cases} \qquad (17)$

Then the overall one-class augmented matching loss is defined as:

$L = L_1 + L_2 \qquad (18)$

4.5. Inference

We can utilize the trained model directly to extract historical features and predict missing entities without the need for a complex rule application process. Initially, we identify candidate entities by considering all entities that can be reached within a certain number of hops (𝑘) from the source entity node (𝑠). From these candidates, we generate potential quadruples according to the candidate entities. Next, we employ the trained encoder to extract relation paths specific to the query (𝑠, 𝑟, ?, 𝑡), enabling us to formulate rules and evaluate their confidence levels. Additionally, we extract the interaction relations between entity 𝑠 and the candidate entity 𝑜 within a defined time interval (𝑚) in the past, and determine the interaction preference between the two entities at the present moment. Finally, we score the quadruples based on both the confidence of the rules and the similarity between relation 𝑟 and the interaction preference. The candidate entity associated with the quadruple that receives the highest score represents the predicted target entity.

5. Experiment

5.1. Datasets

Our experiments were performed on the Integrated Crisis Early Warning System (ICEWS) dataset,² a well-known dataset for temporal knowledge graph link prediction that consists of international event information. We specifically focused on three subsets within the ICEWS dataset: ICEWS0515, covering data from 2005 to 2015; ICEWS14, containing data from 2014; and ICEWS18, comprising data from 2018. To ensure a fair comparison, we follow the data splits provided by Liu et al. (2022) and divide each dataset into training, validation, and test sets. Table 1 presents detailed statistics for all the datasets used in our study.

² https://dataverse.harvard.edu/dataverse/icews.

5.2. Baselines

To showcase the effectiveness of our proposed ILR-IR model, we conduct a comprehensive comparison of experimental results against a diverse range of static models and temporal models.

5.2.1. Static models
• DistMult (Yang et al., 2015) is an embedding-based method that restricts the relation matrix to a diagonal matrix.
• ComplEX (Trouillon et al., 2016) introduces complex numbers to extend DistMult to better model asymmetric relationships.
• AnyBURL (Meilicke et al., 2020) proposes an anytime bottom-up rule learning algorithm and applies the learned rules to reason in KGs.

5.2.2. Temporal models
• TTransE (Leblay & Chekol, 2018) extends the embedding model TransE (Bordes et al., 2013) by replacing its scoring function.
• DE-SimplE (Goel, Kazemi, Brubaker, & Poupart, 2020) enhances the static model by introducing a diachronic entity embedding function, which enables the model to incorporate temporal information and provide entity features at any given point in time.
• TNTComplEx (Lacroix, Obozinski, & Usunier, 2020) introduces new regularizers to extend ComplEx (Trouillon et al., 2016).
• TA-DistMult (Garcia-Duran et al., 2018) incorporates temporal information into DistMult (Yang et al., 2015) and utilizes recurrent neural networks to learn time-aware representations of relations.
• RE-NET (Jin et al., 2020) encodes global graph structure and local neighborhoods, capturing global and local information, respectively.
• CyGNet (Zhu et al., 2021) utilizes a copy module to capture repeating patterns.
• xERTE (Han et al., 2021) reasons on the correlated subgraph of the query and jointly models structural dependencies and temporal dynamics.
• TiTer (Sun et al., 2021) is the first reinforcement learning method for TKG reasoning. It defines a novel time-shaped reward based on the Dirichlet distribution.
• MUESEV (Guo et al., 2022) is a temporal-path-based reinforcement learning model.
• RE-GCN (Li, Jin, Li et al., 2021) combines graph convolutional networks and recurrent evolution networks to recurrently model KG sequences and learn evolutional representations of entities and relations.
• CEN (Li et al., 2022) employs a length-aware CNN to handle evolutional patterns with varying lengths, using an easy-to-difficult curriculum learning strategy.
• TLogic (Liu et al., 2022) is an explainable framework that utilizes temporal random walks to extract temporal logical rules.
• CluSTeR (Li, Jin, Guan et al., 2021) employs a two-stage approach consisting of clue searching and temporal reasoning to predict future facts.
• RPC (Liang et al., 2023) proposes the Relational Correspondence Unit (RCU) and Periodic Correspondence Unit (PCU), which are used to extract information related to relational correlations and periodic patterns.
• RE-GAT (Li, Feng et al., 2023) utilizes an attention-based historical events embedding module for encoding past events. It incorporates an attention-based concurrent events embedding module to capture the relationships between events occurring at the same timestamp.
• MetaTKG (Xia et al., 2022) treats TKG prediction as multiple temporal meta-tasks and employs the designed Temporal Meta-learner to acquire evolutionary meta-knowledge from these meta-tasks.
• TECHS (Lin, Liu, Mao, Xu, & Cambria, 2023) utilizes a graph convolutional network with temporal encoding and heterogeneous attention to embed both topological structures and temporal dynamics. It then integrates propositional reasoning and first-order reasoning by introducing a reasoning graph that iteratively expands to identify the answer.
• RE-H2AN (Guo, Chen, Zhang, Huang, & Liu, 2023) develops a network that uses a hierarchical hypergraph to model evolving patterns in entities across different semantic levels.


Table 2
Performance comparison for entity prediction.
Method ICEWS14 ICEWS18 ICEWS0515
MRR Hits@1 Hits@3 Hits@10 MRR Hits@1 Hits@3 Hits@10 MRR Hits@1 Hits@3 Hits@10
DistMult(2015) 0.2767 0.1816 0.3115 0.4696 0.1017 0.0452 0.1033 0.2125 0.2873 0.1933 0.3219 0.4754
ComplEx(2016) 0.3084 0.2151 0.3448 0.4958 0.2101 0.1187 0.2347 0.3987 0.3169 0.2144 0.3574 0.5204
AnyBURL(2020) 0.2967 0.2126 0.3333 0.4673 0.2277 0.1510 0.2544 0.3891 0.3205 0.2372 0.3545 0.5046
TTransE(2018) 0.1343 0.0311 0.1732 0.3455 0.0831 0.0192 0.0856 0.2189 0.1571 0.0500 0.1972 0.3802
TA-DistMult(2018) 0.2647 0.1709 0.3022 0.4541 0.1675 0.0861 0.1841 0.3359 0.2431 0.1458 0.2792 0.4421
DE-SimplE(2020) 0.3267 0.2443 0.3569 0.4911 0.1930 0.1153 0.2186 0.3480 0.3502 0.2591 0.3899 0.5275
TNTComplEx(2020) 0.3212 0.2335 0.3603 0.4913 0.2123 0.1328 0.2402 0.3691 0.2754 0.1952 0.3080 0.4286
CyGNet(2021) 0.3273 0.2369 0.3631 0.5067 0.2493 0.1590 0.2828 0.4261 0.3497 0.2567 0.3909 0.5294
RE-NET(2020) 0.3828 0.2868 0.4134 0.5452 0.2881 0.1905 0.3244 0.4751 0.4297 0.3126 0.4685 0.6347
xERTE(2021) 0.4079 0.3270 0.4567 0.5730 0.2931 0.2103 0.3351 0.4648 0.4662 0.3784 0.5231 0.6392
TAL-TKGC(2023) 0.3120 0.2150 0.4900 0.6900 – – – – 0.3540 0.2260 0.4930 0.6920
TITer(2021) 0.4173 0.3274 0.4646 0.5844 0.2998 0.2205 0.3346 0.4483 – – – –
MUESEV(2022) 0.4183 0.3276 0.4647 0.5863 0.3001 0.2205 0.3346 0.4488 – – – –
FITCARL(2023) 0.4180 0.2840 0.5220 0.6810 0.2970 0.1560 0.3860 0.5840 0.3450 0.2020 0.4820 0.7320
CluSTeR(2021) 0.4600 0.3380 – 0.7120 0.3230 0.2060 – 0.5590 – – – –
RE-GAT(2023) 0.4069 0.2978 0.4588 0.6209 0.2979 0.1931 0.3385 0.5045 0.4665 0.3524 0.5304 0.6833
RE-H2 AN(2023) 0.4111 0.3116 0.4575 0.6034 0.2997 0.1964 0.3404 0.5064 – – – –
RE-GCN(2021) 0.4150 0.3086 0.4660 0.6247 0.3055 0.2000 0.3473 0.5146 0.4641 0.3517 0.5276 0.6764
CEN(2022) 0.4240 0.3208 0.4746 0.6131 0.3105 0.2170 0.3544 0.5059 – – – –
MetaTKG(2022) 0.4276 0.3208 – 0.6332 0.3160 0.2085 – 0.5279 0.5000 0.3817 – 0.7185
TLogic(2022) 0.4304 0.3356 0.4827 0.6123 0.2982 0.2054 0.3395 0.4853 0.4697 0.3621 0.5313 0.6743
TECHS(2023) 0.4388 0.3459 0.4936 0.6195 0.3085 0.2181 0.3539 0.4982 0.4838 0.3834 0.5469 0.6892
HGLS(2023) 0.4700 0.3506 – 0.7041 0.2932 0.1921 – 0.4983 0.4621 0.3532 – 0.6712
RPC(2023) 0.4455 0.3487 0.4980 0.6508 0.3491 0.2434 0.3874 0.5589 0.5114 0.3947 0.5711 0.7175
ILR-IR 0.5242 0.3823 0.6042 0.8043 0.3594 0.2216 0.4205 0.6426 0.5864 0.4672 0.6618 0.8039
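The MRR and Hits@k figures reported in Table 2 are standard rank-based metrics; a minimal sketch of how they are computed from the rank assigned to each query's ground-truth entity:

```python
def mrr(ranks):
    # Mean Reciprocal Rank: average of 1/rank of the ground-truth entity.
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    # Hits@k: fraction of queries whose ground-truth entity ranks in the top k.
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

For example, ranks of 1, 2, and 4 over three queries give an MRR of about 0.58 and Hits@3 of 2/3.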

Table 3
The Friedman Test is conducted based on the metrics of MRR, Hits@1 and Hits@10, with a confidence interval of 0.95.

Metric   p-value  Final hypothesis
MRR      0.0024   Reject H0
Hits@1   0.0024   Reject H0
Hits@10  0.0057   Reject H0

• FITCARL (Ding, Wu, Li, Ma, & Tresp, 2023) merges few-shot and reinforcement learning, employing a policy network to navigate the Temporal Knowledge Graph (TKG) for predictions.
• TAL-TKGC (Nie et al., 2023) introduces a temporal attention mechanism and an importance-weighted Graph Convolutional Network (GCN) to analyze the semantic connections and structural significance in TKGs.
• HGLS (Zhang, Xia, Liu, Wu and Wang, 2023) applies a Hierarchical Relational Graph Neural Network (HRGNN) to capture both long-term and short-term entity and relation representations for TKG reasoning, analyzing both sub-graph and global-graph levels.

For RE-GCN (Li, Jin, Li et al., 2021), TiTer (Sun et al., 2021), MUESEV (Guo et al., 2022), CEN (Li et al., 2022), CluSTeR (Li, Jin, Guan et al., 2021), RPC (Liang et al., 2023), RE-GAT (Li, Feng et al., 2023), MetaTKG (Xia et al., 2022), TECHS (Lin et al., 2023), RE-H2AN (Guo et al., 2023), FITCARL (Ding et al., 2023), TAL-TKGC (Nie et al., 2023) and HGLS (Zhang, Xia et al., 2023), we list the results reported in the corresponding original papers. For the other baselines, we list the results reported in TLogic (Liu et al., 2022).

5.3. Implementations

In our experiments, we initialize the relation embeddings with a dimension of 200 using random initialization. The maximum length (𝑘) of the rules was set to 3, indicating that rules consisting of up to three relations were considered. To optimize all model parameters, we employ the Adam optimizer and set the initial learning rate to 0.001. Early stopping was employed as a mechanism to prevent overfitting: we train the model for 200 epochs but stop training if the validation loss does not decrease for 10 consecutive epochs. Our model is implemented in PyTorch and trained on a single NVIDIA Tesla V100 GPU with 32 GB of memory.

In order to avoid training difficulties due to too many extracted paths, we follow the principle of ‘‘near and short’’ when extracting paths. Suppose that, in the subgraph sampled at time 𝑡, the length of the shortest path between the entity pair (𝑠, 𝑜) is 𝑙𝑠. We first extract all paths with length 𝑙𝑠, calculate the time difference between the time of each relation in a path and time 𝑡, and take the maximum time difference as the relative time of the path. The minimum relative time over all candidate paths is 𝛿𝑡𝑠. For paths whose length is greater than 𝑙𝑠, only the paths whose relative time is less than 𝛿𝑡𝑠 are extracted.

We employ a time-aware filtering strategy (Han et al., 2021) to filter out the quadruples that are valid at the current timestamp from the pool of candidate negative quadruples. During the rule extraction phase, we treat the TKG as an undirected graph. To evaluate the performance of our model, we employ several metrics: Mean Reciprocal Rank (MRR), Hit@1, Hit@3, and Hit@10.

5.4. Results

The experimental results are presented in Table 2. It is observed that all models achieve their highest performance on the ICEWS0515 dataset, followed by ICEWS14, and the lowest performance on ICEWS18. Upon examining Table 1, we note that the ICEWS18 dataset contains a large number of entities and events. Consequently, the temporal knowledge graph (TKG) formed from this dataset becomes complex and dense, leading to increased noise during inference. Conversely, the TKG constructed from the ICEWS0515 dataset is comparatively more manageable and easier to handle.

Among the comparison models, three static inference models, namely DistMult, ComplEX, and AnyBURL, do not take temporal information into account and consequently exhibit the worst performance. Similarly, interpolation inference models such as TTransE, DE-SimplE, TNTComplEx, and TA-DistMult struggle as they are unable to handle events occurring at future timestamps, resulting in poor performance.

RE-NET and CyGNet exhibit constraints in generating predictions for entities absent from the training dataset. Conversely, xERTE demonstrates superior efficacy relative to RE-NET and CyGNet, an enhancement ascribable to its proficiency in extracting historical subgraphs

Table 4
Results by different variants of ILR-IR on all the datasets.
Method ICEWS14 ICEWS18 ICEWS0515
MRR Hits@1 Hits@3 Hits@10 MRR Hits@1 Hits@3 Hits@10 MRR Hits@1 Hits@3 Hits@10
ILR-IR 0.5242 0.3823 0.6042 0.8043 0.3594 0.2216 0.4205 0.6426 0.5864 0.4672 0.6618 0.8039
ILR-IR w/o RP 0.4944 0.3643 0.5648 0.7497 0.3269 0.2066 0.3885 0.5872 0.5559 0.4107 0.5970 0.7290
ILR-IR w/o EC 0.4544 0.3082 0.5251 0.7574 0.2860 0.1542 0.3427 0.5947 0.3256 0.2153 0.3987 0.5559

Table 5
Zero-shot reasoning where rules learned on train dataset are transferred and applied to test dataset.
Test ICEWS14 ICEWS0515 ICEWS18
Train ICEWS14 ICEWS0515 ICEWS18 ICEWS0515 ICEWS14 ICEWS18 ICEWS18 ICEWS14 ICEWS0515
MRR 0.5242 0.4938 0.4777 0.5864 0.5625 0.5674 0.3594 0.3288 0.3078
Hits@1 0.3823 0.3564 0.3387 0.4672 0.4430 0.4476 0.2216 0.1998 0.1829
Hits@3 0.6042 0.5396 0.5243 0.6618 0.6343 0.6402 0.4205 0.3972 0.3669
Hits@10 0.8043 0.7798 0.7566 0.8039 0.7854 0.7909 0.6426 0.6183 0.5949

pertinent to queries and employing attention propagation for efficacious reasoning within such subgraphs. TAL-TKGC, meanwhile, incorporates an attention mechanism by devising a temporal attention module to discern profound connections between entities and relations at semantic levels, alongside proposing an importance-weighted Graph Convolutional Network (GCN) to assimilate structural information. Although TAL-TKGC's performance on metrics such as Mean Reciprocal Rank (MRR), Hits@1, and Hits@3 is not as commendable, it excels in Hits@10, surpassing most baseline models.

MUESEV, TiTer, FITCARL, and CluSTeR all employ reinforcement learning for temporal path inference. CluSTeR introduces a clue search phase to mitigate noise and enhance the precision of temporal reasoning. FITCARL, targeting newly emerged unseen entities, innovates with a time-aware Transformer incorporating a time-aware positional encoding methodology to optimize the utilization of few-shot information for representation learning, achieving notable success in Hits@10.

RE-GCN, CEN, RE-GAT, RE-H2AN, and RPC aim at learning good representations of entities and relations. RE-GCN approaches the modeling of Knowledge Graph (KG) sequences with fixed-length timestamps, whereas CEN refines this model to accommodate variable-length evolution patterns, thus surpassing RE-GCN in performance. MetaTKG approaches TKG reasoning as temporal meta-tasks, leveraging a temporal meta-learner to assimilate evolving meta-knowledge. The integration of MetaTKG into RE-GCN results in performance improvements across various metrics. RE-GAT, RE-H2AN, and RPC focus on evolutional representations, with RE-H2AN utilizing an entity hypergraph with multiple hierarchies to capture the evolutionary pattern across different semantic granularities. Additionally, RPC introduces a time2vector encoder to produce model-agnostic time vectors that aid time-related decoders in scoring. HGLS evaluates both long-term and short-term representations of entities and relations, achieving superior performance alongside RPC among several embedding-based baselines.

Our proposed ILR-IR outperforms these baselines over the three datasets. Compared with ICEWS14 and ICEWS0515, the advantage on ICEWS18 is not very obvious. This may be because there are too many entities in the ICEWS18 dataset, which introduces many noisy relation paths, making it difficult to learn effective logical rules.

TECHS utilizes a temporal encoder to learn evolutional representations of entities and relations, and then learns logical rules within the decoder. TECHS outperforms the logical rule-based TLogic, but it cannot model unseen entities due to the limitation of entities in the training set. Both our proposed ILR-IR and TLogic have good transferability to datasets that share a relation vocabulary. In contrast to TLogic, which employs statistical methods to evaluate the confidence of logical rules, our model utilizes learned path embeddings to dynamically assess rule confidence based on causal logic. Instead of relying solely on statistical measures, we leverage the distance between embedding vectors to effectively capture the similarity between paths and relations. The rule-based relation path reasoning module of ILR-IR combines explicit relation paths with implicit embeddings, improving the accuracy of temporal reasoning while preserving model interpretability and transferability.

In addition, it is worth noting that logical rule-based methods demonstrate effectiveness in predicting events that contain rules extracted from the training set. However, their performance may slightly decline when dealing with unseen rules that were not present in the training data. ILR-IR can adapt to new rule patterns and use the trained model to obtain an effective representation of any relational path. Furthermore, the correlation reasoning module of ILR-IR obtains the interaction preference between entities according to the semantic preference and interaction frequency, which further improves the reasoning performance of the model.

Inspired by Zamri, Azhar, Mansor, Alway, and Kasihmuddin (2022), we conduct a Friedman Test to further validate the superiority of the proposed model as compared to other baselines. The null hypothesis, denoted as 𝐻0, posits that there exists no statistically significant difference between our proposed ILR-IR and the other baselines. The significance level is 𝛼 = 0.05. As shown in Table 3, the 𝑝-values are clearly smaller than 0.05 on three metrics: MRR, Hits@1 and Hits@10. Therefore, the null hypothesis is rejected; that is, our proposed ILR-IR has a statistically significant difference compared with the other baselines.

Fig. 5. Result on ICEWS14 dataset under different scales of training samples.

5.5. Ablation study

To test whether the rule-based relation path reasoning module and the entity correlation reasoning module contribute to the ILR-IR model, we conduct ablation experiments. Table 4 shows the experimental results of two variants of ILR-IR on all datasets. ILR-IR w/o RP and ILR-IR w/o EC denote two variants of ILR-IR that remove the rule-based relation path reasoning module and the entity correlation reasoning module,

Fig. 6. Result on three datasets with different 𝑚.

respectively. As can be seen from Table 4, when the entity correlation reasoning module is removed, the ILR-IR w/o EC model performs poorly, especially on ICEWS0515. By analyzing the dataset, it can be found that the number of time granules in ICEWS0515 is 4017, which is much higher than in the other two datasets, so many related events with a large time span are extracted from this dataset. Since many noisy relation paths composed of events with a large time span are extracted, the ILR-IR w/o EC model cannot filter out the optimal rules well. When only the entity correlation module is used, the ILR-IR w/o RP model achieves better performance, from which we can find that events that may occur at the current moment can be predicted by using the recent interaction preferences between entities. Comparing the results of ILR-IR w/o RP and ILR-IR w/o EC, we can also find that Hit@1 and Hit@3 of ILR-IR w/o RP are higher than those of ILR-IR w/o EC on ICEWS14 and ICEWS18, but Hit@10 is lower than that of ILR-IR w/o EC. This is because the entity correlation reasoning module cannot predict correctly when there is no interaction between entities within the most recent 𝑚 timestamps, whereas the relation path reasoning module can capture long-term relational dependencies between events and thus handle this situation well. A harmonious and complementary relationship is achieved between the two modules, which promote each other and improve the performance of the overall model.

5.6. Zero-shot reasoning

Our proposed ILR-IR model is designed to support zero-shot reasoning, meaning it can be transferred to new datasets that share common relations with the training dataset. In other words, the ILR-IR model is initially trained on one dataset and then applied to the other two datasets for reasoning purposes. To evaluate the zero-shot reasoning performance of the ILR-IR model, we conduct experiments on the ICEWS14, ICEWS0515, and ICEWS18 datasets. The results in Table 5 show that, when the model is trained on ICEWS14 and applied to ICEWS0515, the three evaluation metrics are only slightly lower than

Table 6
Path reasoning for query (Likud, Consult, ?, 𝑡) on ICEWS14. Since the dataset does not provide accurate time, we use 𝑡 to denote the time of the event.

Path and rule | Score | Target entity
(1) P: Likud --(Consult, 𝑡−141)--> Benjamin Netanyahu; R: Consult ⇒ Consult (Repeat) | 0.9969 | Benjamin Netanyahu (√)
(2) P: Benjamin Netanyahu --(Make statement, 𝑡−236)--> Likud; R: Make statement ⇒ Consult⁻¹ | 0.8618 | Benjamin Netanyahu (√)
(3) P: Benjamin Netanyahu --(Express intent to meet or negotiate, 𝑡−67)--> Likud; R: Express intent to meet or negotiate ⇒ Consult⁻¹ | 0.6227 | Benjamin Netanyahu (√)
(4) P: Likud --(Make statement, 𝑡−32)--> Benjamin Netanyahu; R: Make statement ⇒ Consult | 0.3445 | Benjamin Netanyahu (√)
(5) P: Shas --(Make statement, 𝑡−299)--> Likud; R: Make statement ⇒ Consult⁻¹ | 0.6743 | Shas
(6) P: Likud --(Make optimistic comment, 𝑡−124)--> Shas; R: Make optimistic comment ⇒ Consult⁻¹ | 0.0720 | Shas
(7) P: Likud --(Criticize or denounce, 𝑡−58)--> Police (Israel); R: Criticize or denounce ⇒ Consult | 0.2595 | Police (Israel)
(8) P: Likud --(Express accord, 𝑡−284)--> John Kerry; R: Express accord ⇒ Consult | 0.0605 | John Kerry
(9) P: Likud --(Use conventional military force, 𝑡−274)--> John Kerry; R: Use conventional military force ⇒ Consult | −0.0152 | John Kerry

the model trained on ICEWS015, but still significantly better than other the value of 𝑚 increases, the predicted MRR and Hits@3 indicators
state-of-the-art baselines. It fully demonstrates superior performance of ILR-IR model gradually decrease. On the two other datasets, ILR-IR
of our proposed ILR-IR model for zero-shot inductive reasoning. Sim- model shows similar results as 𝑚 increases. Subsequently, we perform
ilarly, when the models trained on either dataset are transferred to fine-grained parameter fine-tuning, as shown in the three figures on the
the other two datasets, they achieve similar superior performance, fur- right part of Fig. 6. On three datasets, as 𝑚 increases, the predicted MRR
ther validating robustness of our proposed ILR-IR model for zero-shot and Hits@3 both increase first and then decrease. When the value of 𝑚
reasoning. is too small, interaction events cannot be captured between many enti-
ties, and the entity correlation reasoning module cannot contribute to
5.7. Proportions of the training data the overall model. Finally, the optimal 𝑚 values selected on ICEWS18,
ICEWS14 and ICEWS0515 are 10, 13, and 150, respectively.
The performance of our proposed ILR-IR model is evaluated on
the ICEWS14 dataset across different scales of training samples. The 5.9. Case study
results of this evaluation are presented in Fig. 5. We divide training
set by timestamp and evaluate performance of ILR-IR model separately 5.9.1. Path reasoning analysis
To visualize logical rules learned by the rule-based relation paths
when training with events in different proportions of timestamps. Fig. 5
reasoning module, we present the path reasoning results for query
shows that as the proportion of the training dataset increases, the
(Likud, Consult,?, 𝑡) on ICEWS14 dataset in Table 6. The table displays
trained model fits better and better, and the predicted MRR metric
the paths from the subject entity to potential object entities, along with
gradually increases. Our proposed ILR-IR model far outperforms the
the rules constructed from these relation paths and the corresponding
best baseline CluSTeR even when training the model with only events
scores. The purpose of the relation paths reasoning module is to identify
within the top 10% of timestamps. When the proportion of training data reaches 50% of the full training set, the model shows performance similar to that obtained with the full training set. These findings suggest that our model can achieve excellent performance even when trained with a limited number of samples, making it well-suited for handling large-scale data in reasoning tasks.

5.8. Parameter analysis

In the entity correlation reasoning module, we obtain the interaction preferences of entities at the current moment from the one-hop interactions between entities within the previous 𝑚 timestamps. To select an appropriate 𝑚, we conduct a detailed parameter tuning experiment, as shown in Fig. 6. We perform coarse-grained parameter range selection and fine-grained parameter fine-tuning, respectively. The three figures on the left part of Fig. 6 show the selection of coarse-grained parameter pools on the three datasets, testing the performance of the model under different 𝑚 values. 𝑚 takes values of 50, 100, 150, 200, 250, 300 on ICEWS14 and ICEWS18, and 500, 1000, 1500, 2000, 2500, 3000 on ICEWS0515. On ICEWS14, as

the target entity that aligns with the most logical rule. For this example, the highest score is obtained by the repetition rule (1), which occurs at 𝑡−141, showing that the model can effectively capture the repetition pattern. The semantic logic of rules (2) and (5) is the same, but their relative timing differs: the causal event in (2) is closer to the current moment, so the corresponding rule gets a higher score, indicating that our model can effectively represent temporal information. Different confidence scores are obtained for rules (2) and (4) due to the difference in orientation, indicating the irreversibility of the rules. As we can see from Table 6, reason relations that may lead to the behavior "Consult" include "Make statement", "Express intent to meet or negotiate", and the repeated pattern of "Consult", which is reasonable from an emotional-analysis perspective.

5.9.2. Sample analysis

We present three examples of entity prediction on the ICEWS14 dataset in Table 7. From left to right, the table lists the path rules extracted by the relation path reasoning module, the corresponding path score 𝑆𝑟, the interaction relations extracted by the entity correlation reasoning module, the corresponding score 𝑆𝑒, and the candidate target entities.
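The temporal behavior discussed in the rule analysis above (a more recent supporting event yields a higher rule score, and a rule and its reversed counterpart carry independent confidences) can be sketched with a simple exponential time decay. This is an illustrative sketch, not the paper's actual scoring function: the rule labels, confidence values, and decay form are all invented for the example.

```python
import math

def rule_score(base_confidence: float, t_query: int, t_event: int,
               decay: float = 0.01) -> float:
    """Damp a rule's learned confidence by the age of its supporting
    event, so a recent grounding outranks the same rule grounded far in
    the past (the decay rate here is an arbitrary illustrative choice)."""
    return base_confidence * math.exp(-decay * (t_query - t_event))

# Hypothetical rule groundings for a "Consult" query:
# (label, learned base confidence, age of supporting event). Invented numbers.
t = 1000
groundings = [
    ("repetition: Consult -> Consult",       0.90, 141),
    ("Make statement -> Consult",            0.80,  40),
    ("reversed: Consult -> Make statement",  0.30,  40),  # own, lower confidence
    ("Make statement -> Consult (older)",    0.80, 300),
]

scored = sorted(
    ((rule_score(conf, t, t - age), label) for label, conf, age in groundings),
    reverse=True,
)
for score, label in scored:
    print(f"{score:.4f}  {label}")
```

Under this form, two groundings of the same rule differ only through the age of their supporting events, while a rule and its reversal differ through their learned base confidences, mirroring the two effects described above.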
X. Mei et al. Neural Networks 174 (2024) 106219
Table 7
Entity prediction visualization on ICEWS14. Since the dataset does not provide accurate time, we use 𝑡 to denote the time of the event. A check mark (√) indicates the ground-truth target entity.

Query: (Conference of Nigerian Political Parties (CNPP), Make an appeal or request, ?, 𝑡)
  Candidate: Government (Nigeria) (√)   Interaction: Make statement (𝑆𝑒 = 0.4744)
    Path: CNPP --[Make statement, 𝑡−43]--> Government   (𝑆𝑟 = 0.5080)
    Path: CNPP --[Accuse, 𝑡−258]--> Government          (𝑆𝑟 = 0.1098)
  Candidate: Citizen (Nigeria)          Interaction: Accuse (𝑆𝑒 = 0.1831)
    Path: CNPP --[Accuse, 𝑡−43]--> Citizen              (𝑆𝑟 = 0.2093)

Query: (Carrie Lam, Make an appeal or request, ?, 𝑡)
  Candidate: Protester (Hong Kong) (√)  Interactions: Consult; Make statement; Express intent to meet or negotiate; Engage in negotiation (𝑆𝑒 = 0.1525)
    Path: Protester --[Consult, 𝑡−40]--> Carrie Lam                             (𝑆𝑟 = 0.6498)
    Path: Carrie Lam --[Make statement, 𝑡−33]--> Protester                      (𝑆𝑟 = 0.5008)
    Path: Carrie Lam --[Engage in negotiation, 𝑡−39]--> Protester               (𝑆𝑟 = 0.3252)
  Candidate: Student (Hong Kong)        Interactions: Consult; Criticize or denounce; Halt negotiations; Express intent to meet or negotiate; Engage in negotiation (𝑆𝑒 = 0.1652)
    Path: Student --[Consult, 𝑡−33]--> Carrie Lam                               (𝑆𝑟 = 0.6167)
    Path: Carrie Lam --[Express intent to meet or negotiate, 𝑡−40]--> Student   (𝑆𝑟 = 0.4654)
    Path: Carrie Lam --[Engage in negotiation, 𝑡−35]--> Student                 (𝑆𝑟 = 0.2685)
  Candidate: China                      Interaction: Express intent to meet or negotiate (𝑆𝑒 = 0.0522)
    Path: Carrie Lam --[Express intent to meet or negotiate, 𝑡−32]--> China     (𝑆𝑟 = 0.4158)
  Candidate: Lawmaker (Hong Kong)       Interaction: Make optimistic comment (𝑆𝑒 = 0.0248)
    Path: Carrie Lam --[Make optimistic comment, 𝑡−9]--> Lawmaker               (𝑆𝑟 = 0.2462)

Query: (Military Personnel (Philippines), Make statement, ?, 𝑡)
  Candidate: Military (Philippines) (√) Interactions: Make statement; Praise or endorse; Demand (𝑆𝑒 = 0.5387)
    Path: Military Personnel --[Make statement, 𝑡−1]--> Military                    (𝑆𝑟 = 0.9971)
    Path: Military Personnel --[Make an appeal or request, 𝑡−294]--> Military       (𝑆𝑟 = 0.4718)
  Candidate: Criminal (Philippines)     Interaction: Make statement (𝑆𝑒 = 0.3448)
    Path: Military Personnel --[Make statement, 𝑡−9]--> Criminal                    (𝑆𝑟 = 0.9971)
    Path: Criminal --[Use unconventional violence, 𝑡−125]--> Military Personnel     (𝑆𝑟 = 0.6383)
  Candidate: Lawyer/Attorney (Philippines)  Interaction: Accuse (𝑆𝑒 = 0.1025)
    Path: Military Personnel --[Make statement, 𝑡−237]--> Lawyer/Attorney           (𝑆𝑟 = 0.9789)
    Path: Lawyer/Attorney --[Accuse, 𝑡−32]--> Military Personnel                    (𝑆𝑟 = 0.7027)
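Each candidate in Table 7 carries both a path score 𝑆𝑟 and an interaction score 𝑆𝑒, and the final prediction combines the two modules. Since this section does not spell out the fusion rule, the convex mixing weight below is a placeholder assumption; the per-candidate scores are taken from the first query in Table 7 (using each candidate's best path score).

```python
def rank_candidates(s_r, s_e, alpha=0.5):
    """Fuse per-candidate scores from the relation path reasoning module
    (s_r) and the entity correlation reasoning module (s_e) with a convex
    combination; a candidate missing from one module scores 0 there.
    alpha is an illustrative weight, not a parameter from the paper."""
    candidates = set(s_r) | set(s_e)
    fused = {c: alpha * s_r.get(c, 0.0) + (1 - alpha) * s_e.get(c, 0.0)
             for c in candidates}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Best path score and interaction score per candidate, first query of Table 7.
s_r = {"Government (Nigeria)": 0.5080, "Citizen (Nigeria)": 0.2093}
s_e = {"Government (Nigeria)": 0.4744, "Citizen (Nigeria)": 0.1831}

ranking = rank_candidates(s_r, s_e)
print(ranking[0][0])  # prints "Government (Nigeria)", the ground-truth entity
```

Because the case studies below show each module correcting the other in different queries, keeping both signals in the fusion matters more than the exact choice of weight.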
For the query (Conference of Nigerian Political Parties, Make an appeal or request, ?, 𝑡), the ground-truth target entity "Government (Nigeria)" is predicted with a higher score than the other candidate entities in both the relation path reasoning module and the entity correlation reasoning module.

For the query (Carrie Lam, Make an appeal or request, ?, 𝑡), the candidate entity "Student (Hong Kong)" has had more frequent recent interactions with the subject entity "Carrie Lam", so it gets a higher prediction score than the correct entity "Protester (Hong Kong)", causing the entity correlation reasoning module to fail to predict correctly. The relation path reasoning module can effectively alleviate such deviation: it correctly assigns the highest path score to the target entity "Protester (Hong Kong)", so that the model can finally predict the correct target entity.

For the query (Military Personnel (Philippines), Make statement, ?, 𝑡), the relation path reasoning module captures semantically identical paths for three candidate entities. This module can capture relative timing information, so the path occurring at time 𝑡−237 has a lower score than the paths at times 𝑡−1 and 𝑡−9. However, due to the small time difference between 𝑡−1 and 𝑡−9, the module cannot distinguish these two paths, which makes accurate prediction difficult. The entity correlation reasoning module, however, captures more historical interactions for the correct entity pair and obtains a higher prediction score, which compensates for the errors of the relation path reasoning module.

To sum up, the relation path reasoning module and the entity correlation reasoning module have their own strengths in different situations, and combining the advantages of the two modules improves the prediction accuracy and robustness of our proposed model.

6. Conclusion

In this research paper, we introduce a novel and interpretable model called ILR-IR for reasoning in temporal knowledge graphs. Our model is developed based on two key aspects: mining logic rules among relations and predicting interaction preferences between entities. Firstly, from the perspective of logic rules, we incorporate a relation path reasoning module that uncovers profound absolute causal relationships by analyzing relation paths. This module aims to identify and leverage the intricate connections among different relations. Secondly, considering interaction preferences, we design an entity correlation reasoning module that utilizes historical interaction data between entities to infer their relationships. This module focuses on capturing the patterns and tendencies emerging from their past interactions. These two modules work in synergy, complementing each other to enhance the overall performance of the ILR-IR model. By combining the logic rules and interaction preferences, our model achieves a harmonious balance, resulting in improved reasoning capabilities. To train the ILR-IR model, we devise training tasks from both a coarse-grained quadruple perspective and a fine-grained rule perspective. Additionally, we propose a one-class augmented matching loss for optimization during the training process. One of the notable advantages of ILR-IR is its ability to transfer knowledge and perform zero-shot reasoning on new datasets with a shared relation vocabulary. This characteristic makes the model versatile and adaptable to different knowledge graphs. Experimental results demonstrate the superiority of our proposed ILR-IR compared to state-of-the-art baseline models. The performance evaluations confirm the effectiveness and potential of our proposed model in addressing temporal reasoning challenges in knowledge graphs. However, we have omitted the temporal information in the interaction preference module. In the future, we plan to model the evolutionary process of interaction preferences to better track real-time interaction preferences between entities.

CRediT authorship contribution statement

Xin Mei: Software, Methodology. Libin Yang: Validation, Data curation. Zuowei Jiang: Writing – original draft, Validation. Xiaoyan Cai: Writing – review & editing, Supervision, Formal analysis. Dehong Gao: Supervision. Junwei Han: Writing – review & editing, Supervision. Shirui Pan: Supervision.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62372380, U20B2065 and U22B2036, and in part by the Natural Science Basic Research Program of Shaanxi under Grant 2024JC-YBMS-513, and sponsored by the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University, China under CX2023061.