
From Event Logs to Goals: A Systematic Literature Review of Goal-oriented Process Mining


Mahdi Ghasemi Prof. Daniel Amyot 
School of EECS School of EECS
University of Ottawa University of Ottawa
800 King Edward 800 King Edward
Ottawa, ON, K1N 6N5, Canada Ottawa, ON, K1N 6N5, Canada
E-mail: mghas020@uottawa.ca E-mail: damyot@uottawa.ca
Phone: (+1) 613-562-5800 ext. 6947

Abstract:
Process mining helps infer valuable insights about business processes using event logs,
whereas goal modeling focuses on the representation and analysis of competing goals of
stakeholders and systems. Although there are clear benefits in mining the goals of existing
processes, goal-oriented approaches that consider logs during model construction are still rare.
Process mining techniques, when generalizing large instance-level data into process models,
can be considered as a data-driven complement to use case / scenario elicitation. Requirements
engineers can exploit process mining techniques to find new system or process requirements
in order to align current practices and desired ones. This paper provides a systematic literature
review, based on 24 papers rigorously selected from four popular search engines in 2018, to
assess the state of goal-oriented process mining. Through two research questions, the review
highlights that the use of process mining in association with goals does not yet have a coherent
line of research, whereas intention mining (where goal models are mined) shows a meaningful
trace of research. Research about performance indicators measuring goals associated with
process mining is also sparse. Although the number of publications in process mining and goal
modeling is trending up, goal mining and goal-oriented process mining remain modest research
areas. Yet, synergetic effects achievable by combining goals and process mining can
potentially augment the precision, rationality and interpretability of mined models, and
eventually improve opportunities to satisfy system stakeholders.

Keywords: Business Process Management, Event Logs, Goal Mining, Goal Modeling,
Intention Mining, Performance Indicators, Process Mining, Requirements Engineering,
Systematic Literature Review.

Acknowledgments
This work is funded by the Discovery grant program of the Natural Sciences and Engineering
Research Council of Canada (NSERC). M. Ghasemi is further sponsored by the Ontario Graduate
Scholarship program and the NSERC Canada Graduate Scholarship program. The authors are
indebted to the anonymous reviewers for their feedback and suggestions for improvement.
1 Introduction
In organizations, processes are performed to meet the requirements of all corresponding
stakeholders and to achieve predefined goals. Processes are composed of activities that often
can be logged in files or databases during their execution. Process logging is becoming
increasingly popular for enabling process optimization and compliance. Yet, finding some
efficient ways of monitoring processes, while considering the (often conflicting) goals of all
stakeholders, is an intricate task.
On the one hand, process mining is an emerging research area and an evidence-based approach
looking to exploit event logs and to infer valuable insights about processes. In particular,
process mining plays an important role as a bridge between traditional model-based process
analysis (e.g., simulation) and data analysis techniques (e.g., data mining). This causes a huge
demand for data scientists who are not only able to analyze big data, but also to relate them to
actual operational processes. According to van der Aalst [100], process mining techniques fall
into one of three categories: i) discovery, where a model is produced using the event logs;
ii) conformance, where the data generated from the model is compared with the actual data in
event logs to assess reality against the model; and iii) enhancement, where an existing process
model is improved and/or extended.
On the other hand, goal modeling is a requirements engineering approach used to support
heuristic, qualitative, or formal reasoning schemes [46]. A goal is an objective meant to be
achieved by the system under consideration or one of its stakeholders. Goals are typically
modeled by some fundamental features such as their type and attributes, and by their links to
other goals and to other elements of a requirements model [106]. During the last decade,
several methods have been proposed that prescribe the application of goal-oriented
requirements engineering techniques for supporting business process management activities,
in particular business process modeling [67, 80]. While the process-oriented modeling
languages primarily focus on “how”, “what”, “where”, “who”, and especially “when”
questions, goal-oriented modeling focuses on answering “who”, “what”, and especially “why”
questions, hence offering a way to document intentions and rationales. A goal-oriented
approach leads the modelers to consider the opportunities that stakeholders look for and
vulnerabilities that they try to avoid, whereas modeling and answering “what” questions helps
identify capabilities, services, and architectures required to satisfy stakeholder goals [6].
Process mining constitutes a form of requirements engineering, from requirements
elicitation to validation. In particular, process discovery uses a form of crowdsourcing (where
logs are created from many system users performing tasks) and helps generate as-is process
models that reflect real behaviors recorded in logs. Process mining can be used as a data-driven
substitute or complement to use case / scenario elicitation. In scenario-based requirements
engineering [96], scenarios represent paths of possible behaviors through a use case, and they
are investigated to elaborate requirements. Scenario-based requirements engineering and
process mining both consider the question of how instance-level data can be generalized into
models, except that in process mining such data is orders of magnitude larger. Scenario-based
techniques (e.g., [91, 97, 98]), however, do not scale well to process mining-size problems.
Scenarios may be used to validate requirements, as ‘test data’ collected from observable
practice, against which the operation of a new system can be checked [97]. Conformance
checking is, also, a means to expose where the real process has deviated from a desired or
prescribed model. Process conformance and enhancement enable requirements engineers to
find new system/process requirements that will improve the alignment between current
practices and desired ones. When a deviation in an event log is considerably frequent, then this
situation should trigger the requirements engineer to re-assess the suitability of the system’s
goals and processes. The reasons behind frequent deviations could lead to new or modified
goals and requirements. This is where goal mining can contribute to the work of requirements
engineers, by supporting the discovery of goals from execution logs and improving the
soundness of goal models, requirements, and processes.
One limitation of using execution logs to reflect user behavior and as-is processes is that
the processes should be automated and instrumented precisely. However, when they exist, such
logs might be considered as a complement (or even substitute) to observations and interviews
with process participants.
One important research trend, aligned with increasing research interest on big data, is the
elicitation of requirements from massive collections of data [95]. Recent work on data-driven
and user-centered approaches in software requirements engineering [51, 69] is part of this
trend. These approaches [51, 69] focus on the identification, prioritization, and management
of software requirements based mostly on online feedback and reviews about software
products that users can easily submit in app stores, social media, or user groups. The current
review is however different from such approaches as it focuses on execution logs, processes
and goals.
Combining goals with processes leads to many known benefits. For example, Horkoff et
al. [48] recently reviewed 243 papers to assess how goal modeling can impact downstream
activities such as architectural decisions and process/system development. Many identified
relationships suggest that the use of goal models can have a major positive influence on:
• the content and structure of processes, based on business and tradeoff analysis;
• process quality, through measurement and compliance;
• decision making, business intelligence, and predictions;
• process alignment, configuration, adaptation, and evolution.
Process alignment with goals can also be checked to some extent at the structural level with
the help of consistency rules [2].
Relevant to the core of our work, Papadimitriou et al. [76] conducted a review to provide a
common reference point to the main concepts and challenges that should be defined when
someone intends to approach a data analysis task from a goal-oriented point of view. To this
end, that review covers goal-aware systems as a whole and does not focus on a specific area or
discipline. Goal-aware systems are able to perform three main tasks, all of which are related to
goals: i) goal model construction, to model and store goals, ii) goal inference, to identify the
goals given a set of observed actions performed by users, and iii) goal exploitation, to use the
inferred goals for adapting the system’s functionality according to the goals. In other words, a
goal-aware system requires the mechanisms to record the goal-related data, analyze it, and then
use the analysis results to recognize the actors’ goal, and respond accordingly [76]. That review
also establishes a common terminology and provides formal definitions for a number of
concepts such as environment, action, goal (soft/hard), and plan. Then, employing those
definitions, the most common techniques for constructing models, inferring the goals, and
exploiting the results are presented.
In contrast to Papadimitriou et al. [76], who covered goal-aware systems at a high level
(with an emphasis on information retrieval) without getting into the details of specific
techniques, our review focuses on one type of goal inference approach, not covered in their
paper. Our review aims to assess existing work and opportunities where processes and their
goals are mined from evidence such as process execution logs. Considering that process mining
is the main field of research that focuses specifically on inferring knowledge from event logs,
we study the works related to goal-oriented process mining. Such assessment, which has not
been done in the past, requires an understanding of research that combines goal-oriented
concepts with process mining techniques. In such a context, a systematic literature review
(SLR) concerned with the intersection of both areas, can help provide insights into the above
objective. An SLR is a methodologically rigorous review of relevant research results [61]. In
such a review, systematic, explicit, and trackable methods are employed to search, identify,
select, appraise, and synthesize relevant research activities in order to answer some clearly
formulated research questions. This paper’s review is systematic, focuses on the research
where process mining and goal modeling are combined, and provides accurate insight about
the status of goal-oriented process mining activities. Such a review is beneficial to requirements
engineers interested in obtaining evidence-based definitions of existing processes and their
goals prior to defining requirements for their modification or improvement.
The organization of this paper is as follows. The next section presents a brief overview of
process mining and goal-oriented modeling. In Section 3, existing papers related to goal-
oriented process mining are systematically collected according to the needs of two research
questions. This part highlights the absence of existing reviews related to goal-oriented process
mining, as well as the relative paucity of work done at the intersection of process mining and
goal-oriented modeling. Section 4 analyses the contributions of 24 relevant papers along three
categories: goal modeling, intention mining, and performance indicators. Section 5 synthesizes
and reflects on the selected papers, and presents the value added by this review. In addition,
Section 6 discusses several threats to the validity of this review. Finally, Section 7 provides
conclusions, including a discussion of the impact on research and practice, and highlights
important future work items for the requirements engineering community.

2 Overview of process mining and of goal-oriented modeling


This literature review aims to take into consideration the papers focusing on goal modeling
and process mining at the same time. First, an overview of these individual research areas will
be helpful to highlight the main concepts of each of these domains.

2.1 Process mining


Cook and Wolf [19] published the first article on process mining while they were working on
discovering process models from event logs in the context of software engineering [3]. Since
then, several groups focused on process mining models and developed different algorithms and
implemented tools and frameworks. As Ghasemi and Amyot [39] demonstrated in a recent
literature review, research in process mining has gained much interest over the past decade
with 466 unique papers published in 2015, up from 294 papers in 2014.
Process mining is providing some new techniques to infer valuable knowledge and insights
from event logs. For example, the electronic patient records in a healthcare system or the
transaction logs of an enterprise resource planning system can be used to discover models
describing processes, organizations, and products [100]. Moreover, such event logs can also
be used to compare reality with prior models to determine their mutual alignment [89].
In general, process mining activities are distinguished into three main types: 1) discovery,
2) conformance, and 3) enhancement (see Fig. 1). In the discovery type, the aim is to produce
a process model using event logs. There is no prior model involved in a process discovery
technique. The inferred model should be able to describe the observed behavior of a process.
In conformance, an existing process model is compared with observed behavior in the event
logs. If there is a deviation between the model and the event log, it can be further analyzed, for
example, to detect activities in the model that do not exist in the event log or vice versa [89,
65]. In enhancement, the target is to change or improve an a priori model. Note that it is
supposed that a model, either produced manually or discovered, already exists [100, 101].

[Figure: the “world” (people, organisations, machines, components, business processes) is supported and controlled by a software system, which records events in an event log; a process model models the world and specifies, configures, implements, and analyzes the system; discovery, conformance, and enhancement connect the event log and the process model.]

Fig. 1 Three types of process mining: (1) Discovery, (2) Conformance, and (3) Enhancement. Adapted from [100].

Researchers and practitioners have proposed many process mining algorithms, each of which
performs differently on different kinds of processes. As a result, it
is not easy to select an appropriate process mining algorithm for a given application
domain [111]. The α-algorithm [104] illustrates many of the general ideas shared by process
mining algorithms and helps in understanding the concept of process discovery. However, the
α-algorithm is not a very practical mining technique as it has some problems with noise,
infrequent behavior, incomplete behavior, and complex routing constructs [100]. Heuristic
Mining algorithms [112] use a representation like graphs whose nodes capture activities and
arcs describe causal dependencies. These algorithms consider frequencies of events and
sequences when making a process model.
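
The frequency-based idea behind heuristic mining can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy log and the function names are hypothetical, and the dependency formula is the form commonly given for the Heuristics Miner, which may differ in detail from the variants covered in [112].

```python
from collections import Counter

# Hypothetical toy event log: each trace is the ordered activity list of one case.
log = [
    ["register", "check", "pay", "ship"],
    ["register", "pay", "check", "ship"],
    ["register", "check", "pay", "ship"],
]

def directly_follows(traces):
    """Count how often activity b directly follows activity a across all traces."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def dependency(counts, a, b):
    """Frequency-based dependency measure in (-1, 1).

    Values close to 1 suggest that a is usually followed by b and rarely
    the other way round; values near 0 suggest no clear ordering.
    """
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)

df = directly_follows(log)
strong = dependency(df, "register", "check")   # observed in one direction only
weak = dependency(df, "check", "pay")          # observed in both directions
```

Arcs whose dependency value falls below a chosen threshold would be left out of the resulting graph, which is how infrequent or noisy behavior gets filtered in this style of algorithm.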
The ProM framework [83, 105], an extensible environment, is the most popular tool for
process mining. ProM is flexible in terms of the input and output formats, and allows reusing
code throughout the implementation of novel process mining ideas. Process mining capabilities
are also provided by a growing number of commercial analysis tools such as Disco [33].

2.2 Goal modeling


Goals are objectives that stakeholders and systems aim to achieve. Goals can be used at
different levels of abstraction. Goal-oriented requirements engineering is concerned with the
use of goals for requirements elicitation, elaboration, structuring, specification, analysis,
negotiation, documentation, and modification [5, 106]. Whereas processes excel at answering
“when” questions, goals answer “why” questions that are not handled by process models, and
hence both views are complementary.
A goal model is a directed graph that shows how goals contribute to each other [16]. AND-
refinement links relate a goal to decomposing sub-goals. Here the parent goal is satisfied if and
only if all sub-goals are satisfied. OR-refinement links relate a goal to some alternative
refinements. This means that satisfying one of the sub-goals is sufficient for satisfying the
parent goal. Contribution links between goals capture knowledge about the extent to which
goals positively or negatively influence other goals. In this context, a conflict between two
goals is present when the satisfaction of one goal may prevent the satisfaction of the
other [106]. In case a goal contributes negatively to another goal in the goal model and the
former one is satisfied, then the latter is unsatisfied. These kinds of rules are used for qualitative
reasoning about goal satisfaction, e.g., in the Goal-oriented Requirement Language (GRL) [5].
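
The AND/OR refinement rules described above can be illustrated with a minimal Python sketch. The class, the field names, and the example goals are hypothetical; contribution links and the qualitative satisfaction levels used by languages such as GRL are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Goal:
    name: str
    refinement: str = "LEAF"            # "AND", "OR", or "LEAF"
    children: List["Goal"] = field(default_factory=list)
    satisfied: bool = False             # only consulted for leaf goals

def is_satisfied(goal: Goal) -> bool:
    """AND: all sub-goals must hold; OR: one sub-goal is enough."""
    if goal.refinement == "AND":
        return all(is_satisfied(g) for g in goal.children)
    if goal.refinement == "OR":
        return any(is_satisfied(g) for g in goal.children)
    return goal.satisfied

# Hypothetical example model.
card = Goal("Pay by card", satisfied=False)
cash = Goal("Pay in cash", satisfied=True)
payment = Goal("Collect payment", "OR", [card, cash])
receipt = Goal("Issue a receipt", satisfied=True)
root = Goal("Handle transaction", "AND", [payment, receipt])

before = is_satisfied(root)   # one payment alternative holds
cash.satisfied = False
after = is_satisfied(root)    # no payment alternative holds any more
```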
In the context of requirements engineering, one benefit of goal modeling is to support
heuristic, qualitative, or formal reasoning schemes. Moreover, one can verify that the
requirements entail predefined goals [106]. Goals and requirements are elicited from multiple
stakeholders with different viewpoints. Therefore, a wide range of conflicts can arise during
analysis [109]. In order to reason about conflicting goals, one of the major benefits of goal
modeling is to support trade-off analysis, targeting a solution that brings an appropriate balance
when not all goals can be satisfied simultaneously. Goal models are reputed to be efficient in
identifying and resolving conflicting concerns [109], and in producing robust requirements
through obstacle analysis [107, 110].
The goals’ intrinsic features such as their types and attributes, together with their links to
other goals and to other elements (e.g., containing actors, indicators, and requirements), make
a goal model [106].
Several classification axes and types of goals have been proposed in the literature.
Functional goals underlie services that should be delivered by the system. The functional goals
characterize the expected system’s behaviors, for example “Issue a receipt for every
transaction”. In contrast, non-functional goals talk about expected qualities of the system, for
example “the receipt should be easy to read”. Non-functional goals such as security, safety,
usability, flexibility, customizability, and so forth, are often captured with soft goals [6].
Several languages and notations have been developed by the requirements engineering
community to explicitly model goals and their relationships. For example, popular languages
include GRL [4], Keep All Objects Satisfied (KAOS) [108], MAP [87], the NFR Framework
[18], i* [116], the Business Intelligence Model (BIM) [47], and Tropos [14]. In 2008, GRL
was recognized as an international standard for goal-oriented modeling, as part of a
Recommendation of the International Telecommunication Union named User Requirements
Notation (URN) [50]. It is noteworthy that URN is a standard notation capable of dealing with
activity process models (in the form of Use Case Maps) and with goal models. URN also
supports Key Performance Indicators (KPIs) and other concepts to measure and align processes
and goals [82].
Another goal language, MAP, was proposed by Rolland et al. [87] to address a new
category of intentional process models in the context of business requirements engineering. As
MAP is the notation used in intention mining research activities (to be discussed in Section
4.2), it is important to explain that language here.
A process, while being performed, is not limited to linear activities; actors, based on their
context, have a variety of choices for performing a task. MAP models direct the actor by
proposing dynamic choices aligned with their intentions. A MAP process model is a labelled
directed graph that consists of intentions as nodes and strategies as arcs between intentions.
The directed nature of the graph shows how the intentions can follow one another. Fig. 2 shows
a simple example of a MAP intentional model for buying a smart phone.

[Figure: MAP graph linking the intentions Start, Selection of a smart phone, Buying the smart phone, and Stop through the strategies “By asking friends”, “By checking reviews”, “By consulting with a seller”, “By comparing several ideas”, “By retraction”, “By shopping in store”, “By online shopping”, “By failure”, and “By the phone in hand”.]

Fig. 2 A MAP intentional model for guiding users who want to buy smart phones. There are
four intentions (I1-I4, including Start and Stop) and nine strategies (S1-S9) available to achieve
the intentions.

There are three key elements that make a MAP: Intention, Strategy, and Section. An
Intention is a goal that can be achieved by performing some activities. For example, there are
two intermediate intentions in Fig. 2: Selection of a smart phone, and Buying the smart phone.
A Strategy is a means to achieve an intention. For example, “By asking friends” is a strategy
to achieve the phone selection intention. Finally, a Section is a triplet composed of a Source
Intention, a Target Intention, and a Strategy linking them. Therefore, a MAP consists of
several sections, each of which is represented as a triplet <Is, It, Sst>. In our example, one
such section is <Selection of a smart phone, Buying the smart phone, By online shopping>.
In a MAP, there might be several alternative links from Is to It, each corresponding to an
explicit strategy (multi-thread flow) to reach It from Is. There might also be several strategies
from different intentions to reach an intention It (multi-flow paths to achieve an intention).
Finally, a MAP can contain reflexive flows, i.e., Is and It are the same intention. There are two
special intentions called Start and Stop, which represent respectively the intentions to start
crossing the map and to stop crossing it. As there can be several strategies that could be selected
between two successive intentions, there are usually many paths in the graph from Start to
Stop.
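
A MAP can be represented directly as a list of sections, as in the following hedged Python sketch. Only the section <Selection of a smart phone, Buying the smart phone, By online shopping> and the role of “By asking friends” come from the text; the source and target intentions of the other sections are assumptions made for illustration, and only a subset of the nine strategies is included. Because reflexive flows allow unbounded repetition, the enumeration uses each section at most once per path.

```python
from typing import NamedTuple

class Section(NamedTuple):
    source: str     # source intention (Is)
    target: str     # target intention (It)
    strategy: str   # strategy linking them (Sst)

SELECT = "Selection of a smart phone"
BUY = "Buying the smart phone"

sections = [
    Section("Start", SELECT, "By asking friends"),          # source assumed to be Start
    Section("Start", SELECT, "By checking reviews"),        # assumed endpoints
    Section(SELECT, SELECT, "By comparing several ideas"),  # assumed reflexive flow
    Section(SELECT, BUY, "By online shopping"),             # section given in the text
    Section(SELECT, BUY, "By shopping in store"),           # assumed endpoints
    Section(BUY, "Stop", "By the phone in hand"),           # assumed endpoints
]

def paths(sections, current="Start", used=frozenset()):
    """Yield strategy sequences from `current` to Stop, using each section at most once."""
    if current == "Stop":
        yield []
        return
    for i, s in enumerate(sections):
        if s.source == current and i not in used:
            for rest in paths(sections, s.target, used | {i}):
                yield [s.strategy] + rest

all_paths = list(paths(sections))
```

Two sections sharing the same source and target model a multi-thread flow, while the reflexive section shows how an intention can be revisited before moving on.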
Rolland and Salinesi [88] have introduced and discussed goal/strategy MAPs as an example
of goal models conceived to meet the challenges faced by multi-purpose systems, i.e., systems
that incorporate variability in the functionality they provide and that can self-adapt to different
situations. Kraiem et al. [62] proposed a novel approach for mapping intentional process
models represented in MAP to business process models expressed in the Business Process
Model and Notation (BPMN) [73]. The proposed approach links the MAP elements to BPMN
elements in order to align the intentional and operational levels.

3 Literature review of goal-oriented process mining


This section aims to review the papers related to goal-oriented process mining. The studies
focusing on the combination of goal modeling and the capabilities of process mining
techniques are discussed. The methodology of this SLR, including the research questions, the
steps for selecting relevant work, the identification of search engines, the selection criteria, and
the inclusion/exclusion criteria are provided. Then, among the selected papers, those that are
related to the topic of this research are categorized and their contributions are summarized.
3.1 Research questions
Inspired by the work of Kitchenham et al. (2009), this systematic literature review aims to
answer the following research questions:

RQ1. What is the share of goal-oriented approaches in process mining studies? This
question aims to determine whether goal-oriented process mining is a mature
research field or whether a gap currently exists.
RQ2. How can the studies that combine goal modeling and process mining
techniques be categorized, and what are the main contributions of the papers in these
categories?

3.2 Identification of search engines


A recent literature review of process mining reviews [39] considered four of the most relevant
search engines in information technologies, namely Scopus, Google Scholar, IEEE Xplore,
and Academic Search Complete (ASC), together with PubMed, Embase, Clinical Evidence,
and the Cochrane Library, which are four important healthcare search engines. That review
showed that 73.5% and 42.6% of the unique papers related to process mining within these eight
search engines could be returned by Scopus and Google Scholar, respectively. In addition,
95.6% are covered by at least one of these two engines and 19.7% of the papers are mutually
found by both. Therefore, if one simply uses these two search engines, one will miss only
about 5% of the papers, which were returned only by the other six search engines (mostly IEEE
Xplore and ASC). However, to be on the safe side, in addition to Scopus and Google Scholar,
IEEE Xplore and PubMed were used in this research. As this review targeted the papers
positioned in the intersection of process mining and goal modeling areas, the desired papers
are necessarily positioned in the process mining area.

3.3 Keywords and queries


As the authors were looking for papers focusing on process mining and goal modeling, they
defined two sets of keywords, i.e., one for each area. Then, the following queries were defined
in all four search engines looking for papers that include at least one of the keywords from
each set:
(goal[Keywords] OR "goal-oriented" OR "Goal modelling" OR "Goal
modeling" OR "Goal model" OR "Goal mining" OR "Goal monitoring" OR
"requirements engineering" OR "user requirements notation" OR KAOS OR
istar OR "i-star" OR "i star" OR "NFR Framework" OR Tropos OR GRL OR
intention OR intentional OR "performance indicator" OR KPI)
AND
("Process mining" OR "Process discovery" OR
"conformance checking" OR "event log")

Note that this query was constructed iteratively, i.e., the authors started with some essential
keywords such as goal modeling, and then the returned papers informed them of other relevant
and related keywords such as intention and requirements engineering.
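
The logic of the query (at least one term from each keyword set, with the bare word “goal” searched only in the keywords field) can be approximated by the Python sketch below. The record layout and the two sample records are hypothetical, and plain substring matching is a simplification; the actual matching rules of each search engine differ.

```python
# Keyword sets taken from the query above, lower-cased for matching.
GOAL_TERMS = [
    "goal-oriented", "goal modelling", "goal modeling", "goal model",
    "goal mining", "goal monitoring", "requirements engineering",
    "user requirements notation", "kaos", "istar", "i-star", "i star",
    "nfr framework", "tropos", "grl", "intention", "intentional",
    "performance indicator", "kpi",
]
PROCESS_TERMS = ["process mining", "process discovery", "conformance checking", "event log"]

def matches(record):
    """Return True if the record hits at least one term from each keyword set."""
    keywords = [k.lower() for k in record["keywords"]]
    text = " ".join([record["title"], record["abstract"]] + keywords).lower()
    # The bare word "goal" is only searched within the keywords field.
    goal_hit = any("goal" in k for k in keywords) or any(t in text for t in GOAL_TERMS)
    process_hit = any(t in text for t in PROCESS_TERMS)
    return goal_hit and process_hit

# Hypothetical records used only to exercise the filter.
relevant = {
    "title": "Discovering intentions from event logs",
    "abstract": "We mine user intentions.",
    "keywords": ["process mining"],
}
noise = {
    "title": "A fast process discovery algorithm",
    "abstract": "The goal of this paper is speed.",
    "keywords": ["Petri nets"],
}
```

The second record shows why “goal” is restricted to keywords: the word appears in its abstract in the everyday sense, so it is not counted as a hit.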
Initially, the query was run against the full text of the papers. This query returned thousands
of papers, but most of them were irrelevant. In order to end up with manageable, meaningful,
and relevant results, the search was limited to metadata (title, abstract and keywords). As
searching within abstracts and keywords is not supported by Google Scholar, queries on that
engine were limited to only the title (patents and citations were also excluded). Also, given
that the word “goal” is a popular word used in almost all paper abstracts (e.g., “the goal of this
paper is...”), the queries were limited to look for this specific word only within the keywords
(the other search terms were searched for title, abstract, and keywords). This constraint reduced
the number of results dramatically and enabled the authors to analyze the related papers while
avoiding noise from less relevant papers. To conserve the ability of finding as many papers as
possible, the review was not consciously limited to a particular time period.
The results of the query are shown in the first row of Table 1. Finally, after removing
duplicates among the results of the four search engines, 92 papers remained.
Table 1. Number of Papers Returned by the Search Engines

Google IEEE
Search within Scopus PubMed
Scholar Explore
Limited to
1 11 81 10 7
title/abstract/keywords
Duplicates removed,
2 92
across all engines

3.4 Frequency analysis of publications


In order to find the position of goal-oriented modeling in the process mining research area (and
vice versa), two more queries were run with Scopus looking for the aforementioned sets of
keywords separately. This time, the keywords were {"goal-oriented" OR "goal model"
OR "intention mining" OR "user requirements notation" OR KAOS OR istar OR
"i-star" OR "i star" OR "NFR Framework" OR Tropos OR GRL OR "Goal mining" OR
"Goal monitoring" OR "intentional modeling" OR "Goal modeling" OR "Goal
modelling"}, (this time only) within the indexed keywords of journal and conference papers.
KPI and "performance indicator" were omitted because these keywords independently do
not refer to the goal modeling field. Also, the same engine was queried with the keywords
{"Process mining" OR "Process discovery" OR "conformance checking" OR "event
log"}. The numbers of papers returned through each query were 3048 and 2203, respectively.
These high numbers clearly show that the papers at the intersection of these areas of research
(92) are quite sparse.
In addition, the authors considered the frequencies of all keywords that have been indexed
by the 2203 papers related to process mining returned by Scopus. It is not surprising that the
keywords “process mining” and “data mining” are the top ones, by far. However, the word
“goal” did not show up within the top 160 keywords that were ranked based on frequencies.

3.5 Selection criteria and related papers


In order to exclude the papers that are irrelevant to our two research questions, exclusion
criteria were defined.
These criteria helped exclude papers that:
• only cited process mining and goal modeling or their derivatives to acknowledge
their existence or to mention them in passing in some examples; or
• were related to only one of the two areas considered; or
• were neither clearly related to process mining nor to goal modeling; or
• were not written in English.

The aforementioned criteria resulted in 24 papers (including one thesis) in which a research
activity related to the topic of this review was reported (see Table 2). A quality assessment
confirmed that the papers (especially those returned by Google Scholar) had sufficient
scientific content and were not coming from predatory journals or conferences. In the
following section, these specific papers will be reviewed.

Table 2. Selected Papers Related to Goal-oriented Process Mining, with Citations from Google Scholar on December 21, 2018

Paper  Code  Makes reference to            Year  Source                Cited  Goal model      Proposed goal         Case study/
                                                                              representation  mining algorithm?     evaluation?

Category: Goal Modeling & Requirements Elicitation
[8]    P1    -                             2012  Conf: ICITCS          0      None            No                    No
[7]    P2    -                             2012  Conf: ICCSA           0      None            Yes                   Yes
[115]  P3    -                             2014  Journ: TMIS           3      AND/OR graph    Yes                   Yes
[45]   P4    -                             2015  Conf: SEKE            0      AND/OR graph    Yes                   Yes
[81]   P5    P12                           2015  Conf: BPM             3      KAOS            No                    Yes
[75]   P6    -                             2016  Journ: SoSyM          9      None            No                    Yes
[11]   P7    -                             2017  Conf: BPM             4      None            No                    No
[92]   P8    -                             2017  Conf: ER              0      AND/OR graph    Yes                   Yes
[22]   P9    -                             2017  Conf: RE              1      None            No                    Case study
[114]  P10   -                             2017  Journ: J Healthc Eng  1      None            Yes                   Yes

Category: Intention Mining
[52]   P11   P12                           2013  Conf: RCIS            6      Mentioned and compared some of them   Yes
[55]   P12   P13, P11                      2013  Conf: RCIS            23     MAP             Yes                   Yes
[54]   P13   P11, P12                      2013  Conf: EMMSAD          21     Mentioned and compared some of them   Yes
[58]   P14   P11, P12, P13                 2014  Conf: RCIS            3      MAP             Yes                   Yes
[24]   P15   P11, P12, P13                 2014  Journ: IJISMD         7      MAP             Yes                   Yes
[57]   P16   P11, P12, P13                 2014  Conf: MSR             17     MAP             Yes                   Yes
[56]   P17   P12, P16                      2014  Conf: BPMDS           11     MAP             Compare               Yes
[53]   P18   P11, P12, P13, P14, P16, P17  2014  Thesis: U. Paris-Est  2      MAP             Yes                   Yes
[29]   P19   P12                           2014  Conf: CAiSE           6      MAP             Yes                   Yes
[59]   P20   P14, P15, P16, P17, P18       2015  Journ: IJISMD         3      MAP             Yes                   Yes

Category: KPI
[63]   P21   -                             2013  Conf: CBI             9      Performance analysis                  Yes
[64]   P22   P21                           2014  Conf: CBI             2      Performance analysis                  Yes
[23]   P23   -                             2017  Conf: OTM             3      Model redesigning with KPIs           Yes
[17]   P24   -                             2017  Journ: DSS            2      Performance analysis                  Yes

4 Analysis of selected papers


After considering the selected papers and clustering their main subjects, the papers were
categorized into three main areas: Goal Modeling and Requirements Elicitation, Intention
Mining, and Key Performance Indicators (KPI). In this section, the selected papers are
analyzed and their contributions are summarized.

4.1 Goal modeling and requirements elicitation


Business processes are designed to achieve certain goals, namely process goals. Based on these
process goals, individual agents are assigned to perform the related tasks composing the
process. However, the agents generally belong to different sections/units and have different
interests, and their behavior is not usually observable. In the real world, agents have intentions
reflecting their own interests, namely agent goals. In order to satisfy these goals, the agents
may not always behave according to a designed process. The inconsistencies or, in some cases,
conflicts between process goals and agent goals often result in a reduced efficiency or
effectiveness of the executed process [115]. However, traditional business process
management ignores the feature of self-interest in the business process and adopts an inter-
organization workflow perspective [99]. This phenomenon is common in the context of agent-
oriented business processes, which have been the focus of the literature in some research
domains such as healthcare [1], supply chain management [9], financial fraud
management [60], and others. Process mining could make a valuable contribution to addressing
this common problem. However, the literature lacks studies on how to discover
agent goals in business processes. To fill this gap and to address that problem, Yan et al. [115]
proposed an agent-oriented goal mining approach for modeling, discovering, and analyzing
agent goals in executed business processes using event logs. This is clearly positioned in the
context of process discovery as a type of process mining activity.
Yan et al. [115] used the classical goal reasoning framework of belief-desire-intention
(BDI) logic [35] to analyze an agent’s choices using event logs with domain data from the
business process. They argue that, compared to other computational decision models, the BDI
model has several advanced features that meet the requirements of agent goal mining.
Yan et al. [115] applied the framework shown in Fig. 3 to a sample process in which
external reviewers are invited to evaluate loan requests.

[Figure: three numbered stages. (1) Data Collection: an event log (Case ID, Activity, Originator, Timestamp, Data) and a domain database (agent's information). (2) Business Analysis: process model discovery, a decision point miner, and a goal miner, producing the process model, decision rules, and goals. (3) Agent's goal: goals and sub-goals (e.g., proposal quality, affiliation, free time) attached to the activities "Assign reviewer", "Evaluate claim", and "Issue payment", alongside the business goal of the process.]

Fig. 3 Yan and Hu’s framework proposing an agent goal mining approach. Adapted from [115].

Three activities are executed in the process. First, an agent assigns a reviewer for each loan
request. Then, a reviewer performs the activity of evaluating the request, and finally the request
is either assigned to a new reviewer or sent to the payment agent. The authors have elaborated
this example, which has been supported by some event logs used to mine the agent goals
(number 3 in Fig. 3).

In their proposed goal mining algorithm, the goals are expressed as decision tree rules. In
order to discover an agent’s goal, Yan et al. [115] adopt an approach to discover the agent’s
desire-accessible worlds based on the agent’s previous activities. They assume that if an agent
has chosen an activity, she knows the effects of the activity and believes that she achieves her
goal by performing this activity. Thus, one can learn the agent’s goal by classifying the agent’s
activities in different situations. In other words, the agent’s desire-accessible worlds can be
identified by machine learning methods. Yan et al. [115] adopted the decision tree algorithm
(C4.5) to construct the desire-accessible worlds, i.e., the agent’s goal, within their example. As
inputs, their proposed algorithm needs event logs and the agent-oriented Petri net process
model discovered by process mining algorithms. The outputs of the algorithm are the agents’
goals at each state.
Yan et al. [115] evaluated their framework using simulation and compared
their goal-mining-based approach with the best-performer-based approach, using a t-test to
test the improvement hypothesis. In their case study, the goal-mining-based approach did not
significantly improve effectiveness (= 1 – Number of requests wrongly graded / Number of
requests) compared with the best-performer-based approach. However, it significantly
improved efficiency (= Number of requests / Evaluation times).
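The two measures above can be written out directly. The following is a minimal sketch, not code from [115]: the function names are hypothetical, the figures are invented for illustration, and reading "Evaluation times" as the total number of evaluations performed is an assumption.

```python
def effectiveness(wrongly_graded: int, total_requests: int) -> float:
    """Effectiveness = 1 - (wrongly graded requests / all requests)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 1 - wrongly_graded / total_requests


def efficiency(total_requests: int, evaluation_times: int) -> float:
    """Efficiency = requests / evaluations (assumed reading of 'Evaluation times')."""
    if evaluation_times <= 0:
        raise ValueError("evaluation_times must be positive")
    return total_requests / evaluation_times


# Invented figures: 200 requests, 30 wrongly graded, and 260 evaluations
# in total because some requests were reassigned to a new reviewer.
print(effectiveness(30, 200))
print(efficiency(200, 260))
```

Under this reading, an efficiency of 1.0 would mean that no request needed a second evaluation.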
Along with the goal-oriented approach of Yan et al. [115], Armentano and Amandi [7]
focused on the automatic recognition of the goal that motivates an employee (agent) to perform
a particular sequence of tasks. This is crucial to determine what tasks are expected to be
executed next in order to achieve the goal within the dynamics of the organization.
Traditional process discovery techniques [20] do not focus on detecting the goals that
motivate actors to do their tasks. In contrast, the work of Armentano and Amandi [7]
belongs to the area of goal recognition. They considered the prediction of
the subject’s goal as an inherently uncertain task and therefore looked for a knowledge
representation able to capture and model this kind of uncertainty.
The Markovian approach proposed by Cook and Wolf [19] is one of the traditional
approaches for process discovery. The approach presented by Armentano and Amandi [7]
differs from it in that Cook and Wolf used first- and second-order Markov chains,
whereas Armentano and Amandi used variable-order Markov models.
In this proposed approach, the model is constructed from the observation of the tasks that
an employee executes and from the goal that motivates the execution of those tasks. The model
predicts the most probable agent’s “class” (goal) after each performed activity. A learning
algorithm is used to make a model of the tasks necessary to achieve different goals. This
knowledge could be preserved in the organization, contributing to the organizational memory.
Finally, this can contribute to detecting the goal of an employee at any time and can help
identify deviations from expected behaviors [7].
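To make the idea of goal recognition with variable-order Markov models more concrete, the sketch below trains one model per goal and predicts the most likely goal for an observed task sequence. It is a simplified illustration, not the algorithm of [7]: the class name, the longest-context back-off scheme, the Laplace smoothing, and the loan-related task names are all assumptions.

```python
import math
from collections import defaultdict


class GoalRecognizer:
    """Per-goal variable-order Markov model with longest-context back-off."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[goal][context][next_task] -> occurrences in training traces
        self.counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
        self.vocabulary = set()

    def train(self, goal, tasks):
        self.vocabulary.update(tasks)
        for i, task in enumerate(tasks):
            # Record the task under every context length from 0 to max_order.
            for order in range(min(i, self.max_order) + 1):
                context = tuple(tasks[i - order:i])
                self.counts[goal][context][task] += 1

    def _log_likelihood(self, goal, tasks):
        total = 0.0
        for i, task in enumerate(tasks):
            # Back off from the longest context to the empty one.
            for order in range(min(i, self.max_order), -1, -1):
                context = tuple(tasks[i - order:i])
                if context in self.counts[goal]:
                    break
            following = self.counts[goal][context]
            # Laplace smoothing keeps unseen tasks from zeroing the score.
            prob = (following.get(task, 0) + 1) / (
                sum(following.values()) + len(self.vocabulary))
            total += math.log(prob)
        return total

    def predict(self, tasks):
        """Return the trained goal under which the observed tasks are most likely."""
        return max(self.counts, key=lambda g: self._log_likelihood(g, tasks))


recognizer = GoalRecognizer(max_order=2)
recognizer.train("approve loan", ["receive", "check credit", "approve", "notify"])
recognizer.train("approve loan",
                 ["receive", "check credit", "request documents", "approve", "notify"])
recognizer.train("reject loan", ["receive", "check credit", "reject", "notify"])
print(recognizer.predict(["receive", "check credit", "reject"]))
```

Calling `predict` after each newly observed task mirrors the description above, where the most probable goal is updated after each performed activity.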
Normally, any industry-scale business process takes different variants that ideally
contribute to achieving the goals of an organizational goal model. Consequently, the adequate
management of process models with many variations mandates considering each process
variant as an independent model entity. Ponnalagu et al. [81] worked on the admission of a
deviating process instance as a valid variant of the intended process that achieves the same
goals as the intended process. They focused on concept and goal conformance drift that arises
with evolutionary changes in process executions. This is a crucial aspect,
especially in knowledge-intensive processes, where significant drifts from valid (goal aligned)
workflows could happen due to manual errors and environmental constraints.
There exist studies that worked on the categorization of processes [70], discovering a
family of process variants from a collection of event logs [15], and on the management of large
collections of process variants of a single process model [42]. In their approach, Ponnalagu et
al. [81] validate and categorize the discovered family of variants based on a goal model. Their
approach can leverage the works that focus on enriching process designs with goal-driven
configurations [65]. Categorizing process execution deviations in a goal-based fashion is
necessary to decide if a deviation represents a valid variant. Ponnalagu et al. [81] proposed an
approach to help with the decision of whether process instances in execution logs are valid
variants, using a goal-based notion of validity. Their approach also supports the analysis of the
impact of contextual factors in the execution of specific goal-aligned process variants. They
used the KAOS methodology for goal decomposition. Moreover, they examined their approach
with an Eclipse-based plugin and, to validate their approach, they considered a process log of
25,000 events obtained from the history of a help desk division in an IT organization. They
demonstrated that, with semantic annotations of process models, goal models expressed in
terms of end effects can be applied accurately in a given domain. Their approach is also
general enough to handle larger collections of processes, i.e., it is not restricted to goal
models or process models of any particular scale or domain.
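A goal-based notion of validity can be illustrated as follows. This is only a sketch under stated assumptions, not the plugin of [81]: the effect annotations, the help desk activities, and the AND/OR goal tree are hypothetical, and cumulative effects are approximated by a plain set union.

```python
# Hypothetical effect annotations: facts that hold once an activity has run.
EFFECTS = {
    "log ticket": {"ticket_logged"},
    "diagnose": {"cause_known"},
    "apply fix": {"issue_resolved"},
    "escalate": {"issue_resolved"},
    "notify user": {"user_informed"},
    "close ticket": {"ticket_closed"},
}

# Hypothetical goal tree: a leaf names a required fact; an inner node is
# ("AND" | "OR", [children]), in the spirit of AND/OR goal refinement.
GOAL = ("AND", ["ticket_logged",
                "issue_resolved",
                ("OR", ["user_informed", "ticket_closed"])])


def cumulative_effects(trace):
    """Union of the effects of all activities in the trace (a simplification)."""
    facts = set()
    for activity in trace:
        facts |= EFFECTS.get(activity, set())
    return facts


def satisfied(goal, facts):
    """Evaluate an AND/OR goal tree against the accumulated facts."""
    if isinstance(goal, str):
        return goal in facts
    operator, children = goal
    results = [satisfied(child, facts) for child in children]
    return all(results) if operator == "AND" else any(results)


def is_valid_variant(trace):
    """A trace counts as a valid variant if its end effects satisfy the root goal."""
    return satisfied(GOAL, cumulative_effects(trace))


print(is_valid_variant(["log ticket", "escalate", "notify user"]))
print(is_valid_variant(["log ticket", "diagnose"]))
```

The first trace deviates from a "diagnose then fix" design but still reaches the same goals, so it is accepted as a variant; the second one is not.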
As explained previously, conformance checking is acknowledged as one of the main
categories of process mining activities, where the real executed process is compared to the
process model in order to find and analyze deviations. Many conformance checking
approaches focus mainly on the business processes themselves, which is not adequate for
analyzing whether the actual business processes are able to satisfy organizational goals.
According to Horita et al. [45], goal-oriented requirements analysis methods have received
notable research interest in requirements engineering. Such methods are useful for reflecting
organizational requirements in business process models, but actual business processes
usually deviate from defined process models. Hence, it is important to analyze the data
recorded as event logs along with the model’s information. Horita et al. [45] proposed a
goal-oriented conformance checking approach that not only can detect deviations between logs
and models, but can also analyze the effects of a deviation.

[Figure: two-phase pipeline. Trace & Goal Processing phase: trace and goal filtering, combining goals, and constructing cross-tabulation tables from the traces and the goal model. Statistical Analysis phase: statistical hypothesis testing on the cross-tabulation tables and calculating effects for goal combinations where a significant statistical difference is detected.]

Fig. 4 Overview of the goal-oriented conformance checking method proposed by Horita et al. Adapted from [45]

As shown in Fig. 4, the method proposed by Horita et al. [45] consists of two main phases.
The first phase, called Trace & Goal Processing, checks whether event logs satisfy the goals
of a goal model, which are represented as logical formulas, and constructs cross-tabulation
tables for the second phase. All traces are divided into those that satisfy a goal and those
that do not, and the numbers of traces in which each goal is or is not achieved make up these
cross-tabulation tables. The second phase, called Statistical Analysis, considers the relations
between the goals and analyzes the positive or negative effects of a violated goal on the
other goals in the goal model. This phase applies statistical hypothesis testing, in particular
the Chi-squared test and Fisher’s exact test, to the cross-tabulation tables constructed in the
first phase. The authors applied their approach to an event log of a phone repair process.
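The two phases can be sketched for a single pair of goals. The code below is an illustration under assumptions rather than the implementation of [45]: each trace is assumed to be already reduced to the set of goals it satisfies, the goal names and counts are invented, and the tests are the textbook 2x2 Chi-squared statistic (without continuity correction) and a two-sided Fisher's exact test.

```python
from math import comb


def cross_tabulate(traces, goal_a, goal_b):
    """2x2 table: rows = goal_a satisfied/violated, columns = goal_b satisfied/violated."""
    table = [[0, 0], [0, 0]]
    for satisfied_goals in traces:
        row = 0 if goal_a in satisfied_goals else 1
        col = 0 if goal_b in satisfied_goals else 1
        table[row][col] += 1
    return table


def chi_squared(table):
    """Pearson Chi-squared statistic for a 2x2 table (all margins must be non-zero)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value: sum of tables no more likely than the observed one."""
    (a, b), (c, d) = table
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(low, high + 1))
               if p <= observed * (1 + 1e-9))


# Invented data: each trace is the set of goals it satisfies.
traces = ([{"tested", "on_time"}] * 8 + [{"tested"}] * 2
          + [{"on_time"}] * 1 + [set()] * 5)
table = cross_tabulate(traces, "tested", "on_time")
print(table, chi_squared(table), fisher_exact_two_sided(table))
```

A small p-value here would suggest that violating the first goal is associated with the second goal being achieved less (or more) often, which is the kind of effect the second phase looks for.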
In [11, 12], Bernard and Andritsos focused on Customer Journey Mapping (CJM) as an
emerging research area. In particular, CJM tracks and describes customers’ responses and
experiences when they are using a service and represents them, as customer journeys, on a
map [11]. The challenge with CJM is that it is not clear how to present thousands of customer
journeys. Inspired by process discovery techniques, borrowed from process mining, Bernard
and Andritsos developed the CJM-explorer (CJM-ex) tool. This tool is a web interface that
enables interactive navigation through several journeys stored in event logs. To this end, the
application uses hierarchical clustering, statistical indexes, and user-defined goals. CJM-ex
aims to show how the event logs can be displayed on CJMs, and to let analysts navigate through
these journeys in a goal-oriented fashion.
In the proposed method, a hierarchical clustering algorithm is first used to segment the
original customers’ traces into groups of similar journeys. After forming the clusters, CJM-ex
becomes able to leverage the contextual information (e.g., emotions or characteristics)
associated with a typical customer journey. The clustering algorithm generates a tree (or
dendrogram) where the journeys seen in event logs are at the leaf level and get merged at higher
levels to form “representative” journeys.
From a goal-oriented perspective, this method does not deal with customers’ goals or
intentions. However, it lets the users of the tool (e.g., analysts) define their own exploration
goals. For example, an analyst might be interested in journeys that started with a specific activity
(e.g., “calling the customer service”) experienced by customers with specific characteristics
(e.g., people younger than 16). Based on the goal setting, important branches of the clustering
tree will be highlighted in “hot” areas of the tree. This feature makes CJM-ex the first goal-
oriented tool that enables analysts to set a-priori goals to guide journey exploration.
As explained in Section 2.2, goal models play an important role in requirements
engineering by representing statements of stakeholders’ intent in a hierarchical form. Here,
the goals in higher levels of the hierarchy (parent goals) are related to goals in lower levels
(sub-goals) via AND or OR refinement links. In that context, Santiputri et al. [92] aimed to
address the question: can enterprise goal models be mined from readily available enterprise
data? Note that the method they proposed does not consider mining goals directly from event
logs. Rather, it aims to leverage operational data that manifest pre-deployed goal refinements.
In summary, this method involves mining event logs to leverage so-called temporal correlation
patterns between goals and sub-goals, mining goal refinement patterns from multi-layered
event logs, and composing goal refinement patterns to obtain goal models (or goal trees).
Temporal correlation patterns indicate that the achievement of a goal relates to the achievement
of its sub-goals. Such a pattern requires that in a sequence of events, the events denoting
achievement of a parent goal occur immediately (or soon) after the events denoting the
achievement of its sub-goals. For example, the event that denotes occurrence of “making a cup
of tea” (as a parent goal) occurs immediately after the events that denote “putting a teabag in
a cup” and “pouring of hot water into the cup” (as its sub-goals).
In this approach, it is critical to assume that the input event logs are partitioned into different
levels of abstraction. This key assumption suggests that the events implying achievement of
parent goals appear in the log of more abstract events (or higher-level), while the events
implying achievement of sub-goals appear in the log of more refined events (or lower-level).
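The notion of a temporal correlation pattern across two log layers can be illustrated with a simple counting procedure. This sketch is a simplification and not the sequential pattern mining used in [92]: the fixed time window, the function name, and the timestamps are assumptions made for the example.

```python
from collections import Counter


def candidate_refinements(low_level, high_level, window):
    """Count, for each high-level event, the set of low-level labels seen
    in the `window` time units that precede it (inclusive)."""
    patterns = Counter()
    for parent_time, parent in high_level:
        children = frozenset(
            label for time, label in low_level
            if parent_time - window <= time <= parent_time)
        if children:
            patterns[(parent, children)] += 1
    return patterns


# Invented two-layer log: (timestamp, label) pairs.
low_level = [(1, "put teabag in cup"), (2, "pour hot water"),
             (10, "put teabag in cup"), (11, "pour hot water"),
             (20, "grind beans")]
high_level = [(3, "make tea"), (12, "make tea")]

patterns = candidate_refinements(low_level, high_level, window=3)
for (parent, children), support in patterns.items():
    print(parent, sorted(children), support)
```

Pairs with high support are candidate goal refinements: here the parent event "make tea" is repeatedly preceded by the same two lower-level events.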
The next step is to compose goal refinement patterns to obtain goal models. Here the main
challenge is relating descriptions of goals and sub-goals that are semantically similar but
syntactically highly dissimilar. For example, a sub-goal may be represented as “log labor
hours for billing”, while a separately mined goal refinement pattern may have a parent goal
represented textually as “track technician time for charging the customer”. Human intuition
suggests that these two goals are semantically similar, and that any existing know-how for the
latter is also useful for the former. To cope with this challenge, Santiputri et al. [92] use a
state-of-the-art mechanism for measuring syntactic and semantic similarities of words and
phrases, called word2vec [72]. This mechanism relies on a two-layer neural network that learns
vector representations of words, from which a real-valued measure of the semantic similarity of
two given words is obtained (the higher the value, the more similar).
Hence, using an appropriate threshold defined by domain experts, one can relate a phrase
describing a sub-goal in one goal refinement pattern to a phrase describing a parent goal in
another goal refinement pattern. This approach does not guarantee the accuracy of all mined
goal models and hence analysts’ oversight and editing are still required. This method helps
modelers improve their efficiency as they can use the “first draft” models or model fragments
(instead of a “blank sheet”) to design usable models.
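The threshold-based linking of goal phrases can be sketched with cosine similarity over word vectors. Everything specific here is an assumption for illustration: the three-dimensional vectors are invented rather than learned by word2vec, phrases are represented by averaging their word vectors, and the 0.8 threshold stands in for a value that domain experts would choose.

```python
import math

# Invented toy vectors; real ones would come from a trained word2vec model.
EMBEDDINGS = {
    "log": [0.9, 0.1, 0.0], "track": [0.8, 0.2, 0.1],
    "labor": [0.1, 0.9, 0.1], "technician": [0.2, 0.8, 0.1],
    "hours": [0.1, 0.8, 0.3], "time": [0.1, 0.7, 0.4],
    "billing": [0.0, 0.2, 0.9], "charging": [0.1, 0.2, 0.9],
    "customer": [0.1, 0.3, 0.7],
    "order": [0.9, -0.6, -0.2], "spare": [0.7, -0.8, 0.0],
    "parts": [0.8, -0.7, -0.1],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def phrase_vector(phrase, embeddings):
    """Average the vectors of the known words in a phrase (unknown words are skipped)."""
    vectors = [embeddings[w] for w in phrase.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError(f"no known words in phrase: {phrase!r}")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def can_link(sub_goal, parent_goal, embeddings, threshold):
    """True if the two goal phrases are similar enough to be composed."""
    similarity = cosine(phrase_vector(sub_goal, embeddings),
                        phrase_vector(parent_goal, embeddings))
    return similarity >= threshold


print(can_link("log labor hours for billing",
               "track technician time for charging the customer",
               EMBEDDINGS, threshold=0.8))
print(can_link("log labor hours for billing", "order spare parts",
               EMBEDDINGS, threshold=0.8))
```

With these toy vectors, the two billing-related phrases are linked although they share no words, while the unrelated phrase is not.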
This approach was evaluated using both real-life and synthetic datasets. Overall, the results
of the empirical evaluation suggested that the combination of the two proposed techniques
(sequential pattern mining for leveraging temporal correlation patterns and word2vec for
evaluating goal/sub-goal similarity) is a promising basis for goal model mining.
Dabrowski et al. [22] considered a common problem faced by software designers
during the elicitation of software requirements and proposed a solution based on a goal-oriented
process mining approach. Software designers have to make assumptions (often invalid) about
the users’ processes and consequently about the requirements supporting such processes.
Elicitation and validation of such assumptions manually through interviews, observations, etc.
is time consuming and expensive. Furthermore, designers may fail to find the users’ real
processes. To address such a problem, Dabrowski et al. [22] proposed a new approach that
leverages the synergy between Requirements Engineering, Crowdsourcing and Process
Mining. Such a combination is employed to identify and validate software process
requirements exploiting the crowd’s perception [40, 93]. This approach takes advantage of the
capabilities of process mining to discover the underlying processes of crowds from event logs.
A crowd brings an opportunity to involve large and diverse groups of users to interact with a
given system and conduct their intended processes leading to achievement of their goals. Then,
process mining techniques can discover such underlying processes from the logs stored in
databases. The discovered processes become implicit feedback used for revising the existing
system’s functionalities and for ensuring that a software system operates as expected. In
addition, missing functionalities and quality issues may be revealed by the analysis of user-
system interactions.
The novelty of this approach pertains to its ability to extend requirements engineering
techniques by revealing new requirements using process mining techniques along with
crowdsourcing. This is different from other crowdsourcing techniques in the literature that are
mainly focused on extracting explicit user feedback [40, 69].
The work of Dabrowski et al. [22] aims to answer the question “how to discover
requirements through goal-driven process mining?”. They explained their ideas through a
simple example (a personal energy management system) as a case study and illustrated the
motivation behind the approach. First, end-users execute their daily processes using the
application for a given period of time. Then, the corresponding usage log is input to process
mining algorithms to extract the underlying process models. The purpose of this process
mining step is to reveal typical patterns of user behavior, process bottlenecks, and
misalignments or variants between reality and intended nominal processes. This typically
benefits a software engineer that has problems detecting discrepancies between real processes
and designed ones [90]. Such a benefit is possible by conformance checking and denoting the
points in the process not aligned with the high-level user goals [21]. This step results in a list
of possible new requirements to address misalignments between designed and actual processes.
Further analysis here could lead to the identification and/or confirmation of new stakeholder
goals and underlying requirements. The resulting artifact then forms an input for the planner
who is responsible for identifying tasks and objectives to be delegated to the crowd. In the
proposed approach, “crowd” denotes a large and diverse group of users delegated to perform
some processes using the given application to accomplish predefined objectives [40]. The
planner then orchestrates objectives and tasks to be carried out by the crowd.
The crowd interactions with the software are similarly stored as logs and subsequently
analyzed. If a misalignment is still present, then this may lead to the implementation of newly
found requirements in the application. The synergy between crowd and process mining
techniques can indeed become an important source for new requirements.
The work of Dabrowski et al. [22], which exploits users’ real behaviors for detecting
misalignments, can effectively contribute to requirements engineering and, also, to process
improvement. This contribution is promising not only in software systems but also in
organizations striving to improve compliance and design better processes.
In this regard, Outmazgin and Soffer [75] proposed a process mining-based analysis to
study intentional incompliances, where employees intentionally deviate from the required
procedures even if they are aware of them. Understanding intentional incompliances, or
work-arounds, can improve an organization’s performance by helping redefine requirements and
redesign processes and support systems.
There exist numerous research activities about compliance conducted separately within the
communities of requirements engineering [36, 71, 117] and process mining [68, 98, 27].
Existing process mining techniques for compliance checking do not distinguish intentional
incompliance and do not address their reasons and sources. In contrast, Outmazgin and
Soffer [75] defined a list of generic work-around types found in real practices. They discussed
whether and how these work-arounds can be detected by process mining, and they analyzed
their sources.
A work-around occurs when employees perceive prescribed procedures as an obstacle
to some of their desired goals and intentionally do not follow these procedures.
Work-arounds are usually recognized as a negative phenomenon, assuming that the prescribed
process is designed and optimized to reach the desired business goals. However, since
work-arounds are intentional behaviors of employees, they are performed for certain reasons.
Poelmans [78, 79] notes that users work around a system in order to save time
and/or effort or to avoid the system’s limitations. Outmazgin [74] shows that work-arounds
may be triggered when the predefined business processes are not able to accommodate atypical
situations that may arise. Moreover, work-arounds can be motivated when the designed process
or its support system cannot satisfy all the stakeholder expectations. Other cases might occur
when employees decide to pursue their own personal goals rather than to follow the process
designed based on the organization’s goals.
The qualitative study of Outmazgin [74] that explores work-arounds in several
organizations led to six generic types: A) bypass of process parts, B) selection of an entity
instance that fits a preferable path, C) post-factum information changes, D) incompliance to
role definition, E) fictitious entity instances, and F) separation of the actual process from the
reported one. Then, in [75], Outmazgin and Soffer investigated whether and how
work-arounds can systematically be revealed using event logs. To this end, they used
Disco [33] and explored the logs of five processes belonging to three organizations over two
years. They further characterized the log patterns that can be used to detect work-arounds of
types A, C, D, and F. They also explained why such patterns cannot be defined for the other
two types (B and E). For example, an experienced employee may split a purchase requisition and
place several small requisitions, instead of a high-priced one, to avoid long approval paths [75].
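As an illustration of what a log pattern for type A (bypass of process parts) might look like, the sketch below flags cases whose traces skip activities that the prescribed process requires. It is a hypothetical example, not the analysis performed with Disco in [75]: the function name, activity names, and case identifiers are invented, and a flagged case is only a candidate, since the log alone does not show that the deviation was intentional.

```python
def find_bypasses(traces, mandatory):
    """Return {case_id: skipped mandatory activities} for cases that skip any."""
    bypasses = {}
    for case_id, activities in traces.items():
        skipped = [a for a in mandatory if a not in activities]
        if skipped:
            bypasses[case_id] = skipped
    return bypasses


# Invented purchasing log: case id -> ordered list of executed activities.
traces = {
    "c1": ["create requisition", "manager approval", "issue order"],
    "c2": ["create requisition", "issue order"],
    "c3": ["create requisition", "manager approval", "issue order", "receive goods"],
}
mandatory = ["create requisition", "manager approval", "issue order"]

print(find_bypasses(traces, mandatory))
```

By contrast, the requisition-splitting example above produces traces that each look compliant on their own, which is consistent with the observation that no such pattern can be defined for every work-around type.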
Next, Outmazgin and Soffer [75] used process mining to conduct a quantitative analysis of
situational reasons that can be associated with work-arounds. Using the data of real processes,
they statistically tested the hypothesis of a correlation between different types of
work-arounds and activity durations, number of participants, and work handovers. For instance,
they found that work-arounds of type A (bypassing some activities), the most frequent type, are
positively correlated with activity duration and with the actual number of activity participants.
In line with these results, a case study in Belgium [79] also suggested that the sources of
work-arounds can be mitigated through the following strategies: deliberate efforts to decrease
specific activity times, improving control and designing appropriate work handovers,
tightening access permissions to prevent unauthorized employees from performing activities, and
monitoring work-arounds and performing disciplinary actions. These practices are expected to
lead to performance and compliance improvement.
One of the principal domains where process mining capabilities are leveraged is
healthcare [39]. In particular, clinical pathway (CP) mining from historical data has received
increasing attention. Some approaches focus on using topic modeling methods to discover
clinical patterns instead of process models. The topics are semantic compositions of clinical
activities often discovered using Latent Dirichlet Allocation (LDA), a statistical model
developed for text mining and machine learning first presented by Blei et al. [13]. Xu et
al. [113] used topic modeling together with process mining to generate a concise and
interpretable topic-based process model. The key principle in their work is to discover the
latent topics from clinical data using LDA and, then, to extract the order relations between
these topics using process mining techniques. In this approach, treatment activities are mapped
to words and the latent clinical topics are mapped to latent topics of texts.
Xu et al. [114] further aimed to consider those latent clinical topics as clinical goals. They
assumed that the clinical activities taking place in a specific time interval (usually a
hospitalization day) are prescribed for multiple clinical goals, and each clinical goal
corresponds to a set of clinical activities. Finally, a process discovery framework is applied on
the goal-based sequences (instead of the detailed clinical activities) to get a comprehensive
process model [114]. In order to evaluate the proposed approach, they used two real-world
datasets. The results show that the topics revealed by their method provide higher coherence,
informativeness, and coverage than raw LDA. These extracted high-quality topics are
suitable for representing the clinical goals. The method is also effective in generating a
comprehensive topic-based clinical pathway model.
Baek et al. [8] focused on the adaptability of the processes discovered by process mining
techniques in terms of being dynamically reconfigured for new requests made on business
processes. They proposed an analytical method using a heuristic algorithm. This algorithm is
based on the goals to create an adaptive process mining model. The model is able to provide a
continuous service demand scenario that is created dynamically. Baek et al. proposed some
measurements to calculate the correlation among the activities or the importance of each
activity within a given process and the similarity of some processes. They took advantage of
concepts analogous to the basic ordering relations determined by the α-algorithm [66]. They also
used the concept of a footprint matrix, also known as an order matrix. Their preliminary study was
mainly intended as a basis for further studies that propose a process-based goal scenario. This approach
could be a basis to set up strategies allowing rapid dynamic reconfiguration to meet demanded
requirements. This paper did not involve any standard goal-oriented language in conjunction
with a process mining technique.
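For readers unfamiliar with the footprint matrix, the sketch below derives one from a small log using the basic ordering relations of the α-algorithm: causality (->), reverse causality (<-), parallelism (||), and no direct succession (#). It is a generic illustration, not the measurements proposed in [8]; the function name and the example log are assumptions.

```python
def footprint(traces):
    """Build the footprint matrix from the directly-follows relation of a log."""
    activities = sorted({a for trace in traces for a in trace})
    follows = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    matrix = {}
    for a in activities:
        for b in activities:
            ab, ba = (a, b) in follows, (b, a) in follows
            if ab and ba:
                relation = "||"   # observed in both orders: parallel
            elif ab:
                relation = "->"   # a is directly followed by b, never the reverse
            elif ba:
                relation = "<-"   # b is directly followed by a, never the reverse
            else:
                relation = "#"    # never directly follow each other
            matrix[(a, b)] = relation
    return activities, matrix


# Invented log with three trace variants.
log = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "e", "d"]]
activities, matrix = footprint(log)

print("   " + "  ".join(f"{b:>2}" for b in activities))
for a in activities:
    print(f"{a:>2} " + "  ".join(f"{matrix[(a, b)]:>2}" for b in activities))
```

Two processes can then be compared cell by cell on their footprint matrices, which is one simple way to quantify process similarity.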

4.2 Intention mining


As described in Section 2.1, process mining techniques mainly focus on event logs, discovered
models, and predefined process models to extract process-related insights [102]. Hence,
process mining has been growing into an activity-oriented approach [25].
A research direction called intention mining has however recently emerged. Intention
mining shares objectives similar to process mining’s but here the main goal is to discover
intentional process models, beyond activity process models. Intentional process models
consider the intentions underlying activities rather than the activities themselves. Aligned with
goal modeling characteristics, the reasoning behind the activities is specifically addressed by
intentional process models. This research field is developed by taking into account the notions
of intention and strategies of the process enactment [52, 54, 55].

4.2.1 From intentional process modeling to intention mining


While intention mining is quite a young field, intentional process modeling (with MAP) as a
basis for intention mining goes back to the late 1990s. According to the trend of this research
field presented by Epure et al. [28], many studies in intention-oriented process modeling
proved that the fundamental nature of processes is mostly intentional and hence the process
should be modeled from an intentional point of view. In the regular research trend of intentional
process modeling, event logs have however been neglected. This neglect is the motivation for
research in intention mining. The main challenge of these studies is to identify and formalize
intentions from event logs [28, 52, 53, 54, 55].

4.2.2 Process mining versus intention mining


Khodabandelou et al. [54] compared process mining and intention mining by applying a
discovery technique from each approach to the traces of a case study dataset and discussing
the results. To this end, the α-algorithm [66] and the Strategies Miner
algorithm [54] were applied as the process mining and intention mining approaches, respectively.
Khodabandelou [52] has elaborated some aspects of intention mining and described its
similarities to and differences from other fields of research such as process discovery
approaches in process mining, goal-oriented approaches in goal modeling languages, machine
learning approaches in process mining, and recommendation techniques. She emphasized that
process discovery approaches discover process models from sequences of actors’ activities,
whereas intention mining relies on the intentional aspect of the processes to discover the
process model. Intentions are defined as the motivation to achieve the goals. The behaviors
of users depend on their intentions, and not all intentions are detectable. It is important
that the intentions can be modeled at different levels of abstraction. A model of high-level
intentions represents parent intentions and a low-level model shows sub-intentions. The
fulfillment of a parent intention depends on the fulfillment of its sub-intentions.
Also, the objectives, representation languages, algorithms, and tools of both fields are
briefly introduced by Khodabandelou et al. [54]. As described in Section 2.1, there are many
process mining algorithms and some concrete tools based on them, but the algorithms of
intention mining are still limited to the hidden Markov model (HMM) method (described later).
Furthermore, there is no intention mining tool that extracts intentional process models and
infers goals from activities.
4.2.3 Language selection for intention mining activities
In order to choose a goal modeling notation to be used in intention mining, Khodabandelou et
al. [54] compared three goal-modeling approaches (i*, KAOS, and MAP). The approaches
were compared in terms of their variability for goals, the rigidity of their task decompositions,
and the operational semantics of their tasks. They rejected i* because “i* is not designed to be a
variable framework therefore, it does not afford a high level of flexibility” [54]. However, this
argument deserves to be challenged since i* takes intentional and strategic views and supports
the representation of variability through means-ends relationships, which enables both
“upward” and “downward” strategy evaluation and the analysis of the types of variability [116].
KAOS was also rejected because it has a rigid task-decomposition mechanism, and hence
modeling complex intentional processes would be difficult. In addition to the aforementioned
modeling languages, Khodabandelou [52] considered GRL and Tropos but did not choose
them, as they are based on i* and thus were considered to have limitations similar to
i*’s.
MAP (introduced in Section 2.2) was chosen because this notation represents tasks and
strategies in a labelled directed graph (aligned with the intention mining techniques) and allows
formalizing flexible processes [24, 52].

4.2.4 Hidden Markov models as a basis for intention mining


HMM-based intention mining methods are hard to understand without a good knowledge of
HMMs, e.g., as taught by Rabiner [85]. To provide this background,
an introduction to HMMs is presented here as a prelude to their usage in the upcoming
description of existing intention mining approaches.
A hidden Markov model (HMM) is a statistical model that consists of two main
components: states and observations. The state of the model at any given time generates an
observation at that time. For example, in a meteorology context, the states are the weather types
(sunny, raining, etc.) and the possible observations generated by the states can be walking,
shopping, staying at home, etc.
Fig. 5 shows a simple HMM with three possible states and four possible observations. At
a given time t, the state, state_t, is a member of the set of states called S. In Fig. 5, S={s1, s2, s3}.
State_t depends on the previous state, at time t-1; the system goes from state i to state j with a
probability called Tij. Note that the sum of the probabilities of a given state’s outgoing
transitions is 1. For example, in Fig. 5, T12=0.7, T11=0.3, and T13=0 (0 probabilities are not
shown in such models). Therefore, in state s1, the probability of moving to state s2 is 0.7 while
the probability of staying in state s1 is 0.3.
[Figure: three states (s1, s2, s3) connected by transitions labelled with probabilities, and four
observations (o1 to o4) connected to the states by observation probabilities. Legend: si = state;
oi = observation.]

Fig. 5 A hidden Markov model with three states and four observations

Visiting a state at a given time can generate an observation from set O, with a given probability.
In our example, the set of possible observations is O={o1, o2, o3, o4}. Let Eik be the
probability of generating observation k in a given state i. In Fig. 5, state s1 can generate
observation o1 with a probability of 0.6 (E11) and o2 with a probability of 0.4 (E12). Let the
numbers of possible states and observations be N and M, respectively. Accordingly, there are
two matrices that represent all Tij and all Eik, respectively called transition (T: N×N) and
emission (E: N×M) matrices. For instance, the transition and emission matrices for the example
in Fig. 5 are:

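$$
T=\begin{bmatrix}0.3 & 0.7 & 0\\ 0.8 & 0 & 0.2\\ 0 & 0.1 & 0.9\end{bmatrix},\qquad
E=\begin{bmatrix}0.6 & 0.4 & 0 & 0\\ 0 & 0.3 & 0.2 & 0.5\\ 0 & 0 & 0.3 & 0.7\end{bmatrix}
$$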
In addition, the vector π contains the probabilities of each state to be the initial state. In our
example, π=(0.9, 0.05, 0.05), which means that the system starts from s1 with a probability of
0.9 (the π values are not shown in the figure). The transition and emission matrices, together
with the initial vector, capture the entire model and can be formalized as a triplet, called
λ=(T; E; π). The main characteristic of HMM is that the states are not observable and remain
hidden whereas the observations are the only things visible to users. Consequently, there are
three key and pragmatic problems that should be solved, where A is a sequence of observations,
e.g., A=(o1,o1,o2,o4,o3,o3):
• Problem 1: given the sequence A and the model λ = (T;E;π), what is the probability
of A, i.e., Pr(A| λ)?
• Problem 2: given the sequence A, what is the optimal sequence of states that could
generate A?
• Problem 3: given the sequence A, how can we estimate the model parameters, λ =
(T; E; π), that maximize Pr(A| λ)? Here, the transition matrix, the emission matrix,
and the initial vector of probabilities need to be found.
There exist mathematical algorithms to solve the aforementioned problems. The Forward
Algorithm is the main one for solving the first problem. The second problem can be solved with
the Viterbi algorithm, which finds the most likely state sequence given the
sequence of observations. The third problem is the most difficult one, as there is no
known analytical way to find the maximum-likelihood model. However, there is an iterative method,
known as the Baum-Welch Algorithm (BWA) [10], that can find a local maximum-likelihood
estimate of the parameters given a set of sequences of observations [84].
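
As an illustration of Problem 1, the following minimal Python sketch applies the Forward Algorithm to the Fig. 5 example. The function name forward_probability and the 0-based encoding of observations (o1 becomes 0, and so on) are choices made for this sketch only. No numerical scaling is applied, so it is suitable for short sequences only.

```python
# Parameters of the Fig. 5 example (rows correspond to s1, s2, s3).
T = [[0.3, 0.7, 0.0],
     [0.8, 0.0, 0.2],
     [0.0, 0.1, 0.9]]
E = [[0.6, 0.4, 0.0, 0.0],
     [0.0, 0.3, 0.2, 0.5],
     [0.0, 0.0, 0.3, 0.7]]
PI = [0.9, 0.05, 0.05]


def forward_probability(obs, T, E, pi):
    """Return Pr(obs | lambda) with the Forward Algorithm.

    obs is a list of 0-based observation indices (o1 -> 0, o2 -> 1, ...).
    """
    n = len(pi)
    # alpha[i]: probability of the observations seen so far, ending in state i.
    alpha = [pi[i] * E[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * T[i][j] for i in range(n)) * E[j][o]
                 for j in range(n)]
    return sum(alpha)


# A = (o1, o1, o2, o4, o3, o3) from the text, written with 0-based indices.
A = [0, 0, 1, 3, 2, 2]
print(forward_probability(A, T, E, PI))
```

Summing over all states at each step avoids enumerating every possible state sequence, which is what makes Problem 1 tractable for longer traces.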

4.2.5 Map Miner Method (MMM), unsupervised and supervised


The Map Miner Method (MMM) aims to infer a MAP intentional process model from event
logs. If a prescribed MAP already exists, an analyst can compare the discovered MAP with the
prescribed one and study the compliance level and deviations. In this sense, the MMM
approach shares similarities with process discovery techniques in process mining. The output
of process discovery is a process model at the activity level, while MMM ends
up with a model at the intention level. An intentional MAP can be inferred through either an
unsupervised or a supervised scheme. Here, we describe these two methods, which are
illustrated in Fig. 6.

[Figure: two pipelines. Unsupervised inputs: sequences of activities (e.g., <a,b,d,e> and
<a,b,b,c,e>) and the number of strategies (|s|), processed by the BWA algorithm. Supervised
inputs: sequences of activities with their sequences of strategies (e.g., S1,S2,S4,S3 and
S1,S2,S2,S3), processed by counting transitions. Both yield E, T, and π, which feed the Deep
Miner algorithm and then the MAP Miner algorithm. Other labelled inputs: threshold (ε) and
number of clusters (k). Output: intentional MAP.]

Fig. 6 The supervised and unsupervised Intentional Map Miner Methods, with their inputs,
algorithms, and outputs. The red items belong to the unsupervised method only, and the green
items to the supervised method. The black items are part of both methods.

In the work of Khodabandelou et al. [57, 58, 59], the focus is on unsupervised methods. They
took advantage of HMMs to propose an MMM framework that finds an intentional MAP from event
logs. In this framework, the hidden states are strategies and the observations are users’
activities. The activities observable in event logs are assumed to be generated by the
strategy that the agent has adopted to achieve an
intention. Therefore, the sequence of agents’ strategies generates the sequences of activities
stored in the event log. For example, as Fig. 7 shows, the trace of activities <a,b,b,c,d>
observed in event logs could be generated by different traces of strategies such as
<s1,s1,s2,s2,s3> or <s1,s2,s2,s2,s3>.

[Figure: strategies s1, s2, and s3 (hidden states) connected by strategy transitions with
probabilities; each strategy generates activities among a, b, c, and d (observations).]

Fig. 7 A sample hidden Markov model that consists of strategies as hidden states and of
activities as observations.

Generally, each sequence of activities can be generated by different sequences of strategies with
respect to the transition and emission matrices. As Fig. 6 shows, in the unsupervised MAP
Miner Method, the traces of activities are used as inputs. Although the strategies are
groups of activities, here there is no prior knowledge about the strategies or the
segmentation of activity sequences into strategies. Therefore, in addition to the transition
matrix (going from one strategy to another one), the emission matrix (generating an activity
by a strategy) is also unknown. As the activities in the traces are not labeled with their corresponding
strategies, this method is called unsupervised. Here, the observations are known and the T and
E matrices should be estimated. The HMM parameters are estimated using the Baum-Welch
Algorithm (BWA) [10], the most frequently used algorithm for learning the parameters of an
HMM (discussed in Section 4.2.4).
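
To make the estimation step more concrete, the sketch below implements one Baum-Welch re-estimation step in plain Python for a single trace. It is a simplified illustration under stated assumptions, not the implementation used in the reviewed papers: the function names, the starting matrices T0, E0, and PI0, and the example trace are all hypothetical, only one trace is used instead of a set of traces, and no numerical scaling is applied.

```python
def forward(obs, T, E, pi):
    """alpha[t][i]: probability of obs[0..t] with the model in state i at time t."""
    n = len(pi)
    alpha = [[pi[i] * E[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * T[i][j] for i in range(n)) * E[j][o]
                      for j in range(n)])
    return alpha


def backward(obs, T, E):
    """beta[t][i]: probability of obs[t+1..] given state i at time t."""
    n = len(T)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(T[i][j] * E[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def baum_welch_step(obs, T, E, pi):
    """One re-estimation step; returns (new_T, new_E, new_pi, likelihood_before)."""
    n, m, length = len(pi), len(E[0]), len(obs)
    alpha, beta = forward(obs, T, E, pi), backward(obs, T, E)
    likelihood = sum(alpha[-1])
    # gamma[t][i]: probability of being in state i at time t, given the trace.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(length)]
    new_T, new_E = [], []
    for i in range(n):
        leave = sum(gamma[t][i] for t in range(length - 1))
        new_T.append([
            sum(alpha[t][i] * T[i][j] * E[j][obs[t + 1]] * beta[t + 1][j]
                for t in range(length - 1)) / likelihood / leave
            for j in range(n)])
        visits = sum(gamma[t][i] for t in range(length))
        new_E.append([
            sum(gamma[t][i] for t in range(length) if obs[t] == k) / visits
            for k in range(m)])
    return new_T, new_E, gamma[0][:], likelihood


# Hypothetical starting guess: 2 strategies, 3 activities (a=0, b=1, c=2).
T0 = [[0.6, 0.4], [0.3, 0.7]]
E0 = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
PI0 = [0.5, 0.5]
trace = [0, 1, 1, 2, 0, 1, 2, 2]

T, E, pi = T0, E0, PI0
for _ in range(10):
    T, E, pi, likelihood = baum_welch_step(trace, T, E, pi)
    print(likelihood)  # likelihood of the trace before each update
```

The number of strategies is fixed by the shape of the starting matrices. Each step cannot decrease the likelihood of the trace, but the result depends on the starting guess because only a local maximum is reached.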
In addition to the observations, the BWA also needs the number of states as input.
Therefore, in the unsupervised method, the number of strategies should be defined by an expert.
The main outputs of the BWA are the transition and emission matrices (see Fig. 6).
The transition matrix shows the probability of moving from one strategy at time t to another
strategy at time t+1. Once T is estimated, the problem becomes how to extract a MAP process
model (described in Section 2.2) that reflects the transition matrix well. The MAP extracted
from a transition matrix should be verified in terms of two constraints: (i) fitness: any transition
between strategies possible in the transition matrix should also be possible within the MAP;
(ii) precision: any transition that is possible between strategies in the MAP should also be
possible in the transition matrix. Deneckère et al. [24] and Khodabandelou et al. [58, 59]
introduced recall and precision as metrics of fitness and precision, respectively.
The main assumption here is that such unsupervised approaches only consider the
transitions whose probability is above a given threshold ε. The value of ε is chosen heuristically
to obtain a suitable trade-off between the granularity of the MAP and its understandability.
Hence, ε is considered as an input of the algorithm in Fig. 6. The F-measure metric, a
combination of both recall and precision, is used to assess the quality of the MAP. Deneckère et al. [24]
proposed an algorithm, called Deep Miner, that finds a MAP that maximizes the F-measure with
the lowest number of sections (see Fig. 6). The output of the Deep Miner algorithm is a very
detailed map with several nodes, called fine-grained MAP. Fig. 8 shows a fine-grained MAP
extracted by the Deep Miner algorithm in an example with six strategies. The nodes generated
by the Deep Miner algorithm are called “sub-intentions” as they are connected to each other
with some strategies.
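
The two constraints and the role of ε can be illustrated with a small set-based sketch. It is an assumption of this example that recall and precision are computed over sets of strategy transitions; the exact definitions in [24, 58, 59] may differ. The function names and the candidate MAP are hypothetical, and the transition matrix is borrowed from the Fig. 5 example.

```python
def transitions_above(T, eps):
    """Strategy transitions (i, j) whose probability exceeds the threshold eps."""
    return {(i, j) for i, row in enumerate(T) for j, p in enumerate(row) if p > eps}


def recall_precision_f(T, map_edges, eps):
    """Set-based recall (fitness), precision, and F-measure of a candidate MAP."""
    observed = transitions_above(T, eps)
    allowed = set(map_edges)
    shared = observed & allowed
    recall = len(shared) / len(observed) if observed else 1.0
    precision = len(shared) / len(allowed) if allowed else 1.0
    if precision + recall == 0:
        return recall, precision, 0.0
    return recall, precision, 2 * precision * recall / (precision + recall)


# Transition matrix of the Fig. 5 example, reused here for illustration.
T = [[0.3, 0.7, 0.0],
     [0.8, 0.0, 0.2],
     [0.0, 0.1, 0.9]]
# Hypothetical candidate MAP, given as the strategy transitions it allows.
candidate = {(0, 1), (1, 0), (1, 2), (2, 2), (2, 1)}

print(recall_precision_f(T, candidate, eps=0.15))
```

With ε = 0.15, the transition from s3 to s2 (probability 0.1) is ignored, so the candidate MAP is penalized in precision for allowing it and in recall for omitting the self-loop on s1.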

[Figure: fine-grained MAP from Start to Stop. Labels: intentions I1 to I4, sub-intentions
Sub I-1 to Sub I-5, and sections labelled with strategies S1 to S6.]

Fig. 8 A fine-grained MAP with six strategies, extracted by the Deep Miner algorithm [53].

Deneckère et al. [24] and Khodabandelou et al. [53, 57, 58, 59] also proposed a MAP Miner
algorithm to extract a high-level MAP from the fine-grained MAP produced by Deep Miner.
To this end, the MAP Miner algorithm uses a clustering algorithm, K-means [44], to group the
sub-intentions into some higher-level nodes (as clusters). In the clustering step, the attributes
of each given sub-intention are defined as binary values that represent the connectivity of that
sub-intention to the other nodes. Note that in the K-means clustering algorithm, the number of
clusters (k) has to be chosen in advance. Accordingly, the parameter k, which is the number of
intentions in the final process model (coarse-grained MAP), is another input of MAP Miner.
This allows determining the level of abstraction for the final intentional MAP. The MAP Miner
algorithm groups the five sub-intentions of Fig. 8 into two high-level intentions (I2 and I3), as
shown in Fig. 9.
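
The clustering step can be sketched as follows. This is a plain K-means in Python written for illustration, not the MAP Miner implementation. The binary connectivity vectors are invented for the example and are not taken from Fig. 8, and the farthest-first initialization is an assumption chosen to keep the sketch deterministic.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means; returns the cluster label of each point."""
    # Farthest-first initialization (an assumption of this sketch).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical connectivity of five sub-intentions to five nodes (1 = connected).
sub_intentions = {
    "Sub I-1": (1, 1, 0, 0, 0),
    "Sub I-2": (1, 1, 1, 0, 0),
    "Sub I-3": (1, 0, 1, 0, 0),
    "Sub I-4": (0, 0, 0, 1, 1),
    "Sub I-5": (0, 0, 1, 1, 1),
}
labels = kmeans(list(sub_intentions.values()), k=2)
for name, label in zip(sub_intentions, labels):
    print(name, "-> cluster", label)
```

Changing k changes how many high-level intentions appear in the coarse-grained MAP, which is how the analyst controls the level of abstraction.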

[Figure: coarse-grained MAP with nodes I1 (Start), I2, I3, and I4 (Stop), connected by sections
labelled with strategies S1 to S6.]

Fig. 9 A coarse-grained MAP generated from the fine-grained MAP in Fig. 8, using the MAP
Miner algorithm [53].

In the supervised approach, proposed by Deneckère et al. [24], the steps are similar to the ones
for the unsupervised approach, but the inputs are slightly different. In the supervised approach,
the sequences of strategies (not just their number) are also known. For example, in the model
shown in Fig. 7, the sequence of activities (e.g., <a,b,b,c,d>) and the corresponding sequence
of strategies (e.g. <s1,s1,s2,s2,s3>) are known.
In hidden Markov models, when the sequences of observations and their corresponding
states are known, estimation of emission and transition matrices is simple. This problem, called
HMM learning, is not one of the three aforementioned problems of HMMs. Here, nothing is
really hidden. When the underlying states are known, a Maximum-Likelihood Estimation
approach estimates the T and E that maximize the likelihood of generating the sequence of
observations and the sequence of states simultaneously. In this approach (called the counting
transitions method), Tij is the number of transitions from state i to state j divided by the number
of times that state i occurs. Also, Eik is the number of times that observation k is generated by
state i divided by the number of times that state i occurs.
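
The counting transitions method can be sketched in a few lines of Python. The function name and the second trace are hypothetical; the first trace is the <a,b,b,c,d> example with strategies <s1,s1,s2,s2,s3>. One deliberate assumption of this sketch is that Tij is normalized by the number of transitions leaving state i (the last state of each trace has no outgoing transition), so that each non-empty row of T sums to 1.

```python
from collections import Counter


def count_based_estimate(traces, states, activities):
    """Estimate T and E from traces labeled with their strategies.

    traces: list of (activity_sequence, strategy_sequence) pairs of equal length.
    """
    trans, emit = Counter(), Counter()
    leaving, visits = Counter(), Counter()
    for acts, strats in traces:
        for a, s in zip(acts, strats):
            emit[(s, a)] += 1
            visits[s] += 1
        for s, s_next in zip(strats, strats[1:]):
            trans[(s, s_next)] += 1
            leaving[s] += 1
    T = {i: {j: trans[(i, j)] / leaving[i] if leaving[i] else 0.0 for j in states}
         for i in states}
    E = {i: {k: emit[(i, k)] / visits[i] if visits[i] else 0.0 for k in activities}
         for i in states}
    return T, E


traces = [
    (list("abbcd"), ["s1", "s1", "s2", "s2", "s3"]),  # example from the text
    (list("abcd"), ["s1", "s2", "s2", "s3"]),         # hypothetical second trace
]
T, E = count_based_estimate(traces, ["s1", "s2", "s3"], list("abcd"))
print(T["s1"])
print(E["s2"])
```

No iteration is needed here, which is why the BWA can be skipped when the strategy labels are available.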
As shown in Fig. 6, in a supervised approach, the BWA algorithm is not needed and the
counting transitions method can simply be applied to estimate HMM parameters. After this
point, all the steps are similar to the unsupervised approach and the MAP intentional model
will be finally extracted. It is worth noting that the number of clusters (k) in the MAP Miner
algorithm does not have to be chosen in a supervised approach as it equals the number of
strategies that are known [55].

4.2.6 Comparing the performance of supervised and unsupervised approaches


Khodabandelou et al. [56] compared supervised and unsupervised learning approaches of
HMMs in both a theoretical context (in terms of convergence speed and likelihood) and a
practical context (in terms of the intentional process models obtained). The results showed that
supervised learning performs poorly because it enforces binding conditions
in terms of data labeling, introduces inherent human biases, and delivers unreliable results in
the absence of ground truth. In contrast, unsupervised learning obtains efficient MAPs with
better performance and lower human effort. Rabiner & Juang [84] also demonstrated that an
unsupervised learning approach to estimate HMM parameters offers a higher performance than
supervised learning.

4.2.7 Supervised intentional process model discovery using HMMs


Khodabandelou et al. [55] proposed an intention mining method called supervised intentional
process model discovery using HMMs. This approach is similar to the supervised MAP Miner
Method, described in Section 4.2.5, but here the states of the hidden Markov model are
intentions instead of strategies (see Fig. 10). This approach looks for some predefined
intentions behind the user activities traces and compares them to a prescribed intentional model
that is represented with MAP.

[Figure: intentions (states) at times t-1, t, and t+1, shown above the activities (observations)
at times t-1, t, and t+1.]

Fig. 10 The HMM used in the method of Khodabandelou et al. [55]

In this method, there is a prior model called intentional process model. This model is a MAP
that consists of some strategies and some intentions. A human expert describes this model.
Users can select the alternative strategies to fulfill their intentions, following this intentional
model. Then, a modeler makes some groups of activities based on his/her inference to define
some strategies that are shown in the prescribed intentional model. Similar to the supervised
MAP Miner Method, this approach adopts hidden Markov models to discover a MAP from
observations. Consequently, given the characteristics of HMMs, a sequence of activities
(a trace) of a given length could be generated by many sequences of intentions of the same
length. Here, the purpose is to find, among all possible sequences of intentions, the one
most likely to have generated the given sequence of activities. This
problem is exactly the second key problem of hidden Markov models described in Section
4.2.4.
The Viterbi Algorithm (VA) [34] is a well-known algorithm for finding the most likely sequence of hidden states
that generates a sequence of observed events in hidden Markov models. This algorithm needs the
transition and emission matrices. To provide these inputs, the modeler estimates the
transition matrix by the simple counting approach described before (in the supervised MAP
mining approach). Once the parameters of the HMM are estimated, they can be used to find
the hidden intentions behind any sequence of activity traces (the sequence with the highest
probability of emergence), using VA. Then, a modeler can compare the discovered sequences of
intentions with the prescribed sequences of intentions. At this point, Deneckère et
al. [24] realized that this method is not accurate enough to represent a MAP model. They
argued that activities are indeed generated by strategies, not directly by intentions.
Accordingly, Deneckère et al. [24] proposed their supervised Map Miner Method, described
in Section 4.2.5, which models hidden states as strategies and observations as activities. Note
that the supervised Map Miner Method infers high-level intentions by clustering the
sub-intentions extracted from the Deep Miner algorithm (shown in Fig. 9).
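
The decoding step performed with VA can be sketched as follows. As an assumption made only for illustration, the sketch reuses the parameters of the Fig. 5 example and reads its states as intentions and its observations as activities; the function name viterbi and the 0-based indices are also choices of this sketch.

```python
# Parameters of the Fig. 5 example, read here as intentions and activities.
T = [[0.3, 0.7, 0.0],
     [0.8, 0.0, 0.2],
     [0.0, 0.1, 0.9]]
E = [[0.6, 0.4, 0.0, 0.0],
     [0.0, 0.3, 0.2, 0.5],
     [0.0, 0.0, 0.3, 0.7]]
PI = [0.9, 0.05, 0.05]


def viterbi(obs, T, E, pi):
    """Return (most likely hidden state sequence, its joint probability)."""
    n = len(pi)
    # delta[i]: probability of the best path so far that ends in state i.
    delta = [pi[i] * E[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev = delta
        best_prev = [max(range(n), key=lambda i: prev[i] * T[i][j])
                     for j in range(n)]
        delta = [prev[best_prev[j]] * T[best_prev[j]][j] * E[j][o]
                 for j in range(n)]
        backpointers.append(best_prev)
    # Backtrack from the most probable final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for best_prev in reversed(backpointers):
        path.append(best_prev[path[-1]])
    path.reverse()
    return path, delta[last]


# Trace (o1, o1, o2, o4, o3, o3) written with 0-based indices.
path, probability = viterbi([0, 0, 1, 3, 2, 2], T, E, PI)
print([f"s{i + 1}" for i in path], probability)
```

For this trace, the decoded sequence is s1, s1, s2, s3, s3, s3. In the method of [55], such a decoded sequence is what gets compared with the prescribed sequence of intentions.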

4.2.8 Unsupervised intentional process model discovery using IntentMiner


Based on the intention mining approach proposed by Khodabandelou et al. [55], Epure et
al. [29] proposed an approach, called FlexPAISSeer, that focuses on the difficulties of
inexperienced actors when enacting flexible processes. An inexperienced process actor, who
is not well aware of the process, may have difficulty deciding which
action to execute next under specific constraints. To address this problem, FlexPAISSeer
offers two components: IntentMiner and IntentRecommender. The first one discovers the
intentional model in an unsupervised manner, then the second one makes recommendations
based on the discovered intentional model and probabilistic calculus. In this unsupervised
approach, the clusters of events that are associated with an intention are identified as a basis
for the intentional model.
Epure et al. [29] aim to discover the intentional model through the traces of the process
participants. To this end, they mine both the intentions and the flows between them. They first
focus on the identification of the relevant data in a specific structure. Then, they follow the
IntentMiner algorithm and its six steps: extract the process participant current log, sort the log
by time, apply syntactic analysis, apply trend analysis, apply semantic analysis, and aggregate
intentions. While Khodabandelou et al. [57, 58] used BWA to find clusters of the strategies,
Epure et al. [29] proposed an algorithm to cluster the events into classes and analyze them
as intentions. The main idea of their clustering is based on the similarity and confidence
between consecutive events that are sorted by time. The attributes of events are the main
basis for finding the correlation between them. Once the intentional clusters are determined,
the next step is to mine and name the intentions by applying semantic analysis. Finally, the
algorithm aggregates the mined process instances in the intentional process model. The
intentional process model is further used by IntentRecommender for providing up-to-date
recommendations to the process participants.
Epure et al. [29] evaluated IntentMiner in an experiment with 10 participants, interacting
with a Childcare system developed by a software company and used by several child day care
centers in the Netherlands. They compared the intentions classified by IntentMiner with the
real intention of participants. Following a confusion matrix classification, the average precision
was 69% and recall was 97% (IntentMiner mined the process participants’ intentions in 97%
of cases). Khodabandelou et al. [55] reported an average precision of 0.97 and average recall
of 0.93 for their supervised intention mining technique based on Hidden Markov Models. The
precision was much better given the fact the classifier had been trained earlier.

4.3 Key performance indicators (KPIs)


Based on the definition of Flapper et al. [32], performance is the degree to which an
organization accomplishes its objectives. All performance indicators (PIs) should be linked to
the organization’s goals [32]. The PIs that reflect the critical or “key” success factors of the
organization and are used to define and measure progress towards the organizational goals are
called KPIs [32, 49]. The fact that KPIs are known means of monitoring goals is an important
motivation to take KPIs into consideration in this literature review. Some goal modeling
languages such as GRL [50] and the Business Intelligence Model [47] even support indicators
as a core concept. The studies that focus on the use of event logs in monitoring the KPIs are
worthy of attention, especially if these KPIs can be connected to goals.
As discussed earlier, the analysis of misalignments between an event log and a prescribed
process model is a good source of improvement of requirements. Process mining techniques
(i.e. conformance checking and process enhancement) make it possible to diagnose such
deviations. The severity of each deviation can be quantified to form a basis for tackling the
problem of repairing a process model such that the enhanced model can replay the log (as a
history of actual behaviors) [30,103]. Since a process model may deviate from a log at an
arbitrary number of points, making such alignment may be very far from trivial. van der Aalst
et al. [103] have demonstrated characteristics, limitations and methods for conducting
conformance checking in the context of process mining. In this same context, Fahland and van
der Aalst [30] have investigated the problem of repairing a process model to replay the log and
conform to it.
Relying on [30, 103], Dees et al. [23] proposed a methodology that repairs a process model
with respect to the behaviors that do not violate any rule and have a significant improvement
on the performance level. Involving performance levels that correspond to specific business
goals or requirements, represented by KPIs, positions their work as a goal-oriented process
mining activity. The basic inputs of the proposed methodology are an event log together with
a process model (either discovered or designed manually) represented by a Petri net. The output
is a repaired and improved process model ready to impose a better way of performing the
business process.
In a nutshell, Dees et al.’s methodology is implemented in three main steps. First, a
deviation analysis is performed to determine which deviations have a positive impact on the
process performance. To this end, deviations are correlated to a selected KPI stored as a field
in the event log. Then, a set of rules is discovered through a decision tree classification method,
where the KPI is the label field. For each rule, the corresponding class contains all traces
that comply with the rule. In the second step, traces of different classes are repaired to only
keep those deviations having a positive impact on the KPI. Then all the logs kept in all the
classes are merged to obtain a single repaired event log. Finally, the repaired log is used as a
basis to repair the process model. To this end, the process model is revised in such a way that
the resulting model is able to replay all the behavior of the repaired event log. In particular,
this technique will modify the model only to make the desired deviating behaviors possible.
Here, the desired deviating behaviors can be considered as those facts that can reflect some
neglected requirements.
Dees et al. [23] assessed the feasibility of their proposed methodology through two case
studies. The result showed that the improved process model still adheres to the desired
prescriptive model and guarantees enhanced KPI levels.
While Dees et al. [23] took advantage of performance indicators to improve business
processes, Cho et al. [17] aimed to assess whether process redesigns have been applied and to
what extent. They proposed an assessment framework, where the focus is on the process
redesign lifecycle phase, which is tightly coupled with process mining as an operational
framework to calculate identified indicators.
The framework consists of some indicators to assess whether best practices have been
applied. Best practice implementation indicators (BPIs) and process performance indicators
(PPIs) are two sets of indicators that assess process improvement resulting from the application
of best practices. It is possible to calculate both sets using standard process mining
functionalities. Cho et al. also define what data should be recorded during process execution
to enable such a calculation. The proposed framework was evaluated over case studies in a
hospital and a tour agency and compared with other approaches in the literature.
Cho et al. [17] defined 17 BPIs for 29 best practices that were previously proposed by
Reijers and Mansar [86], such as customer reduction and customer integration. They also
suggested process mining techniques appropriate for computing each indicator. In the case
where the BPI information does not exist in the event log, they suggested which supplementary
information is needed to assess the implementation of redesigns. In addition, 13 PPIs are
suggested to assess the effect of best practices on process performance. These PPIs are based
on four process performance measures explained by Reijers and Mansar [86]: Time, Cost,
Quality and Flexibility. Cho et al. gave a detailed explanation of each PPI, including how to
measure it using event logs. Most business process redesign efforts are invested in
increasing the efficiency of business processes through improving time-related indicators, e.g.,
decreasing processing time and waiting time. Calculating time-related PPIs (5 out of their 13
PPIs) using event logs is much easier than calculating quality-related indicators (4 out of their
13 PPIs) because the latter are not clearly stated in the logs. Quality-related indicators are
rather evaluated by checking the satisfaction of customers, for instance via surveys [43].
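
To illustrate why time-related PPIs are comparatively easy to obtain, the sketch below computes processing time and waiting time per case from a small event log. The log schema (case identifier, activity, start and end timestamps), the function name time_ppis, and the sample data are assumptions of this example and are not taken from Cho et al. [17]. The sketch also assumes that the activities of a case do not overlap.

```python
from datetime import datetime


def time_ppis(events):
    """Return {case_id: (processing_seconds, waiting_seconds)}.

    events: list of (case_id, activity, start, end) with ISO 8601 timestamps.
    """
    by_case = {}
    for case_id, _activity, start, end in events:
        span = (datetime.fromisoformat(start), datetime.fromisoformat(end))
        by_case.setdefault(case_id, []).append(span)
    result = {}
    for case_id, spans in by_case.items():
        spans.sort()
        # Processing time: total time spent inside activities.
        processing = sum((end - start).total_seconds() for start, end in spans)
        # Waiting time: idle gaps between consecutive activities of the case.
        waiting = sum(
            max((next_start - prev_end).total_seconds(), 0.0)
            for (_, prev_end), (next_start, _) in zip(spans, spans[1:]))
        result[case_id] = (processing, waiting)
    return result


# Hypothetical event log with a single case.
LOG = [
    ("c1", "register", "2018-01-01T09:00:00", "2018-01-01T09:10:00"),
    ("c1", "examine", "2018-01-01T09:30:00", "2018-01-01T10:00:00"),
    ("c1", "discharge", "2018-01-01T10:05:00", "2018-01-01T10:15:00"),
]
print(time_ppis(LOG))
```

Both values follow directly from the timestamps, whereas a quality-related indicator such as customer satisfaction has no corresponding field in this kind of log.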
Krathu et al. [64] used a bottom-up approach for the identification of KPIs from event logs,
business information, and process models. This approach, together with a top-down approach
for measuring business performance on the strategic level (Balanced Scorecard, BSC), forms a
framework that supports inter-organizational performance evaluation. This framework,
proposed mainly for evaluation of inter-organizational relationships, is developed as a plug-in
of ProM 6, namely the BSC EDImine plug-in [26].
Knowledge about the BSC method is essential to understand the structure of this
framework. In terms of the bottom-up approach that uses event logs for identification of KPIs,
they follow two major steps: i) frequency analysis and ii) consideration of KPIs based on the
semantics of data elements and message types. This method has been described in detail by
Krathu et al. [63]. It is worth noting that these two papers [63, 64] presented the EDImine
plug-in, specifically designed for supporting the business activities and transactions within a
network of organizations that use Electronic Data Interchange (EDI).
The framework proposed by Krathu et al. [63, 64] was validated with a case study using
data from a manufacturing company. This case study showed that the company can evaluate
the inter-organizational performance quantitatively against their business objectives. The KPIs
derived from these bottom-up process mining techniques, together with other statistical
techniques, enabled the company to evaluate their business performance more accurately.

5 Reflective discussion
Synthesizing what we found from the review of the selected papers enables us to reflect on the
state of the art and on potential research gaps.
Results of the queries on both process mining and goal modeling (Section 3) show that the
potentially synergetic effects achievable by combining goal-oriented modeling and process
mining have so far been neglected. Filling this gap can help practitioners of both domains deal
with some of their own challenges that are difficult to mitigate separately [37, 38].
The review exposed three main categories of studies: goal modeling and requirements
elicitation, intention mining, and performance indicators. The goal modeling category
(Section 4.1) focuses on the use of process mining approaches in association with process goals.
These studies have been done separately and do not represent a coherent line of research in
this context. Nevertheless, they expose a broad area of research at the intersection of the two
topics from different viewpoints.
• From an agent viewpoint, the goals behind activities of agents who contribute to a
process (e.g., employees) are considered using three different approaches. Yan et
al. [115] proposed an approach to learn goals of agents by classifying their activities in
different situations, adopting a decision tree algorithm. Armentano and Amandi [7]
considered recognizing goals that motivate agents to do their tasks using variable-order
Markov models. Finally, Outmazgin and Soffer [75] analyzed different kinds of
intentional incompliances where employees intentionally deviate from prescribed
procedures (work-arounds) to find their sources using process discovery techniques.

• A process viewpoint (or case/customer viewpoint) that considers all activities
constituting a trace is considered in the first category of papers as well. Ponnalagu et al.
[81] proposed an approach to analyze and validate a family of variants of a single process
based on a goal model. Horita et al. [45] proposed a method to detect discrepancies
between logs and prescribed models and to analyze their effects using a goal-oriented
conformance checking approach. Bernard and Andritsos [11] used process discovery in
conjunction with customers’ journeys and developed a tool to facilitate navigation
through many different such journeys in a goal-oriented fashion. Xu et al. [114]
used LDA together with process discovery to generate a topic-based clinical pathway,
beyond activity-based ones, appropriate as a representation of clinical goals. The
adaptability of discovered processes to answer new requests made on business processes
was considered by Baek et al. [8]. The above took advantage of some concepts that
primarily belong to the α-algorithm [66], such as footprint matrices.
• From an organization viewpoint, the overall goals that should be achieved by execution
of the business processes are considered. Santiputri et al. [92] propose an approach to
discover goal refinement patterns of the goal models considering the sequence of events
in multi-layered event logs.

Taking advantage of process mining capabilities is promising to enrich requirement elicitation
and management. Dabrowski et al. [22] considered eliciting new requirements using process
mining techniques along with crowdsourcing from logs of interaction of end-users with a piece
of software or application. Using event logs to find requirements, however, deserves to be more
developed not only for software specifications but also for activities performed within
organizational and social contexts [38].
In contrast with the first category, the second category of papers (Section 4.2) shows a
meaningful trace of linked studies on intention mining. Khodabandelou et al. [54] introduced
intention mining in comparison with process mining. The Map Miner Method can infer
intentional MAP models through either a supervised [24] or an unsupervised [57, 58, 59]
approach. These two approaches were compared in theoretical and practical contexts by
Khodabandelou et al. [56]. Based on the supervised intentional
process model discovery [55], Epure et al. [29] proposed an approach that aims to address the
difficulties of inexperienced actors in deciding on the next action they should take
under specific constraints. All the papers from this second category have been published
since 2013 and hence represent an emerging field that focuses on the intentions underlying
activities rather than on the activities themselves.
The third category (Section 4.3) consists of papers focusing on performance indicators.
There are four papers in this category that linked the event logs to indicators. Two papers
proposed a framework specifically for supporting transactions within a network of EDI
organizations [63, 64]. The other two papers use indicators to modify models according to
desired deviating behaviors and to assess impacts of process improvements. In the literature
on goal modeling, the main use of indicators is to measure the degree of fulfillment of
stakeholders’ goals. Indicators are essentially needed to evaluate the goals that constitute a
goal model [47]. However, this aspect of KPIs is not considered by the papers selected in this
third category. Even recent work from Gurgen Erdogan and Tarhan [41] acknowledges the
need to consider indicators linked to goals for effective process mining in healthcare, but
does so only informally and without goal models. Accordingly, the research about
goal-oriented process mining associated with performance indicators, as means to
quantitatively measure goal satisfaction, requires further attention.
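
To make the idea of an indicator as a quantitative measure of goal satisfaction concrete, the following sketch computes a simple indicator (average case duration) from an event log and maps it onto a satisfaction scale. It does not come from any of the reviewed papers. The [-100, 100] scale, the target/threshold/worst parameters, the function names, and the sample events are all assumptions, loosely modeled on how indicator values are converted in some goal-modeling tools.

```python
from datetime import datetime


def average_case_duration_hours(events):
    """Average case duration in hours.

    `events` is an iterable of (case_id, activity, ISO-8601 timestamp)
    tuples; this tuple layout is an assumption of the sketch.
    """
    spans = {}
    for case_id, _activity, timestamp in events:
        t = datetime.fromisoformat(timestamp)
        start, end = spans.get(case_id, (t, t))
        spans[case_id] = (min(start, t), max(end, t))
    durations = [(end - start).total_seconds() / 3600
                 for start, end in spans.values()]
    return sum(durations) / len(durations)


def satisfaction(value, target, threshold, worst):
    """Piecewise-linear mapping of an indicator value onto [-100, 100].

    target -> 100, threshold -> 0, worst -> -100 (assumed convention).
    Works for "lower is better" and "higher is better" indicators alike.
    """
    if (value - threshold) * (target - threshold) >= 0:
        # Value lies on the target side of the threshold.
        score = 100 * (value - threshold) / (target - threshold)
    else:
        # Value lies on the worst side of the threshold.
        score = -100 * (value - threshold) / (worst - threshold)
    return max(-100.0, min(100.0, score))


# Hypothetical log: two cases lasting 24 and 48 hours.
events = [
    ("c1", "register", "2018-01-01T08:00:00"),
    ("c1", "close", "2018-01-02T08:00:00"),
    ("c2", "register", "2018-01-01T00:00:00"),
    ("c2", "close", "2018-01-03T00:00:00"),
]
kpi = average_case_duration_hours(events)
level = satisfaction(kpi, target=24, threshold=48, worst=96)
```

Here the hypothetical goal "handle cases quickly" would be evaluated from the log itself rather than from a manually assigned value.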
This systematic literature review is of value to the many researchers working either on
goal/requirements modeling or on process mining, as it exposes knowledge about concepts and
state-of-the-art techniques that cut across these two siloed research domains. It is our hope that
the knowledge about both domains presented here will help researchers consider the potential
synergy between goal modeling and process mining. Moreover, this review is valuable to
requirements engineers interested in obtaining evidence-based definitions of existing
processes, their goals, and indicators. This enables them to measure goal satisfaction to assess
goal-process alignment and help define relevant requirements for their modification or
improvement. Practitioners can further benefit from this paper as it raises awareness about the
limitations of current activity-based process modeling and mining approaches and about
emerging techniques that attempt to discover goals and intentions behind processes.
Note that the three aforementioned categories are not entirely mature yet. Accordingly,
researchers who are interested in mining goals from event logs (goal mining) or discovering
goal-aligned processes (goal-oriented process mining) are invited to take into consideration
process mining techniques combined with the abilities of the goal modeling approaches.
Requirements engineers who practice scenario-based elicitation methods can exploit
process mining capabilities to generalize instance-level data logs into models. This allows them
to take advantage of such data, which is orders of magnitude larger than the data engineers
often use.

6 Threats to validity
The objective of a systematic review is to reveal as many primary related studies as possible
to summarize all information available about some phenomenon in a systematic and unbiased
manner. There exist several threats to the validity of this particular research and of the papers
that were reviewed in the previous section. The validity of research is concerned with the
question of the accuracy of conclusions, i.e., the alignment between research conclusions
and reality [77].
This section takes into consideration the threats and the biases that may affect the validity
of this work and of the 24 papers selected for review. The main threats are discussed according
to two categories: construct validity and internal validity [31].

6.1 Construct validity


Construct validity refers to the quality of the methodology in terms of being helpful to answer
the target research questions. In this paper, like in other literature reviews, searching for and
selecting relevant papers played a crucial role. Note that the research questions were answered
based on the papers selected from the set found in the searching stage. Hence, any mistake,
weakness, and imprecision in the first stage could threaten the accuracy of the answers.
Even though the most frequent and important keywords were used in the search queries, it
is possible that some relevant papers that have not used those keywords have not been found.
Another threat is related to the search engine limitations. Ghasemi and Amyot [39]
suggested that 22% of the papers related to process mining are returned only by Google
Scholar. This shows that Google Scholar has a notable contribution of relevant references in
the context of process mining. Also, as mentioned in Section 3.3, one set of the queries’
keywords is related to process mining. On the other hand, as Google Scholar only searches
within title or full-text, the authors chose to search in titles only (to avoid noise). Consequently,
the papers whose titles do not have the queries’ keywords were neglected by Google Scholar.
Moreover, all queries’ keywords were defined in English. Therefore, another threat that
may affect the validity of this review is the possibility of overlooking some relevant non-
English papers.

Finally, taking advantage of four search engines, including two generic ones (Google
Scholar and Scopus), makes it possible to detect a large proportion of the papers stored in scientific
databases. Nevertheless, there might be some relevant papers stored in other databases that are
not reachable with these search engines.

6.2 Internal validity


Some other biases and confounding factors can threaten the validity of this study and of the
other studies that were reviewed. For example, almost all of the papers related to intention
mining came from the same group of researchers, which could leave room for some
biases.
In this paper, one additional threat is that the selection and analysis of papers and the
application of the criteria were mostly done by one person (the first author), with confirmation
from the second author. Such an issue could be mitigated in the future by having more people do
such selection/analysis independently, and then using a consensus-building mechanism.

7 Conclusion
This paper first provided an overview of process mining and of goal modeling. Following the
systematic literature review approach, the authors ran queries using four relevant search
engines and analyzed the returned papers in order to answer the research questions.
It was first ensured that there was no literature review specifically related to goal-oriented
process mining. Although there are reviews slightly related to this topic, including the one of
Papadimitriou et al. [76], none of them has focused on mining goals or requirements from
event logs and/or exploiting goal models through process mining activities as it would be
expected in a goal-oriented process mining approach. Furthermore, several papers related to
goal-oriented process mining were found by our queries and reviewed, but none of them was
a literature review.
Regarding question RQ1, general queries with keywords related to both research fields,
process mining and goal modeling, were defined and run. Through their results, we found that
although process mining and goal modeling are growing research topics with considerable
amounts of publications, only a few studies have been conducted at their intersection.
Therefore, this suggests that goal mining and requirements engineering techniques that exploit
data logs, and goal-oriented process discovery that exploits goals, can still be considered gaps
to be filled between the process mining and requirements engineering communities.
Question RQ2 was about the definition of the main categories of existing approaches, and
their main contributions to goal-oriented process mining. In order to answer this question, the
studies conducted on the joint goal modeling and process mining techniques were obtained and
summarized along three main categories: goal modeling and requirements elicitation, intention
mining, and performance indicators. Then, we synthesized them to reflect on what has been
done in each category. The synthesis does not suggest a coherent line of research about process
mining in association with goals. However, a meaningful trace of research work was suggested
for intention mining as the second category. Moreover, we observed that research about
performance indicators (as measures of goal satisfaction) associated with process mining is
also sparse.
Goal-oriented process mining promises to make a beneficial connection from event logs,
as a major part of the process mining concept, to goals and requirements, as essential concepts
of most requirements engineering approaches. This connection opens up new horizons in the
context of business process modeling and data analysis to mine, monitor, and maintain process
goals, process performance, and satisfaction levels of stakeholders. To this end, URN, as a
standard notation that supports activity-based process models, goal models, and KPIs, is an
interesting candidate modeling notation that we would like to explore further in the future. This
research direction, which would fill the goal-oriented process mining gap highlighted in this
paper, could lead to developments that would help practitioners understand and improve
processes [37, 41], manage requirements [38], and support organizations in better measuring
and achieving their goals. Another research direction is concerned with the mining of
declarative process models (rather than procedural ones), which can also be aligned with
intentions and goals [94].

References
1 Abraham, J., & Reddy, M. C. (2010). Challenges to inter-departmental coordination of patient transfers: a
workflow perspective. International Journal of Medical Informatics, 79(2), 112–122.
doi:10.1016/j.ijmedinf.2009.11.001
2 Akhigbe, O., Amyot, D., Anda, A. A., Lessard, L., & Xiao, D. (2016). Consistency Analysis for User
Requirements Notation Models. In iStar 2016 - Ninth International i* Workshop, CEUR-WS Vol-1674 (pp. 43–
48).
3 Aldin, L., & de Cesare, S. (2011). A literature review on business process modelling: new frontiers of
reusability. Enterprise Information Systems, 5(3), 359–383. doi:10.1080/17517575.2011.557443
4 Amyot, D., Ghanavati, S., Horkoff, J., Mussbacher, G., Peyton, L., & Yu, E. (2010). Evaluating goal models
within the Goal‐oriented Requirement Language. International Journal of Intelligent Systems, 25(8), 841–877.
doi:10.1002/int.20433
5 Amyot, D., Horkoff, J., Gross, D., & Mussbacher, G. (2009). A lightweight GRL profile for i* modeling. In
Advances in Conceptual Modeling - Challenging Perspectives, LNCS 5833 (pp. 254–264). Springer.
doi:10.1007/978-3-642-04947-7
6 Amyot, D., & Mussbacher, G. (2011). User Requirements Notation: The First Ten Years, The Next Ten Years.
Journal of Software, 6(5), 747–768. http://www.jsoftware.us/vol6/jsw0605-1.pdf
7 Armentano, M. G., & Amandi, A. A. (2012). Towards a goal recognition model for the organizational memory.
In Computational Science and Its Applications – ICCSA 2012, LNCS 7335 (pp. 730–742). Springer.
doi:10.1007/978-3-642-31137-6
8 Baek, S.-J., Ko, J.-W., Kim, G.-J., Han, J.-S., & Song, Y.-J. (2012). Goal-Heuristic Analysis Method for an
Adaptive Process Mining. In Proceedings of the International Conference on IT Convergence and Security 2011,
LNEE 120 (pp. 409–418). Springer. doi:10.1007/978-94-007-2911-7_37
9 Ballou, R. H., Gilbert, S. M., & Mukherjee, A. (2000). New managerial challenges from supply chain
opportunities. Industrial Marketing Management, 29(1), 7–18. doi:10.1016/S0019-8501(99)00107-8
10 Baum, L. E., Petrie, T., Soules, G., & Weiss, N. (1970). A maximization technique occurring in the statistical
analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1), 164–171.
doi:10.1214/aoms/1177697196
11 Bernard, G., & Andritsos, P. (2017). CJM-ex: Goal-oriented exploration of customer journey maps using event
logs and data analytics. In BPM Demo Track and BPM Dissertation Award (BPM-D&DA 2017), CEUR-WS
Vol-1920 (paper 172).
12 Bernard, G., & Andritsos, P. (2017). A process mining based model for customer journey mapping. In
Proceedings of the Forum and Doctoral Consortium Papers Presented at the 29th International Conference on
Advanced Information Systems Engineering (CAiSE 2017). CEUR-WS Vol-1848 (pp. 46–56).
13 Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent dirichlet allocation. Journal of Machine Learning
Research, 3(Jan), 993–1022.
14 Bresciani, P., Perini, A., Giorgini, P., Giunchiglia, F., & Mylopoulos, J. (2004). Tropos: An Agent-Oriented
Software Development Methodology. Autonomous Agents and Multi-Agent Systems, 8(3), 203–236.
doi:10.1023/B:AGNT.0000018806.20944.ef

15 Buijs, J. C., van Dongen, B., & van der Aalst, W. M. P. (2013). Mining configurable process models from
collections of event logs. In Business Process Management 2013, LNCS 8094 (pp. 33–48). Springer.
doi:10.1007/978-3-642-40176-3_5
16 Cailliau, A., & van Lamsweerde, A. (2014). Integrating exception handling in goal models. In Requirements
Engineering Conference (RE), 2014 IEEE 22nd International (pp. 43–52). IEEE CS.
doi:10.1109/RE.2014.6912246
17 Cho, M., Song, M., Comuzzi, M., & Yoo, S. (2017). Evaluating the effect of best practices for business process
redesign: An evidence-based approach based on process mining techniques. Decision Support Systems, 104, 92–
103. doi:10.1016/j.dss.2017.10.004
18 Chung, L., Nixon, B. A., Yu, E., & Mylopoulos, J. (2012). Non-functional requirements in software
engineering. International Series in Software Engineering (Vol. 5). Springer. doi:10.1007/978-1-4615-5269-7
19 Cook, J. E., & Wolf, A. L. (1995). Process discovery and validation through event-data analysis. In ICSE’95
- Proc. of the 17th International Conference on Software Engineering (pp. 73–82). ACM.
20 Cook, J. E., & Wolf, A. L. (1998). Discovering models of software processes from event-based data. ACM
Transactions on Software Engineering and Methodology, 7(3), 215–249. doi:10.1145/287000.287001
21 Dąbrowski, J. (2017). Towards an adaptive framework for goal-oriented strategic decision-making. In
Requirements Engineering Conference (RE), 2017 IEEE 25th International (pp. 538–543). IEEE.
doi:10.1109/re.2017.53
22 Dabrowski, J., Kifetew, F. M., Muñante, D., Letier, E., Siena, A., & Susi, A. (2017). Discovering Requirements
through Goal-Driven Process Mining. In 2017 IEEE 25th International Requirements Engineering Conference
Workshops (REW) (pp. 199-203). IEEE. doi:10.1109/rew.2017.61
23 Dees, M., de Leoni, M., & Mannhardt, F. (2017). Enhancing Process Models to Improve Business
Performance: A Methodology and Case Studies. In OTM Confederated International Conferences "On the Move
to Meaningful Internet Systems" (pp. 232–251). Springer, Cham. doi:10.1007/978-3-319-69462-7_15
24 Deneckère, R., Hug, C., Khodabandelou, G., & Salinesi, C. (2014). Intentional Process Mining: Discovering
and Modeling the Goals Behind Processes using Supervised Learning. International Journal of Information
System Modeling and Design (IJISMD), 5(4), 22–47. doi:10.4018/ijismd.2014100102
25 Dowson, M. (1987). Iteration in the software process; review of the 3rd International Software Process
Workshop. In ICSE’87 - Proceedings of the 9th International Conference on Software Engineering (pp. 36–41).
IEEE CS.
26 EDImine (2011). EDImine - Mining Inter-organizational Business Processes. Retrieved from
http://edimine.ec.tuwien.ac.at/
27 El Kharbili, M., de Medeiros, A. K. A., Stein, S., & van der Aalst, W. M. (2008). Business Process Compliance
Checking: Current State and Future Challenges. In Modellierung betrieblicher Informationssysteme (MobIS
2008), LNI 141 (pp. 107–113). GI-Edition.
28 Epure, E. V, Hug, C., Deneckère, R., & Brinkkemper, S. (2013). Intention-mining: A solution to process
participant support in process aware information systems. Technical Report 2013-020, Department of Information
and Computing Sciences, Utrecht University, The Netherlands.
29 Epure, E. V, Hug, C., Deneckère, R., & Brinkkemper, S. (2014). What shall I do next? Intention mining for
flexible process enactment. In Advanced Information Systems Engineering, LNCS 8484 (pp. 473–487). Springer.
doi:10.1007/978-3-319-07881-6
30 Fahland, D., & van der Aalst, W. M. P. (2015). Model repair—aligning process models to reality. Information
Systems, 47, 220–243. doi:10.1016/j.is.2013.12.007
31 Feldt, R., & Magazinius, A. (2010). Validity Threats in Empirical Software Engineering Research-An Initial
Survey. In SEKE 2010 - Proceedings of the 22nd International Conference on Software Engineering and
Knowledge Engineering (pp. 374–379). KSI Research Inc.
32 Flapper, S. D. P., Fortuin, L., & Stoop, P. P. M. (1996). Towards consistent performance management systems.
International Journal of Operations & Production Management, 16(7), 27–37. doi:10.1108/01443579610119144
33 Fluxicon (2016). Disco. Retrieved from fluxicon.com/disco/

34 Forney, G. D. (1973). The viterbi algorithm. Proceedings of the IEEE, 61(3), 268–278.
doi:10.1109/PROC.1973.9030
35 Georgeff, M., & Rao, A. (1998). Rational software agents: from theory to practice. In Agent technology -
Foundations, Applications, and Markets (pp. 139–160). Springer. doi:10.1007/978-3-662-03678-5_8
36 Ghanavati, S., Amyot, D., & Peyton, L. (2011). A systematic review of goal-oriented requirements
management frameworks for business process compliance. In Requirements Engineering and Law (RELAW),
2011 Fourth International Workshop on (pp. 25–34). IEEE CS. doi:10.1109/relaw.2011.6050270
37 Ghasemi, M. (2018). Towards Goal-Oriented Process Mining. In Requirements Engineering Conference (RE),
IEEE 26th International (pp. 484–489). IEEE CS. doi:10.1109/RE.2018.00066
38 Ghasemi, M. (2018). What Requirements Engineering can Learn from Process Mining. In 1st International
Workshop on Learning from other Disciplines for Requirements Engineering (D4RE) (pp. 8–11), IEEE CS.
doi:10.1109/D4RE.2018.00008
39 Ghasemi, M., & Amyot, D. (2016). Process mining in healthcare: a systematised literature review. International
Journal of Electronic Healthcare, 9(1), 60–88. doi:10.1504/IJEH.2016.078745
40 Groen, E. C., Seyff, N., Ali, R., Dalpiaz, F., Doerr, J., Guzman, E., Hosseini, M., Marco, J., Oriol, M., Perini,
A., & Stade, M. (2017). The crowd in requirements engineering: The landscape and challenges. IEEE Software,
34(2), 44–52. doi:10.1109/ms.2017.33
41 Gurgen Erdogan, T., & Tarhan, A. (2018). A Goal-Driven Evaluation Method Based on Process Mining for
Healthcare Processes. Applied Sciences, 8(6), 894. doi:10.3390/app8060894
42 Hallerbach, A., Bauer, T., & Reichert, M. (2010). Capturing variability in business process models: the Provop
approach. Journal of Software Maintenance and Evolution: Research and Practice, 22(6–7), 519–546.
doi:10.1002/smr.v22:6/7
43 Hammer, M., & Champy, J. (1993). Reengineering the corporation: a manifesto for business revolution.
Zondervan. doi:10.1016/s0007-6813(05)80064-3
44 Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A k-means clustering algorithm. Journal of the
Royal Statistical Society. Series C (Applied Statistics), 28(1), 100–108. doi:10.2307/2346830
45 Horita, H., Hirayama, H., Tahara, Y., & Ohsuga, A. (2015). Towards Goal-Oriented Conformance Checking.
In 27th Int. Conf. on Software Engineering and Knowledge Engineering (SEKE 2015) (pp. 722–724). KSI
Research Inc.
46 Horkoff, J., Aydemir, F. B., Cardoso, E., Li, T., Maté, A., Paja, E., Salnitri, M., Piras, L., Mylopoulos, J., &
Giorgini, P. (2017). Goal-oriented requirements engineering: an extended systematic mapping
study. Requirements Engineering, 1–28. doi:10.1007/s00766-017-0280-z
47 Horkoff, J., Barone, D., Jiang, L., Yu, E., Amyot, D., Borgida, A., & Mylopoulos, J. (2014). Strategic business
modeling: representation and reasoning. Software & Systems Modeling, 13(3), 1015–1041. doi:10.1007/s10270-
012-0290-8
48 Horkoff, J., Li, T., Li, F.-L., Salnitri, M., Cardoso, E., Giorgini, P., & Mylopoulos, J. (2015). Using Goal
Models Downstream: A Systematic Roadmap and Literature Review. International Journal of Information System
Modeling and Design (IJISMD), 6(2), 1–42. doi:10.4018/IJISMD.2015040101
49 Hornix, P. (2007). Performance analysis of business processes through process mining. Master’s thesis,
Eindhoven University of Technology, The Netherlands.
50 ITU-T (2012). Recommendation Z.151 (10/12): User Requirements Notation (URN) - Language Definition.
http://www.itu.int/rec/T-REC-Z.151/en
51 Johann, T., & Maalej, W. (2015). Democratic mass participation of users in requirements engineering?. In
Requirements Engineering Conference (RE), 2015 IEEE 23rd International (pp. 256–261). IEEE CS.
doi:10.1109/RE.2015.7320433
52 Khodabandelou, G. (2013). Contextual recommendations using intention mining on process traces: Doctoral
consortium paper. In 2013 IEEE Seventh International Conference on Research Challenges in Information
Science (RCIS) (pp. 1–6). IEEE CS. doi:10.1109/RCIS.2013.6577728
53 Khodabandelou, G. (2014). Mining Intentional Process Models. Doctoral dissertation, University of Paris-Est,
France.

54 Khodabandelou, G., Hug, C., Deneckère, R., & Salinesi, C. (2013a). Process mining versus intention mining.
In Enterprise, Business-Process and Information Systems Modeling, LNBIP 147 (pp. 466–480). Springer.
doi:10.1007/978-3-642-38484-4
55 Khodabandelou, G., Hug, C., Deneckère, R., & Salinesi, C. (2013b). Supervised intentional process models
discovery using Hidden Markov models. In 2013 IEEE Seventh International Conference on Research Challenges
in Information Science (RCIS) (pp. 1–11). IEEE CS. doi:10.1109/RCIS.2013.6577711
56 Khodabandelou, G., Hug, C., Deneckère, R., & Salinesi, C. (2014a). Supervised vs. Unsupervised Learning
for Intentional Process Model Discovery. In Enterprise, Business-Process and Information Systems Modeling,
LNBIP 175 (pp. 215–229). Springer. doi:10.1007/978-3-662-43745-2
57 Khodabandelou, G., Hug, C., Deneckère, R., & Salinesi, C. (2014b). Unsupervised discovery of intentional
process models from event logs. In Proceedings of the 11th Working Conference on Mining Software Repositories
- MSR 2014 (pp. 282–291). ACM Press. doi:10.1145/2597073.2597101
58 Khodabandelou, G., Hug, C., & Salinesi, C. (2014). A novel approach to process mining: Intentional process
models discovery. In 2014 IEEE Eighth International Conference on Research Challenges in Information Science
(RCIS) (pp. 1–12). IEEE CS. doi:10.1109/RCIS.2014.6861040
59 Khodabandelou, G., Hug, C., & Salinesi, C. (2015). Mining Users’ Intents from Logs. International Journal of
Information System Modeling and Design (IJISMD), 6(2), 43–71. doi:10.4018/IJISMD.2015040102
60 Kingston, J., Schafer, B., & Vandenberghe, W. (2004). Towards a financial fraud ontology: A legal modelling
approach. Artificial Intelligence and Law, 12(4), 419–446. doi:10.1007/s10506-005-4163-0
61 Kitchenham, B., Pearl Brereton, O., Budgen, D., Turner, M., Bailey, J., & Linkman, S. (2009). Systematic
literature reviews in software engineering - A systematic literature review. Information and Software Technology,
51(1), 7–15. doi:10.1016/j.infsof.2008.09.009
62 Kraiem, N., Kaffela, H., Dimassi, J., & Al Khanjari, Z. (2014). Mapping from MAP models to BPMN
processes. Journal of Software Engineering, 8(4), 252–264. doi:10.3923/jse.2014.252.264
63 Krathu, W., Engel, R., Pichler, C., Zapletal, M., & Werthner, H. (2013). Identifying inter-organizational key
performance indicators from EDIFACT messages. In 2013 IEEE 15th Conference on Business Informatics (pp.
276–283). IEEE CS. doi:10.1109/CBI.2013.46
64 Krathu, W., Pichler, C., Engel, R., Zapletal, M., Werthner, H., & Huemer, C. (2014). A Framework for Inter-
Organizational Performance Analysis from EDI Messages. In 2014 IEEE 16th Conference on Business
Informatics (Vol. 1, pp. 17–24). IEEE CS. doi:10.1109/CBI.2014.19
65 Lapouchnian, A., Yu, Y., & Mylopoulos, J. (2007). Requirements-driven design and configuration
management of business processes. In Business Process Management, LNCS 4714 (pp. 246–261). Springer.
doi:10.1007/978-3-540-75183-0_18
66 Li, J., Liu, D., & Yang, B. (2007). Process Mining: Extending α-Algorithm to Mine Duplicate Tasks in Process
Logs. In Advances in Web and Network Technologies, and Information Management, LNCS 4537 (pp. 396–407).
Springer. doi:10.1007/978-3-540-72909-9_43
67 Liu, L., & Yu E. S. K. (2004). Designing information systems in social context: a goal and scenario modelling
approach. Information Systems, 29(2), 187–203. doi:10.1016/S0306-4379(03)00052-8
68 Ly, L. T., Maggi, F. M., Montali, M., Rinderle-Ma, S., & van der Aalst, W. M. (2015). Compliance monitoring
in business processes: Functionalities, application, and tool-support. Information systems, 54, 209–234.
doi:10.1016/j.is.2015.02.007
69 Maalej, W., Nayebi, M., Johann, T., & Ruhe, G. (2016). Toward data-driven requirements engineering. IEEE
Software, 33(1), 48–54. doi:10.1109/ms.2015.153
70 Malinova, M., Dijkman, R., & Mendling, J. (2013). Automatic extraction of process categories from process
model collections. In Business Process Management Workshops, LNBIP 171 (pp. 430–441). Springer.
doi:10.1007/978-3-319-06257-0_34
71 Maxwell, J. C., Antón, A. I., Swire, P., Riaz, M., & McCraw, C. M. (2012). A legal cross-references taxonomy
for reasoning about compliance requirements. Requirements Engineering, 17(2), 99–115. doi:10.1007/s00766-
012-0152-5

72 Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector
space. arXiv preprint arXiv:1301.3781.
73 Object Management Group. (2011). Business Process Model and Notation (BPMN), Version 2.0.
Formal/2011-01-03.
74 Outmazgin, N. (2012). Exploring workaround situations in business processes. In International Conference on
Business Process Management (pp. 426–437). Springer, Berlin, Heidelberg. doi:10.1007/978-3-642-36285-9_45
75 Outmazgin, N., & Soffer, P. (2016). A process mining-based analysis of business process work-arounds.
Software & Systems Modeling, 15(2), 309–323. doi:10.1007/s10270-014-0420-6
76 Papadimitriou, D., Koutrika, G., Mylopoulos, J., & Velegrakis, Y. (2016). The Goal Behind the Action: Toward
Goal-Aware Systems and Applications. ACM Trans. Database Syst., 41(4), Article 23. doi:10.1145/2934666
77 Pitman, M. A. (1998). Qualitative Research Design: An Interactive Approach. Anthropology & Education
Quarterly, 29(4), 499–501. doi:10.1525/aeq.1998.29.4.499
78 Poelmans, S. (1998). Coping strategies and distributed viscosity in a workflow management system: a case
study. In Workshop on Adaptive Workflow Systems, Seattle, USA, November (8 pages).
79 Poelmans, S. (1999). Workarounds and distributed viscosity in a workflow system: a case study. ACM
SIGGROUP Bulletin, 20(3), 11-12. doi:10.1145/605610.605618
80 Poels, G., Decreus, K., Roelens, B., & Snoeck, M. (2013). Investigating Goal-Oriented Requirements
Engineering for Business Processes. Journal of Database Management, 24(2), 35–71.
doi:10.4018/jdm.2013040103
81 Ponnalagu, K., Ghose, A., Narendra, N. C., & Dam, H. K. (2015). Goal-aligned categorization of instance
variants in knowledge-intensive processes. In Business Process Management, LNCS 9253 (pp. 350–364).
Springer. doi:10.1007/978-3-319-23063-4
82 Pourshahid, A., Amyot, D., Peyton, L., Ghanavati, S., Chen, P., Weiss, M., & Forster, A. J. (2009). Business
process management with the user requirements notation. Electronic Commerce Research, 9(4), 269–316.
doi:10.1007/s10660-009-9039-z
83 ProM Tools (2016). Retrieved from http://www.promtools.org/doku.php
84 Rabiner, L., & Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), 4–
16. doi:10.1109/MASSP.1986.1165342
85 Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition.
Proceedings of the IEEE, 77(2), 257–286. doi:10.1109/5.18626
86 Reijers, H. A., & Mansar, S. L. (2005). Best practices in business process redesign: an overview and qualitative
evaluation of successful redesign heuristics. Omega, 33(4), 283–306. doi:10.1016/j.omega.2004.04.012
87 Rolland, C., Prakash, N., & Benjamen, A. (1999). A Multi-Model View of Process Modelling. Requirements
Engineering, 4(4), 169–187. doi:10.1007/s007660050018
88 Rolland, C., & Salinesi, C. (2005). Modeling goals and reasoning with them. Engineering and Managing
Software Requirements (pp. 189-217). Springer Berlin Heidelberg. doi:10.1007/3-540-28244-0_9
89 Rozinat, A., & van der Aalst, W. M. P. (2008). Conformance checking of processes based on monitoring real
behavior. Information Systems, 33(1), 64–95. doi:10.1016/j.is.2007.07.001
90 Rubin, V. A., Mitsyuk, A. A., Lomazova, I. A., & van der Aalst, W. M. (2014). Process mining can be applied
to software too! In Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software
Engineering and Measurement (p. 57). ACM. doi:10.1145/2652524.2652583
91 Saiedian, H., Kumarakulasingam, P., & Anan, M. (2005). Scenario-based requirements analysis techniques
for real-time software systems: a comparative evaluation. Requirements Engineering, 10(1), 22–33.
doi:10.1007/s00766-004-0192-6
92 Santiputri, M., Deb, N., Khan, M. A., Ghose, A., Dam, H., & Chaki, N. (2017). Mining Goal Refinement
Patterns: Distilling Know-How from Data. In International Conference on Conceptual Modeling (pp. 69–76).
Springer, Cham. doi:10.1007/978-3-319-69904-2_6

93 Snijders, R., Dalpiaz, F., Hosseini, M., Shahri, A., & Ali, R. (2014). Crowd-centric requirements engineering.
In Proceedings of the 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing (pp. 614–
615). IEEE Computer Society. doi:10.1109/ucc.2014.96
94 Soffer, P. (2013). A State-Based Intention Driven Declarative Process Model. International Journal of
Information System Modeling and Design (IJISMD), 4(2), 44–64. doi:10.4018/jismd.2013040103
95 Spoletini, P., & Ferrari, A. (2017). Requirements Elicitation: A Look at the Future Through the Lenses of the
Past. In Requirements Engineering Conference (RE), 2017 IEEE 25th International (pp. 476–477). IEEE CS.
doi:10.1109/RE.2017.35
96 Sutcliffe, A. (2003). Scenario-based requirements engineering. In Requirements Engineering Conference, 11th
IEEE International (pp. 320–329). IEEE CS. doi:10.1109/ICRE.2003.1232776
97 Sutcliffe, A. G., Maiden, N. A., Minocha, S., & Manuel, D. (1998). Supporting scenario-based requirements
engineering. IEEE Transactions on Software Engineering, 24(12), 1072–1088. doi:10.1109/32.738340
98 Taghiabadi, E. R., Fahland, D., van Dongen, B. F., & van der Aalst, W. M. (2013). Diagnostic information for
compliance checking of temporal compliance requirements. In International Conference on Advanced
Information Systems Engineering (pp. 304–320). Springer, Berlin, Heidelberg.
doi:10.1007/978-3-642-38709-8_20
99 Tran, H., Zdun, U., & Dustdar, S. (2008). Modeling human aspects of business processes – a view-based,
model-driven approach. In Model Driven Architecture-Foundations and Applications, LNCS 5095 (pp. 246–261).
Springer. doi:10.1007/978-3-540-69100-6_17
100 van der Aalst, W. M. P. (2011). Process mining: discovery, conformance and enhancement of business
processes. Springer-Verlag Berlin Heidelberg. doi:10.1007/978-3-642-19345-3
101 van der Aalst, W. M. P. (2012). What makes a good process model? Software & Systems Modeling, 11(4),
557–569. doi:10.1007/s10270-012-0265-9
102 van der Aalst, W. M. P. (2016). Process Mining: Data Science in Action (2nd ed.). Springer-Verlag Berlin
Heidelberg. doi:10.1007/978-3-662-49851-4
103 van der Aalst, W., Adriansyah, A., & van Dongen, B. (2012). Replaying history on process models for
conformance checking and performance analysis. Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery, 2(2), 182–192. doi:10.1002/widm.1045
104 van der Aalst, W. M. P., Weijters, T., & Maruster, L. (2004). Workflow mining: Discovering process models
from event logs. Knowledge and Data Engineering, IEEE Transactions on, 16(9), 1128–1142.
doi:10.1109/TKDE.2004.47
105 van Dongen, B., Alves de Medeiros, K., Verbeek, H. M. W., Weijters, J. M. M., & van der Aalst, W. M. P.
(2005). The ProM framework: A new era in process mining tool support. In Application and Theory of Petri Nets
2005, LNCS 3536 (pp. 444–454). Springer. doi:10.1007/11494744_25
106 van Lamsweerde, A. (2001). Goal-oriented requirements engineering: A guided tour. In RE'01 Proc. of the
Fifth IEEE International Symposium on Requirements Engineering (pp. 249–261). IEEE CS.
doi:10.1109/ISRE.2001.948567
107 van Lamsweerde, A. (2004). Goal-oriented requirements engineering: a roundtrip from research to practice.
In Proc. 12th IEEE International Requirements Engineering Conference, 2004 (pp. 4–7). IEEE CS.
doi:10.1109/ICRE.2004.1335648
108 van Lamsweerde, A. (2008). Requirements engineering: from craft to discipline. In Proceedings of
SIGSOFT'08/FSE-16 (pp. 238–249). ACM. doi:10.1145/1453101.1453133
109 van Lamsweerde, A., Darimont, R., & Letier, E. (1998). Managing conflicts in goal-driven requirements
engineering. IEEE Transactions on Software Engineering, 24(11), 908–926. doi:10.1109/32.730542
110 van Lamsweerde, A., & Letier, E. (2000). Handling obstacles in goal-oriented requirements engineering.
IEEE Transactions on Software Engineering, 26(10), 978–1005. doi:10.1109/32.879820
111 Wang, J., Wong, R. K., Ding, J., Guo, Q., & Wen, L. (2013). Efficient Selection of Process Mining
Algorithms. IEEE Transactions on Services Computing, 6(4), 484–496. doi:10.1109/TSC.2012.20

112 Weijters, A. J. M. M., & Ribeiro, J. J. T. S. (2011). Flexible Heuristics Miner (FHM). In 2011 IEEE
Symposium on Computational Intelligence and Data Mining (CIDM) (pp. 310–317). IEEE CS.
doi:10.1109/CIDM.2011.5949453
113 Xu, X., Jin, T., Wei, Z., Lv, C., & Wang, J. (2016). TCPM: Topic-based clinical pathway mining. In
Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2016 IEEE First International
Conference on (pp. 292–301). IEEE CS. doi:10.1109/CHASE.2016.17
114 Xu, X., Jin, T., Wei, Z., & Wang, J. (2017). Incorporating Topic Assignment Constraint and Topic Correlation
Limitation into Clinical Goal Discovering for Clinical Pathway Mining. Journal of Healthcare Engineering, 2017.
doi:10.1155/2017/5208072
115 Yan, J., Hu, D., Liao, S. S., & Wang, H. (2014). Mining Agents’ Goals in Agent-Oriented Business Processes.
ACM Transactions on Management Information Systems, 5(4), 1–22. doi:10.1145/2629448
116 Yu, E. (1995). Modelling strategic relationships for process reengineering. Doctoral dissertation, University
of Toronto, Canada.
117 Zeni, N., Kiyavitskaya, N., Mich, L., Cordy, J. R., & Mylopoulos, J. (2015). GaiusT: supporting the extraction
of rights and obligations for regulatory compliance. Requirements Engineering, 20(1), 1–22.
doi:10.1007/s00766-013-0181-8
