PA Cooperation TASE Submission Final Submission

1
Cooperative Product Agents to Improve Manufacturing System

Flexibility: A Model-Based Decision Framework
Ilya Kovalenko, Efe C. Balta, Dawn M. Tilbury, and Kira Barton
Abstract—Due to the advancements in manufacturing system TABLE I: Nomenclature Table

technology and the ever-increasing demand for personalized
products, there is a growing desire to improve the flexibility of General Terms
manufacturing systems. Multi-agent control is one strategy that PA Product agent
has been proposed to address this challenge. The multi-agent RA Resource agent
control strategy relies on the decision making and cooperation of DA-PA Direct, actively cooperative product agent
a number of intelligent software agents to control and coordinate PTA Priced Timed Automaton
various components on the shop floor. One of the most important Priced Timed Automaton Components
agents for this control strategy is the product agent, which
is the decision maker for a single part in the manufacturing x∈X State x in the set of states X
system. To improve the flexibility and adaptability of the product ppi Physical property
P rp Mapping of states to physical properties
agent and its control strategy, this work proposes a direct and
E Edges of a PTA
active cooperation framework for the product agent. The directly
σ∈Σ Event σ in the set of events Σ
and actively cooperating product agent can identify and actively
c∈C Clock c in the space of clocks C
negotiate scheduling constraints with other agents in the system.
K Cost function
A new modeling formalism, based on priced timed automata, and
cl , c g Local clock and global clock
an optimization-based decision making strategy are proposed as
val(cx ) Valuation of a clock (local/global)
part of the framework. Two simulation case studies showcase {l,g}
how direct and active cooperation can be used to improve the I Mapping of states to their invariants
flexibility and performance of manufacturing systems. Ag Mapping of states, edges, and constraints to agents
Xm Marked states
Note to Practitioners—An intelligent product is a product Constraint Formulation
in a manufacturing system that is able to make decisions
td,τ Delay transition of time τ
based on a set of specifications and affect its own production
tse Discrete transition via the e ∈ E
process. Intelligent products have often been proposed to address
Bb , Bg Bound Constraint and Interval Gap Constraint
the challenges associated with small-batch manufacturing and
tb,l,u Time limit for (bound, lower/upper interval gap)
highly customized production. Specifically, by using intelligent
Product Agent Specification
products, manufacturers would be able to complete small orders
without the need to reconfigure or reschedule operations in the Dl Deadline for a physical property
manufacturing system. However, one of the major challenges pmi Performance metric
in the implementation of this control strategy is the need to αp Performance weight for property p
develop methods that allow intelligent products to cooperate with pr Priority of DA-PA
machines, robots, and other products in a manufacturing system. Product Agent Models
In this work, we propose a novel direct and active cooperation AEM , ADM Environment and Decision Making models
framework that allows intelligent products to communicate and Knm , Ksf t Nominal cost and constraint violation cost
cooperate with other resources and products on the shop floor. Knrg Energy cost
Using the proposed cooperation framework, intelligent products x
γj i ∈ Γ j th slack variable for state xi
can resolve scheduling conflicts and work together to meet
H, S Hard and soft constraints
individual specifications (e.g., deadlines). Two case studies, a
Product Agent Decision Making
small job shop and a large semiconductor manufacturing system,
showcase how the proposed cooperation framework can be s A path
leveraged for different types of applications. ϕ(s, ξ) Solution sequence starting at ξ = (xi , ci , γi )
ΠG (ϕ) Projection of ϕ to a subspace G ⊂ ADM
L(·) Language of an automata
#(·) Number of nonzero elements
I. I NTRODUCTION
System-level control for the shop floor becomes increasingly
into the manufacturing system and the demand for person-
more complicated as more advanced technology is integrated
alized production increases [1]. Therefore, manufacturers are
1 Manuscript constantly looking to leverage different types of system-level
received...
2 This work is supported in part by the National Science Foundation control strategies to improve the flexibility of manufacturing
(including CNS Grant #1544678 and DGE #1256260), a gift from Rockwell systems and assist with the growing complexity of the man-
Automation, and NIST Award No.70NANB19H090. ufacturing system. Previously, multi-agent approaches have
3 I. Kovalenko is with Department of Mechanical Engineering and Industrial
& Manufacturing Engineering, Pennsylvania State University, University Park, been proposed as an effective method to improve the flexibility
PA. {iqk5135}@psu.edu D. M. Tilbury and K. Barton are with the and performance of complex manufacturing systems [2], [3].
Department of Mechanical Engineering, University of Michigan, Ann Arbor, A multi-agent control strategy consists of a number of au-
MI, USA. {tilbury, bartonkl}@umich.edu. E. C. Balta is with
the Automatic Control Laboratory, ETH Zurich, 8092 Zurich, Switzerland. tonomous, cooperating intelligent agents that control various
ebalta@ethz.ch components on the plant floor [2], [3]. These agents use data
2
proach are denoted as direct, active cooperation product agents

(DA-PAs) henceforth. For this cooperation approach, a DA-
PA explores the manufacturing environment and identifies any
constraints that can be negotiated with other agents (termed
soft constraints in this work). Using this knowledge of the
environment, the DA-PA finds a desired set of resource actions
and identifies any potential soft constraints that conflict with its
individual goals. If conflicting soft constraints are identified,
the DA-PA communicates and cooperates with the respective
agents in the system to resolve these conflicts. The integration
of the proposed direct and active cooperation can improve
the flexibility and adaptability of PAs, allowing them to meet
previously infeasible deadlines in the system.
The rest of the paper is organized as follows. Background
Fig. 1: An overview of the communication and cooperation information about multi-agent control and product agents is
between product agents, resource agents, and factory floor provided in Section II. Section III overviews the components
components. This figure extends the figure in [14] by high- of the DA-PA’s knowledge base used for the cooperation
lighting the cooperation between the RAs and PAs. Note that framework. Section IV describes the framework used for a
each product agent makes decisions with the goal of creating direct and actively cooperating product agent. Case studies for
desired products from raw-material or semi-processed parts. the cooperation framework are given in Section V. Section VI
from the system, information from other agents, and a set of provides concluding remarks and future directions.
individual goals to make decisions that ultimately determine
the performance of the manufacturing system. II. BACKGROUND
Product agents and resource agents are two important agents This section overviews general multi-agent control strategies
for the multi-agent control strategy [4]. A resource agent (RA) for manufacturing systems, provides the state-of the art in
is a high-level controller for a resource (e.g. robot, machine) the development of product agent models, defines existing PA
on the shop floor and a product agent (PA) makes decisions for cooperation strategies, and presents background information
a single part in the manufacturing system [5]. PAs cooperate about the priced timed automaton (PTA) modeling formalism
with other PAs and RAs when making decisions. A more used as part of the DA-PA approach.
detailed explanation of the PAs, RAs, and the cooperation
between the two types of agents is provided in Section II.
Frequently, a PA needs to cooperate with the other PAs A. Multi-agent Control for Manufacturing Systems
and RAs to find a sequence of resource actions to complete One of the major challenges in this area of system-level
its associated part’s production requirements and not come control is the development of control strategies that can rapidly
into conflict with other agents in the system. Prior work has adapt to unexpected disturbances on the shop floor. A system-
focused on cooperation between only PAs and RAs in the level controller has to handle unexpected machine faults or
system (PA-RA cooperation) [5]–[13]. For the PAs in those breakdowns, respond to changing customer orders, and quickly
architectures, the PA finds RAs that accomplish a specification integrate new machines into the shop floor, to name a few
in its process plan (see [14] for an overview of how the PA objectives [15]. One strategy for responding to these system
finds such RAs). After the PA finds the RAs, it schedules a disturbances is through the use of a centralized, hierarchical
set of material handling and manufacturing process operations control architecture [1], [16]. For this control strategy, the
necessary to accomplish its specification. This process is either disturbance is handled by a single controller that has access to
done in a first-come, first-serve basis [5], [7]–[10], [12], [13] all of the information in the manufacturing system. However,
or by placing bids for various resources [6], [11] and does not finding new schedules and coordinating all of the system
authorize the PA to directly negotiate with other PAs over the components becomes more difficult as the complexity of
resource utilization. This method of PA cooperation is passive, the manufacturing system increases [9]. Therefore, distributed
since the PA has to find a set of feasible resource actions strategies for manufacturing system-level control have been
without being able to negotiate with other PAs or RAs over proposed to increase the flexibility of the manufacturing sys-
existing scheduling constraints. Thus, the PA is not always tem [2], [3].
able to find a sequence of actions that satisfies all of the ex- The field of agent-based systems and multi-agent control
isting scheduling constraints and accomplishes its production has been developed over the past several decades to address
goals [13]. This limitation of the passive cooperation approach challenges in various application domains, from multi-robot
reduces the flexibility of the PA and, in turn, reduces the systems [17], [18] to power grids [19], [20]. In the area
flexibility and adaptability of the multi-agent control strategy. of manufacturing, agent-based technology has been used at
The main contribution of this paper is a novel decision various levels of production, from the supply chain [21]–
making approach that uses direct and active cooperation for [23] to the factory floor [3], [24]. For system-level control of
PAs in a multi-agent controlled manufacturing system. The manufacturing systems, a wide variety of multi-agent architec-
PAs that use the proposed direct and active cooperation ap- tures have been developed [2], [25], [26]. These system-level
3
multi-agent architectures leverage the local decision-making of requirements and not come into conflict with other agents in
agents to improve the response of the manufacturing system the system. To put existing PA cooperation strategies into
if there are unplanned disturbances in the system [9], [14] or context, the following definitions are provided for direct,
to increase the customization capabilities of the system [13]. indirect, passive, and active cooperation.
Most of these architectures identify the roles and respon- Using the definitions in [30], [31], we refer to direct coop-
sibilities of the different manufacturing system agents and eration between two agents as the utilization of a one-to-one
develop the communication requirements for this control communication link between the agents for the exchange of
strategy. Therefore, as part of these architectures, a number information. Indirect cooperation between two agents implies
of agents have been proposed to represent customer orders, that agents pass information and cooperate with each other
parts in the manufacturing system, resources (e.g. robots through other agents or through another controller. Using the
and machines) on the shop floor, and other components [4]. general agent cooperation descriptions found in [32], [33],
However, for real-time system level control during production, we define that a passively cooperating agent can only request
a large majority of the architectures rely on the cooperation actions that do not conflict with decisions from other agents.
and decision making of two types of agents: product agents On the other hand, actively cooperating agents can identify
(PAs) and resource agents (RAs) [1], [4]. conflicting actions and, through negotiation, find resolutions
A resource agent (RA) is a high-level controller for a for these conflicts during decision making.
resource on the shop floor (see [5], [9], [27] for several Product agent to resource agent (PA-RA) cooperation is
examples of RAs). The goal of the RA is to safely and often direct and passive. PAs directly communicate to the RAs
efficiently schedule and complete logistic or manufacturing in the system. However, during this direct communication, PAs
tasks in the system. The RA captures the capabilities and can only request resource actions that fulfill the scheduling
current status of the associated resource by gathering data constraints provided by the RA. Examples of this type of
from the physical resource on the factory floor. The RA also cooperation can be found in most multi-agent frameworks for
sends high-level commands to the associated physical resource manufacturing systems [5], [8]–[10], [12]–[14], [34], [35].
(e.g. pick up a specific part, start a particular manufacturing Most of the product agent to product agent (PA-PA) coop-
operation, etc.), communicates information about the resource eration is indirect and passive [8], [9], [13], [14], [35]–[38].
to other agents, and cooperates with different agents to achieve In these examples, a PA captures the plans of other PAs in
its goals. the system through information provided by another agent.
A product agent (PA) makes decisions for a single part in For example, when queried by a PA, RAs provide other PA
the manufacturing system [5]. The goal for the product agent schedules as scheduling constraints that cannot be violated.
(PA) is to schedule and request various operations from the These types of multi-agent architectures are usually first-come-
system resources based on customer specifications. The PA first-serve for the PAs in the system. Therefore, there is never
receives information about the status of the physical part and direct communication between PAs in the system and there is
the capabilities of the manufacturing system from the RAs. no negotiation over the scheduling constraints.
Based on the information received from the RAs and a set Direct and active cooperation would allow a PA to negotiate
of production requirements [25], the PA makes a decision to: with other agents when the PA identifies any conflicting events.
(1) obtain the capabilities and plans of other PAs and RAs This type of cooperation provides increased flexibility for the
in the system, (2) plan and schedule future resource actions, PA, allowing the agent to more effectively achieve its goals
or (3) request the execution of a resource action. Figure 1 and react to unexpected disturbances.
shows the general flow of information for product agents,
resource agents, and the factory floor for a multi-agent control C. Model-Based Product Agents
architecture framework. To enable direct and active PA cooperation, it is necessary
Integrating intelligent PAs into manufacturing systems leads to develop and improve the intelligence of the product agent.
to several benefits to customers and manufacturers. From In a number of multi-agent architectures, PAs use rule-based
the perspective of the customer, the integration of intelligent reasoning to make planning and scheduling decisions in a
PAs creates a more customer-centric approach to manufac- manufacturing system [5], [39]–[41]. However, it is difficult
turing [28]. Customers are able to tune the parameters of to scale these rule-based reasoners to more complex manu-
individual PAs to ensure that the produced parts meet all facturing systems, where a PA has to cooperate with various
of their desired specifications. From the perspective of the different PAs and RAs in the system.
manufacturer, intelligent PAs can enable more efficient recon- Recent work has looked at developing model-based reason-
figuration and improve system flexibility as new manufacturing ing to develop the decision making of PAs [12], [13], [38],
system schedules can be developed and implemented faster by [42], [43]. Model-based PAs use a model of the system and
enabling data analysis and control at the product level [29]. optimization techniques to drive the decision making of the
PA [13]. These model-based PAs store and update a model
of the manufacturing environment in their knowledge base
B. Product Agent Cooperation [27], [44]. Using this model of the manufacturing environment
When planning and scheduling future resource actions, a PA and a set of goals provided during initialization, the PAs
must cooperate with both PAs and RAs to find a sequence of autonomously run optimization algorithms to decide which
resource actions to complete its associated part’s production actions to schedule with the RAs in the system.
4
Finite state machines (FSMs) [45] are frequently used to when the edge is traversed, while the states have cost rates,
encode the capabilities of the manufacturing systems in model- and steadily accumulate cost over time.
based PAs [12], [13], [42], [43], [46]. For the FSMs used in
model-based PAs, the states represent potential locations and Definition 1 (Priced Timed Automata). A PTA A is defined
completed manufacturing operations (i.e. new features [47]) as an 8-tuple A = (X, C, Σ, E, I, R, P, x0 ), where
for the part in the manufacturing system. The events in the • X = {x0 , x1 , ..., xnx } is a finite set of states of the PTA
FSMs represent resource actions (e.g. logistic or manufactur- • C = c0 × c1 × ... × cnc = Rn≥0 c
is the clock state space
ing operations) that allow the part to transition between the • Σ = {σ0 , σ1 , ..., σnσ } is a finite set of events
states in the system. • E ⊆ X × B × Σ × X is a finite set of edges
Other models that have been used to encode manufacturing • I : X → B is the invariant operator
system capabilities include Markov Decision Process (MDP)
• R : E × C → C is a reset operator
models [38], [42], [43], [48] and Petri nets [9], [49]. The MDP
• K : X ∪ E → [0, ∞) maps the locations and edges to
models used by PAs are extensions to PA FSM models that
their costs
also capture some of the stochastic elements of manufacturing
systems, e.g., processing time [38], [42], [43], [48]. Petri nets • x0 ∈ X is the initial state
models have also been used for decision making by PAs [9], The guards are embedded in the edge definition, with B
[49]. The tokens in these Petri net models were used to track denoting constraints on the clocks for each edge, i.e., the guard
the availability of raw material and to ensure that another agent conditions. A transition is a formal description of a change in
(a task agent) completes a requested task [49]. Note that it is the system state and the transition type. There are two types
possible to create FSM models that would be equivalent to the of transitions, discrete transitions (changes in location) and
Petri net models used by the PAs in [9], [49]. delay transitions (changes in clock values). Let ty,z denote
All of these models encode the scheduling constraints of the a transition with y ∈ {s, d} denoting if the transition is a
PA as hard constraints, i.e. these constraints cannot be violated discrete (s) or a delay (d) transition, and z = {e, τ } where
at any point by the PA. However, these scheduling constraints ty,e
e ∈ E and τ ∈ R. We notate transitions with x −−→ x′ ,
in a manufacturing system are flexible in certain cases and
where x, x′ ∈ X. We provide further definitions on the clock
can be relaxed through negotiation with other PAs and RAs
structure before formally defining the two types of transitions.
in the system. Current model-based PAs do not capture the
We utilize a two-clock structure in this work. The two-clock
negotiable actions, i.e. soft constraints, in their planning and
state space is given as C = cl × cg through the rest of the
scheduling models.
paper, noting that additional clocks may be added for various
D. Priced Timed Automata applications. The local clock, cl , represents the time spent at
a single state and, thus, is always reset to 0 when entering
This work extends existing model-based PAs for capturing
a state via a transition in the system. The global clock, cg ,
the constraints that can be negotiated by the PAs, i.e., the soft
represents the absolute time for the system and is never reset in
constraints. Since most of the models used by existing PAs
the system. Additionally we use the valuation operator val(·)
are either FSMs or can be stated as FSMs, we will extend the
to denote the value of a clock. The valuation operator captures
FSMs modeling formalism to capture the soft constraints in
the values of clocks for a state x ∈ X at the time when a
the system. Specifically, we propose to leverage the priced
discrete transition corresponding to an edge e ∈ E is taken
timed automaton (PTA) modeling formalism [50], [51] to
from the state x to a new state x′ . For example val(cxg ) denotes
extend existing PA models and capture the PA’s scheduling
the value of the global clock at state x at the time of transition
constraints.
to a new state x′ by taking an edge. Then, the two types of
Here we provide a formal definition of PTA, based on [51].
transitions for PTA are defined as follows.
A PTA is composed of a set of states, X, connected by a set of
ts,e
edges, E. Each edge is labeled with an event, σ ∈ Σ. A PTA Definition 2 (Discrete Transition). A transition x −−→ x′
also has a set of clocks, continuous variables c ∈ C = Rn≥0 c
, is a discrete transition if e = (x, σ, x′ ) ∈ E, the guard is
where nc is the number of clocks in the system. All clocks satisfied ci |= g, and clock valuations are incremented as
′ ′
have the same constant, positive growth rate and an initial val(cxg ) := val(cxg ) and val(cxl ) := 0. Note that we say a
value of zero. discrete transition is “taken” when an edge in the model is
A finite set of Boolean indicator functions of clock values, traversed.
B, represents the constraints. For example, for a clock value
c ∈ C, the Boolean indicator function Ici ≤a (c) ∈ B evaluates td,τ
Definition 3 (Delay Transitions). A transition x −−→ x′ is
to true if and only if the ith element of c is less than or equal to a delay transition if x = x′ , the invariant I(x) is true, and
a. Each element of B is either an invariant for a location, which ′
clock valuations are incremented as val(cxg ) := val(cxg ) + τ
must evaluate to true for the system to occupy the location, or ′
and val(cxl ) := τ .
a guard on an edge, which must evaluate to true to traverse an
edge. Reset maps on edges set clocks to predetermined values Note that by explicitly defining the two transition types with
when the edge is traversed. the specific clock valuation updates, we implicitly defined a
Additionally, a PTA has costs on states and edges. Costs on reset operator, R, for the PTA. We omit R in later definitions
edges are discrete increments added to the total running cost for brevity.
5
III. DA-PA K NOWLEDGE BASE are stored in the environment model of the DA-PA. A formal
This section provides a description of the direct and actively description of their encoding in the environment model is
cooperating product agent’s (DA-PA’s) knowledge base (i.e., provided in Section III-B.
the goals, environment model, and decision making model). A set of performance weights is provided during the DA-
PA’s initialization. These performance weights are computed
offline based on the customer order or the manufacturer
A. Goals requirements. Each performance weight corresponds to a met-
During initialization, a direct, actively cooperative product ric. Formally, the set of performance metrics is defined as
agent (DA-PA) is provided a process plan, exit plan, perfor- {αpi ∈ R≥0 | pi ∈ P M }. The magnitude of the weight
mance weights, and the priority of the associated part with represents the relative importance of the corresponding metric
respect to other parts in the system. for the DA-PA. For example, if a DA-PA favors minimizing
1) Process Plan: The process plan is formulated based on material cost over energy usage, the following relation will
a customer order and consists of an ordered list of desired hold αmaterial > αenergy . The DA-PA uses the performance
physical properties for the associated part with deadlines [25]. weights and metrics during its decision making to identify
This ordered list representation of the process plan captures the desirable resources and resource actions in the system.
precedence constraints of product features and is often used to 4) Agent Priority: The DA-PA is provided an importance
specify the required processes for a part in the manufacturing value, pr ∈ N, which represents the priority of the associated
system [47]. Physical properties include part locations or part physical part when compared to other parts in the system. An
features, e.g. “part at exit loading dock” or “part has hole importance value of 1 signifies that the part is of the highest
with taper” [13], [47], [52]. Formally, the process plan is priority. Therefore, a part i has a higher priority than part j if
defined as: P lan = (P Pd , Dl), where P Pd = (pp1 , pp2 , ...) pri < prj .
is the ordered list of desired physical properties ppi , and Note that while the process plan, performance weights, and
Dl : P Pd → R≥0 is the latest allowable finish time for a agent priority are all part of the DA-PA’s individual goal, they
property. are linked to the overall performance of the manufacturing
The process plan is developed offline and provided to the system. The process plan is created based on customer and
DA-PA (e.g. [53]). Since the plan is computed offline and, manufacturer specification (including order of desired physical
additionally, to ensure that communication between agents is properties and deadline) [25], [53]. The performance weights
kept in a local neighborhood [14], we assume that the DA-PA can be set by the manufacturer to prioritize certain actions
accomplishes the process plan one physical property at a time. for the DA-PA (e.g. use less energy or material). Finally, the
Therefore, the DA-PA will always look to complete the next agent priority can be set by the needs of the manufacturer and
unfinished property in the process plan. Note that the DA- based on the cost-benefit of the customer order. For example,
PA keeps track of the completed physical properties and can if a customer sends in a highly profitable order with harsh
identify the next unfinished physical property when required. penalties for production delays, the DA-PAs associated with
Future work will focus on encoding the process plan in the order would be given a high priority.
other modeling formalisms, such as finite state machines Future work will look at developing effective methods to
and priced timed automata. Encoding the process plan as integrate higher-level controllers, such as a Manufacturing Ex-
an automata will improve the expressiveness of the process ecution System (MES), with DA-PA performance weights and
plan, allow greater product personalization, and enhance the agent priority parameters. Since these high-level controllers
DA-PA decision making flexibility. For example, an automata have access to a global view of the system, they would be able
representation would allow for encoding partial ordering of to supervise the DA-PAs and tune the performance weights
production steps. Consider three processes p1, p2, and p3. and priorities to formally guarantee that the behavior of the
By defining appropriate guards and invariants, we can specify DA-PAs meet the system-level manufacturing requirements.
that both p1 and p2 to be completed before p3, but no strict
specification on the ordering between p1 and p2. B. Environment Model
2) Exit Plan: The DA-PA is initialized with an exit plan The capabilities and scheduling constraints of the local
that is used if the DA-PA cannot find a sequence of resource manufacturing environment are captured by the DA-PA’s envi-
actions to fulfill the process plan [13]. The DA-PA can exit ronment model [13], [55]. The DA-PA updates its environment
the manufacturing system by calling an exit agent in the exit model as other agents (RAs and PAs) provide the capabilities
plan. For example, an exit agent can be a human agent [54] and scheduling constraints, as described in Section IV-A. The
who can retrieve the part from the manufacturing system. The environment model is defined as the following tuple:
decision of the DA-PA to exit the system is discussed in detail AEM = (X, P rp, x0 , Σ, E, C, Γ, I, Knm , Ksf t , Ag):
in Section IV. X = {x1 , x2 , ...} is the set of states for the associated
3) Performance weights: There are various, measurable physical part
performance metrics, P M = {pm1 , pm2 , ..., pmn }, for re- P rp : X → P P maps states to physical properties
sources on the plant floor, e.g. energy and material cost. These x0 ∈ X is the current state of the part
metrics are tracked and stored by the associated RAs in the Σ = {σ0 , σ1 , ...} is a finite set of events, where each event
system. The RAs share these performance metrics when they represents the start or completion of a resource action (e.g.,
communicate their capabilities with the DA-PA. These metrics a logistic or manufacturing operation)
6
E ⊆ Q × Σ × Q is a finite set of edges each state. Therefore, all of the constraints in the model are
C = {cl , cg } are the local and the global clocks stored in the invariant mapping, I. This mapping links a state
x
Γ : {..., γ1xi , γ2xi , ..., γ1 i+n , ...} is the set of slack variables, to all of its constraints.
xi
where γj ∈ R is the j th slack variable for state xi . All time constraints, B, are logic expressions [58]. There are
I : X → B[val(C), val(Γ)] maps states to their constraints two types of constraints used in the environment model: bound
as a function of the clock and slack variables constraint and interval gap constraint. The bound constraint is
Ag : X ∪ E ∪ B[val(C), val(Γ)] → Agents maps states, a time limit that represents an absolute limit when the part can
edges, and constraints to corresponding agents enter a state or leave a state. The gap constraint represents an
Knm : X ×P M ×val(C) → R≥0 maps states, performance interval when the part cannot be at a certain state. Formally,
metrics, and valuations of clocks to nominal cost values the two constraints are defined as follows:
Ksf t : X × P M × val(Γ) → R≥0 maps states, performance
Definition 4 (Bound Constraint). A bound constraint, Bb , for
metrics, and valuations of slacks to cost-of-constraint-
state, x, is satisfied (i.e. evaluates to true) if and only if a
violation values
clock valuation (either local or global) at the state, val(cxb ),
where X, P rp, x0 , E are the discrete event dynamics, C, Γ, I for cb ∈ C satisfies:
encode the time-based state constraints, Ag is the agent
val(cxb ) ▷◁ tb ± aγ val(γbx

association function, and Knm , Ksf t are the state costs for (1)
the DA-PA. The vectors val(C) and val(Γ) are valuations
of the variables C and Γ for each state in the system. where tb ∈ R≥0 is the time limit for the clock, aγ ∈ [0, 1]
These valuations capture the values of the clocks and slack is the hardness coefficient, γbx is a slack variable associated
constraints for a state x ∈ X at the time when a discrete with the constraints on state x, and ▷◁ is one of the following
transition corresponding to an edge e ∈ E is taken from state logical operators: <, ≤, =, ≥, >.
x to another state. To ensure unique clock and slack valuations, Note that a bound constraint is a hard constraint if aγ = 0.
we enforce the AEM to be acyclic, while noting that cyclic
graphs can be “flattened” into acyclic graphs efficiently [56]. Definition 5 (Interval Gap Constraint). An interval gap con-
AEM is an extension of the PTA modeling formalism pre- straint, Bg , for state x and interval (tl , tu ) is satisfied if and
sented in Section II-D. A description of how the manufacturing only if the local and global clock valuations at the state,
environment is mapped to each component is discussed in the val(cxl ) and val(cxg ), satisfy:
rest of this section. Note that there are several differences when Z1 ∨ Z2 (2a)
AEM is compared to the PTA from Definition 1. As previously
Z1 := val(cxg ) ≤ tl + aγ,l val(γ1x ) (2b)
discussed, the reset operator, R, is left out for brevity. Only
states (not edges) of AEM have constraints and these state Z2 := val(cxg ) − val(cxl ) > tu + aγ,u val(γux ) (2c)
constraints are encoded in the invariant, I. Finally, there are where tl , tu ∈ R≥0 and tu > tl , aγl , aγu ∈ [0, 1] are
a number of extensions to the constraints and costs of AEM , the hardness coefficients, γlx , γux are unique slack variables
which are discussed here. associated with the constraint, and ¬, ∧, ∨ are logic operators
1) Discrete event dynamics: X, P rp, x0 , E represent the
NOT, AND, OR, respectively.
capabilities of the manufacturing system. X are the states of
the physical part in the system. Each state is a combination As defined previously in Section III-B, val(cxg ) indicates the
of several physical properties of the part. Physical properties global clock time when a transition is taken from state x to
are locations and physical composition of the part (e.g. “part another state. Z1 is satisfied when the transition from state x
at machine” or “part feature completed”) [52] or operations is taken before the lower bound, tl . Additionally, cxl is reset to
performed on the part (e.g. “moving the part” or “working 0 after each transition (defined in Section II-D). Thus, val(cxl )
on a manufacturing process”). P rp maps each state of the is the time between the transition to state x and the transition
environment model to one or more of these physical properties. from state x, i.e., the amount of time the part is in state x.
A current state, x0 , is updated when an RA informs the DA-PA Therefore, val(cxg ) − val(cxl ) represents the global clock time
that it has started or finished a resource action. Each event, Σ, when a transition is taken to state x from another state. Z2 is
represents an instantaneous (with delay 0) start or completion satisfied when the transition to state x is taken after the upper
of a manufacturing or logistic operation [57]. bound, tu . Z1 ∨ Z2 is satisfied when the transition is taken
2) Time-based Constraints: The state constraints, B, limit from x before the interval or a transition is taken to x after
the amount of time that the associated part can be in the the interval, i.e., the part is not in the state at that interval.
state (e.g. how long the part can stay at a location). These Note that the constraints should encode some of the stochas-
constraints are split into hard and soft constraints, B = tic properties of the system. For example, for manufacturing
H[val(C)] ∪ S[val(C), val(Γ)]. Hard constraints, H[val(C)], operations, the maximum operating time (including any uncer-
cannot be violated by the DA-PA (e.g. minimum time required tainties) can used to create the bound constraint for the system.
to complete a manufacturing operation). Soft constraints, An example is provided in Section III-C.
S[val(C), val(Γ)], contain slack variables, Γ. As described 3) Transitions: The objective of the DA-PA’s decision
in Section IV-C, these soft constraints may be violated by the making is to find clock and slack variable valuations that
DA-PA through negotiation with other agents in the system. satisfy all of the state constraints in the system. To satisfy these
Note that there is no limit for the number of constraints at constraints, the DA-PA must choose a sequence of discrete
7
transitions from E that move the associated part from one state obtaining the nominal cost parameters for each state. While
x to a new one x′ and delay transitions that keep the part at the methods to develop and update the necessary RA models
the same state x but take a finite time τ ∈ R. The definitions are out of the scope of this paper, we envision that existing
of these two transitions were provided in Section II. Detail modeling frameworks, e.g., [14], [60], can be extended to
about how a DA-PA chooses these transitions is provided in calculate Anm , bnm . For the soft constraint violation cost, the
Section IV-B. agent associated with each soft constraint, agsc , provides the
aksf t , bksf t to the DA-PA. aksf t , bksf t represent how undesirable
Remark 1. It is always possible to denote delay and discrete
the constraint violation would be to agsc , in terms of a per-
transitions in alternating sequence since delay transitions are
formance metric k. Currently, the cost of constraint violations
additive and we can define a zero time delay transition between
is determined using feedback from an operator or a subject
two consecutive discrete transitions.
matter expert. Future work will develop methods to effectively
4) Agent association function: The DA-PA builds the en- estimate these soft constraint violation costs for both PAs and
vironment model by requesting capabilities from the RAs in RAs. To ensure desirable performance of other agents, the DA-
the system [10], [13]. If the RAs provide their capabilities as PA will minimize the cost of constraint violations ensuring that
PTA-based models, the DA-PA can put together these models constraints are only violated if necessary (see Section IV-B).
to create the proposed environment model [59]. However, to
enable the proposed cooperation framework, the DA-PA must
C. Decision Making Model
keep track of the agents associated with the components in the
environment model. A decision making model is the following tuple:
The agent association function, Ag, maps states, edges, and ADM = (X, P rp, x0 , Σ, E, C, Γ, I, Ag, Xm , K), where
constraints to an associated agent. The states can be mapped (X, P rp, x0 , Σ, E, C, Γ, I, Ag) are obtained from the envi-
to one or more agents. If a state is mapped to more than one ronment model, Xm ∈ X is a set of marked states, and
agent, it is a shared state [14]. The presence of these shared K : X × val(C) × val(Γ) → R≥0 denotes the state costs.
states allows for the modular composition of the environment The set of marked states, Xm , represents desired states for
model. The edges are mapped to a single agent so that the the DA-PA [45]. A discussion of how the decision making
DA-PA can identify which agent to contact when requesting model is put together is presented in Section IV-A.
a discrete transition. Similarly, constraints are mapped to a Figure 2 shows an example of a decision making model for
single agent, representing which agent is responsible for the a DA-PA in a manufacturing system composed of 2 robots
scheduling constraint in the system. (R1 and R2), 2 machines (M1 and M2), and 2 buffers (B1
5) State costs: The nominal cost, Knm pi
(xj ), for perfor- and B2). There are 4 RAs in the system, one for each robot
mance metric pi is the cost for a part to stay at a state with and machine. The buffers are used by the robots to pick and
respect to the corresponding metric (pi ). Formally, the nominal place parts in the system and do not have an associated agent
cost for performance metric pi for a single state xj is: in this example. B1 is used by R1 and B2 is used by both
R1 and R2. The machines complete a manufacturing process
pi
Knm (xj ) = Anm · val(C x ) + bnm (3) (P1) required as part of the DA-PA’s process plan. This initial
state, x0 , and marked states, Xm are shown in Figure 2b.
where Anm ∈ R2≥0 , bnm ∈ R≥0 , and val(C x ) =
Note that in Figure 2b the initial state represents the current
[val(cxg ), val(cxl )]T are the valuations of the clocks for state x. location of the part associated with DA-PA and the marked
pi
Similarly, the soft constraint violation cost, Ksf t (xj ), is the states represent the next desired process based on the DA-
cost for the part to stay at a state if there is a constraint PA’s process plan. The dashed rectangles outline which states
violation. The soft constraint violation cost for performance and events are associated with a specific agent in the agent
metric pi for a state is: association function, e.g. states x1 , x2 , x3 and events σ1 , σ2
m are associated with the R1 agent.
x x
pi
X
Ksf t (xj ) = aksf t val(γk j ) + bksf t δ(val(γk j )) (4) Table II shows the physical properties, constraints, and costs
k=1 for each of the states in the example decision making model.
where m = |I(xj )| is the number of constraints for the state, All of the constraints (i.e., invariants) in Table II are bound
x
γk j is the slack variable associated with the k th constraint on constraints (see Definition 4). These constraints represent both
state xj , aksf t , bksf t ∈ R≥0 , ∀1 ≤ k ≤ m, and δ(·) denotes (1) desirable, but flexible, operating limits (soft constraints)
an indicator operator with 1 if the argument is nonzero and 0 and (2) physical/safety time-based limits (hard constraints)
otherwise. Note that if aksf t = 0 and bksf t = 0, then there is no for the resources in this example. For example, state x2
penalty for violating constraint k. Similarly, if the associated contains both a soft and a hard constraint. The soft constraint,
valuation of the slack variable is 0, then there is no violation val(cxl ) > tsf t x
R1,B2 − val(γl ), represents the desired operating
x x sf t
cost added, i.e. if val(γk j ) = 0, then δ(val(γk j )) = 0 and time limit, tR1,B2 , for R1 to take the part from B1 to B2.
xj k xj
val(γk ) + bsf t δ(val(γk )) = 0. Thus, the sum (4) is a If necessary, R1 can perform the material handling operation
summation of slack costs for the nonzero slack valuations. faster (i.e., violate the soft constraint) if the value of the slack
For each state, RAs use their own internal data-driven and variable is greater than 0 for that state, val(γlx ) > 0. However,
physics-based models to estimate the nominal cost parameters. as discussed later in this section, there is an energy penalty for
PAs obtain and update Anm , bnm after querying RAs and this constraint operation. In addition to the soft constraint, a
8
M1
B1 R1 B2 R2
M2
Machining Station Buffer State Edge
Material Handling Robot Robot Trajectory Marked State Initial State Indicator
(a) (b)
Fig. 2: (a) An overview of a system with 2 robots (R1, R2), 2 machines (M1, M2), 2 buffers (B1, B2). (b) A visualization
of the DA-PA’s decision making model for this system. The system has 4 resource agents representing R1, R2, M1, and M2.
R1 moves the part from B1 (x1 ) to B2 (x3 ). R2 moves the part from B2 (x3 ) to M1 (x6 ) or M2 (x9 ). M1 and M2 complete
manufacturing process 1 (P1), represented by x8 and x11 . The part is at B1 and needs P1 completed (x8 or x11 ) due to its
process plan. Note that x2 , x4 , x5 represent states when the part is moving and x7 , x10 represent states when the part is
undergoing a physical change via a manufacturing process. See Tables II and III for more information about states and events.
TABLE II: State properties, invariants, and costs for Fig. 2. TABLE III: Event descriptions for Fig. 2.
State, Properties, Event, (σ) Event Name Event, (σ) Event Name
Invariant, I(x) Cost, K(x)
(x) P rp(x)
Start moving to Finish moving to
σ1 , σ3 , σ5 σ2 , σ4 , σ6
x1 At B1 None αt Kt (x) B2/M1/M2 B1/M1/M2
sf t t Start P1 at Finish P1 at
Moving val(cx x
l ) > tR1,B2 − val(γl ) αt K (x)+ σ7 , σ9 σ8 , σ10
x2 nrg nrg
M1/M2 M1/M2
to B2 val(cxl ) > t nm
R1,B2 α e K nm (x) +Ksf t (x)
x3 / At B2/
None αt Kt (x)
but does not necessarily hold for all parameters and tasks.
x6 /x9 M1/M2 Assuming that the speed of operation and the energy cost are
x4 /
Moving
nrg
inversely proportional for Robot 2, the energy cost for moving
to val(cx nm
l ) > tR2,M 1/M 2 αt Kt (x) + αe Knm (x) to B2 is evaluated to the following: Knm nrg
(x) = cnm · val(cxl )
x5
M1/M2 nrg x
and Ksf t (x) = csf t · val(γl ), where cnm and csf t are energy
x7 / At M1/M2 nrg
val(cx nm
l ) > tM 1/M 2,P 1 αt Kt (x) + αe Knm
e (x)
usage coefficients. The penalty term, Ksf t (x), represents the
x10 P1 Working
extra energy required by the RA to accomplish this operation
x8 / At M1/M2
val(cx
g ) < td αt Kt (x) if it has to operate faster than its desired operating time limit.
x11 P1 Finished
Note that csf t > cnm to ensure that there is a penalty for
hard constraint for state x2 , val(cxl ) > tnm violating the soft constraint. Future work will test the proposed
R1,B2 , represents the
absolute minimum time, tnm , required by R1 to take the approach when there are more complex relationships between
R1,B2
part from B1 to B2. Note that tnm the speed and the energy cost of accomplishing a task, such
R1,B2 is the upper bound of the
time that it takes for R1 to perform this action. In this example, as described in [61], [62].
tnm Table III provides information about the events in the
R1,B2 is a guaranteed time limit for the robot to finish moving
a part from B1 to B2, even with stochastic uncertainties for system. Note that each event is associated with a single agent
the action. The constraints for states x4 , x5 , x7 , and x10 also and the DA-PA has to (1) schedule time at the resource, and
represent similar physical limits for corresponding resource (2) requests these events at a scheduled time. For example, the
operations. Furthermore, there is a hard constraint on x8 and DA-PA can send a request to the M1 agent to schedule the use
x11 , val(cxg ) < td , that represents a customer deadline of time of M1 at a specific time. Then, once the associated part arrives
t − d to complete the manufacturing process. at M1 (state x6 ), the DA-PA can send a request to start the
An overview of the costs is also provided in Table II. The P1 manufacturing process (event σ7 , “Start P1 at M1”) to the
state cost is a combination of two metrics: processing time and machine, thus transition the part to the next state, x7 . Further
energy expenditure. The processing time cost is the amount details on this process are provided in Section IV-D.
of time spent in that state: K t (x) = val(cxl ). The energy
expenditure cost is provided by the RAs and represents the IV. C OOPERATION F RAMEWORK
energy used by the RA for the part to stay at the state. For the The DA-PA uses the goals, environment model, and decision
purpose of simplification, we assume that RAs can save energy making model to find a new path, or sequence of resource ac-
when they take a longer time to accomplish tasks. Note that tions, in the system after receiving an update to its environment
this relationship can be true in specific regions of operation model. An update to the DA-PA’s environment model occurs
for both industrial robots [61] and manufacturing tools [62], either (1) when the DA-PA is provided information about
9
Receive update/request a recommendation or help the part exit the system [13].
from PA or RA
A. Model Creation
Update the environment model The DA-PA goes into the model creation phase after receiv-
Model ing an update for its environment model. In this phase, the
Creation Create decision making model DA-PA updates the environment model to capture the current
capabilities and constraints of the environment. Then, the DA-
PA creates the decision making model used to find the next
Solve optimization problem
sequence of resource actions.
1) Update the environment model: The DA-PA’s initial step
No in the model creation phase is to update the environment
Path found?
model using the information provided to the DA-PA by
Path Planning Yes another PA or RA in the system. Other agents can send
a message to the DA-PA to update any component of the
Soft Constraint No
Violation(s)?
AEM , i.e. (X, P rp, x0 , E, T r, C, Γ, I, Knm , Ksf t , Ag). These
updates represent changes in the manufacturing environment,
Yes e.g. new resources, updated scheduling constraints, changes
to the associated part’s current state, etc. While the commu-
Send constraint violation
request to agent(s)
nication and messaging protocols are out of scope of this
paper, examples of the modular composition of an environment
model can be found in [10], [14], [59].
No All violation(s) 2) Create the decision making model: The DA-PA com-
authorized?
bines AEM and its goals to create the decision making model,
Coordination
Yes ADM , by taking the following steps. To create ADM , the DA-
Send constraint violation PA computes a set of marked states and updates the constraints
confirmation to agent(s) and costs of AEM .
The DA-PA computes the set of marked states, Xm by
Scheduling obtaining the next unfinished physical property, ppd , from
Schedule Path not
its process plan. The DA-PA marks a state (i.e. adds the
path found state to the set of marked states in ADM ) if ppd matches
the physical property of a state in the environment model:
Fig. 3: An overview of the direct, active cooperation frame- ppd ∈ P rp(xi ), ∀xi ∈ Xm . Deadlines are incorporated in
work. The framework includes 4 steps: model creation, path ADM by adding a hard constraint, val(cg ) < Dl(ppd ), to the
planning, coordination, and scheduling. The gray boxes indi- constraints of the marked states, I(xi ), ∀xi ∈ Xm .
cate when the DA-PA communicates with other agents. To account for the priority of other the parts in the system,
the DA-PA updates the soft constraint violation costs from
the environment during initialization (2) another agent con- AEM before incorporating these costs into ADM . Let the
tacts the DA-PA with new information due to an unexpected DA-PA be denoted as ac and its priority as prac . Since
disturbance or (3) when RAs reply to a PA query for new each constraint is mapped to an agent with Ag, the DA-
information about the environment (e.g. [13]). Direct, active PA obtains a constraint’s associated agent and the agent’s
cooperation is used by the DA-PA to find paths in the system priority, ai , prai = Ag(Bi ). Soft constraints with a higher
that allow all of the agents to meet their goals, if possible. priority DA-PA (i.e. prai < prac ) are hardened by setting
Once a path is found, the DA-PA will start to schedule events the corresponding aγ = 0 (see Eq. 1). For soft constraints
with the RAs in the system. If a path is not found, the DA-PA with a lower priority DA-PA (i.e. prai > prac ), the penalty
will contact an RA to exit the system [13]. for constraint violation is removed by setting aksf t = 0 and
After the DA-PA receives an update to the environment bksf t = 0 (see Eq. 4). Soft constraints with an equal priority
model, the DA-PA goes through the 4 steps in the cooper- DA-PA (i.e. prai = prac ) are not changed.
ation framework: (1) model creation, (2) path planning, (3) The state cost, K(xj ), ∀xj ∈ X is a function that depends
coordination, and (4) scheduling. A high-level overview of the on the valuation of soft constraint violations and the clock
cooperation framework and the steps are shown in Figure 3 valuations. The DA-PA computes the state cost by weighing
and described in detail in this section. Once cooperation is the state cost for each metric in AEM , p1 , p2 , ..., pn , with its
finished, the DA-PA executes the plan developed during the performance weights, αp1 , αpi , ..., αpn . Formally, the cost for
scheduling phase through requests to the corresponding RAs. each state is defined as:
Note that if the DA-PA cannot find, schedule, or execute a Xn
pi pi
path, the DA-PA will try to re-build the environment model and K(xj ) = αpi Knm (xj ) + Ksf t (xj ) , (5)
retry each of these steps. If a DA-PA does not find a solution i=1
pi pi
within a certain number of iterations or within a bounded time, where Knm (xj ) and Ksf t (xj ) are defined in Section III-B.
pi
it will take the exit plan described in Section III-A to provide Note that Ksf t (xj ) is a summation of slack costs for nonzero
10
slack valuations. Thus, the state cost is dependant on the soft sequence defined by constraint (7b). Constraint (7c) ensures
constraint violation cost only if there is a constraint violation, that the path is in the language of ADM , ensuring that all
i.e., nonzero slack variable valuation. the constraints in ADM are satisfied and all the transitions
are well-defined. Constraint (7d) defines an upper bound on
B. Path Planning the number of constraint violations so the solution sequence
s∗ can violate at most ζ constraints. Note that we recover
After building the decision making model, the DA-PA solves
the total number of constraint violations in the optimal so-
an optimization problem to find the path on the decision
lution using #(ΠΓ (ϕ(s∗ ))). Constraint (7e) ensures that at
making model. A path on the decision making model PTA
least one of the marked states is in the solution path, i.e.,
is a sequence of transitions that will take the associated part
x ∈ ΠX (ϕ(s∗ )), x ∈ Xm ⊂ ADM . The choice of N in (7)
from its current state to a marked state in the decision making
depends on the computational time requirements of the DA-
model, while satisfying the constraints in the system.
PA. If there are no specific computation time requirements, the
Definition 6 (Path). A path s on a PTA is an ordered set DA-PA can use longest path length of the PTA to guarantee
of discrete and delay transitions in which the post-transition feasibility of the problem (see [63] for different choices of N
state of each transition is equal to the pre-transition state of in a receding horizon setting). In Section V-D, we show that
the subsequent transition. Thus a path can be notated as solving (7) by using the longest path of the PTA for N is
D E justified for a PTA with 50 states and up to 104 constraints.
s = td,τ1 ts,e1 td,τ2 ts,e2 ... td,τn ts,em (6) If the number of constraint violations, #(ΠΓ (ϕ(s∗ ))), is
and the path’s total cost is given by the sum of the costs for zero, then the DA-PA can schedule the obtained sequence of
all discrete and delay transitions in the path. actions without the need to negotiate over conflicting events,
as described in Section IV-D. If a solution contains a nonzero
As stated in Remark 1, a path will always start with a delay number of constraint violations, the DA-PA has to coordinate
transition, alternate between discrete and delay transitions, and with other agents to find a solution to the conflicting events
end with a discrete transition so that n = m in (6). prior to scheduling actions. More detail about coordination
Let ξi = (xi , ci , γi ) denote a shorthand for the state, clock is discussed in the following section. Finally, if a solution
values, and slack values respectively at the ith transition of to the above problem cannot be obtained or this optimization
the path s, s(i). Note that s(i) can be either a delay or a problem is not solved without a bounded time, the DA-PA will
discrete transition. To pose the path planning problem as an move to the scheduling step without a new plan. As described
optimization problem, we additionally define in Section IV-D, the lack of a new plan will force the DA-PA
D s(1) s(2) s(N )
E to either find a new plan or call an exit agent.
ϕ(s, ξ1 ) = ξ1 −−→ ξ2 , ξ2 −−→ ξ3 , . . . , ξN −−−→ ξN +1
as the solution sequence, where ξ1 = (x1 , c1 , γ1 ) is the C. Coordination

initial state, clock values, and slack values respectively, and The DA-PA has to coordinate its optimal path with other
N = 2n is the total number or transitions of (6), with agents if there are any constraint violations found during its
n = m. We define the projection operator ΠG (ϕ(·, ·)) that path planning phase, i.e., #(ΠΓ (ϕ(s∗ ))) > 0. An overview of
projects the solutions sequence to a subspace G ⊂ ADM . the coordination procedure is shown in Figure 4. The DA-PA
For example ΠX (ϕ(s, ξ1 )) = {⟨x1 , . . . , xf ⟩ | xi ∈ ϕ(s, ξ1 )} finds the set of corresponding agents (PA or RA) for each con-
projects the solution subspace to the states in ADM . We use straint that has a non-zero slack valuation (violated constraint):
ϕ(s) instead of ϕ(s, ξi ) for brevity when the arguments are Agents = {Ag(Bv ) | Bv ∈ S[val(C), val(Γ)], val(γv,i ) >
clear from the context. 0, val(γv,i ) ∈ V al(Γv )}, where Γv are the slack variables
Thus, the DA-PA solves the following optimization problem associated with this constraint. Note that this mapping is
to find a cost-optimal path: not one-to-one as a single agent in the set Agents can be
N
X associated with multiple violated constraints.
s∗ ∈ arg min K(xi ) (7a) The DA-PA sends a violation request to each of the agents
sk
i=1 in the set Agents, where each agent in this set is either a PA or
s.t. ¯
xi ∈ ΠX (ϕ(sk , ξ)), (7b) an RA. For every state x∗i in the state sequence of the optimal
sk ∈ L(ADM ), (7c) path ΠX (ϕ(s∗ )), the DA-PA finds the corresponding delay
td,τj
¯ ≤ ζ,
#(ΠΓ (ϕ(sk , ξ))) (7d) transition td,τj such that (x∗i −−−→ x∗i ) ∈ ϕ(s∗ ). The violation
¯ ∩ Xm ̸= ∅ request contains all of the states in the optimal path and the
ΠX (ϕ(sk , ξ)) (7e)
corresponding delay transitions. Thus, the violation request
where L(ADM ) denotes the set of all feasible paths (i.e. the represents new scheduling constraints to be considered by each
language) of ADM , ξ¯ = (x̄, c̄, 0) is the initial state and clock agent, agv ∈ Agents. The contacted agent, agv , should be able
value, #(·) denotes the number of nonzero elements in the to understand this request, deliberate, and respond to the DA-
argument, ζ ∈ Z>0 is the number of allowable constraint PA. Examples of how other PAs and RAs respond to request
violations for the solution, and sk is a path as defined in violations is provided in the supplementary material [64].
Definition 6. The solution s∗ is a path that minimizes the If all of the contacted agents authorize their constraint viola-
cost function K(xi ), where xi are the states in the solution tions, the DA-PA sends a message confirming new scheduling
11
𝒂𝒈𝑫𝑨#𝑷𝑨 𝒂𝒈𝒗 a solution was not found in a bounded time, the DA-PA can
try to find a sub-optimal optimal path for the decision-making
Model model. The methodology to find this sub-optimal path is not in
Creation the scope of this paper and will be considered in future work.
Path Another method that the DA-PA can use to find a new plan is
Planning
alt to try to capture some changes in the manufacturing system by
Soft Violation Request Violation re-exploring and re-building the environment model [14]. This
Constraint Deliberation method allows the DA-PA to address various disturbances,
Violations? Coordination
Respond to Request
faults, and failures in the manufacutring system [64]. Finally,
if a DA-PA is not able to find a solution within a bounded
alt
time or a bounded number of iterations, it will have to exit
Confirm (Adjust) the system via the exit plan by calling an exit agent provided
Authorized?
Violations Schedule
in its exit plan (described in Section III-A).
To:
Schedule E. Execution
RAs
Once the plan is calculated by a DA-PA and confirmed by

the RAs, the DA-PA executes the plan by requesting events
Fig. 4: A sequence diagram for DA-PA coordination. from the RAs at the planned times. Starting at the initial state
constraints that correspond to the delay transitions in its of the solution sequence, x1 , the DA-PA transitions between
optimal path to each of the contacted agents. This confirmation states by sending an event request to the RAs in the system
message is handled by each respective agent and can result in at the planned time. To transition between states x∗i and x∗i+1 ,
the need for this agent to re-schedule its own plans. After the DA-PA sends an event request, σe , to the agent, Ag(e)
the confirmation message is sent out, the DA-PA will start to where e ∈ E is the corresponding edge in ts,e ∈ s∗ , i.e.
ts,e
schedule the path with the RAs, as described in Section IV-D. (x∗i −−→ x∗i+1 ) ∈ ϕ(s∗ ), and σe is the event associated with
Note that any agent can deny, i.e. not authorize, a constraint that edge. For example, after waiting at the initial state x1
violation request. Examples of why another agent would deny for τ1 , the DA-PA will request the event σ1 from Ag(e1 ),
a constraint violation request are provided in the supplemen- where e1 is the edge of the first discrete transition of s∗
tary material [64]. If the constraint violation is not authorized, and σ1 is the associated event. To improve the robustness
the DA-PA may still be able to find a path in the environment of the DA-PA, the optimization for (7) is run after an event
that can fulfill its goals. If a constraint violation is denied is requested by the DA-PA. If the optimal path s∗ changes
(i.e., non-authorized), the DA-PA will update the environment between subsequent solutions of (7), the DA-PA reschedules
model to avoid violating the non-authorized constraint. Non- the time constraint schedules with the corresponding RAs,
authorized constraints are hardened by setting the hardness accordingly. Re-running the optimization after every event
coefficient in the constraint to 0 (see Definitions 4 and 5) until ensures that the DA-PA will be able to adapt its plan in the
the environment model is re-updated at a future time. Then, presence of small changes (e.g., due to system uncertainties).
the DA-PA goes through each of the steps in the cooperation Note that for more disruptive changes to the manufacturing
framework in Figure 3 with the new hardened constraints. system (e.g., machine failures, changes to customer order),
the DA-PA will need to re-build the environment model and
D. Scheduling go through each step of the decision making framework, as
If the DA-PA finds a path s∗ and all of the constraint described in [64]. The DA-PA continues to request the discrete
violations are authorized by other agents, the DA-PA starts to transitions until it completes all of the events in the path s∗ .
schedule the transitions with the RAs in the system. Similar Since the DA-PA plans for a single manufacturing process
to the previous step, for every state x∗i , the DA-PA finds the at a time, the path s∗ ends at a marked state x∗i ∈ Xm that
corresponding delay transition, td,τj . Then, the DA-PA sends signifies the completion of a manufacturing process. After all
a scheduling request to the agents associated with the state, of the transitions in the solution sequence are finished, the DA-
x∗i ∈ Ag(x∗i ). The request is an interval, [lxi , uxi ] with l PA finds the next unfinished manufacturing processes and then
and u representing the resource utilization start time and end continues following the steps in the direct, active cooperation
time, respectively. Note that for the initial state in the solution framework shown in Figure 3 (model creation, path planning,
sequence, x1 , the resource utilization start time is 0, lx1 = 0, coordination, and scheduling).
and the end time is the valuation of the global clock for
the initial state, ux1 = τ1 . For subsequent delay transitions, F. Benefits of the Proposed Methodology
the enter time corresponds to the exit time of the previous The direct, active cooperation methodology focuses on
delay transition, lxi = uxi−1 and the exit time is calculated as improving the conflict resolution of multi-agent controllers by
uxi = lxi + τi . After checking for any scheduling conflicts, extending the decision making process of a single PA, i.e.,
the RAs confirm or reject each scheduled actions. none of the other agents have access to any of the models,
If a DA-PA goes to the scheduling stage without a new plan, plans, or performance weights of the DA-PA. This direct, ac-
the DA-PA can try to find a new plan or call an exit agent. If tive cooperation approach has several benefits when compared
12
to common conflict resolution approaches in existing multi- TABLE IV: Time limits for resource actions in case study 1:
agent systems for manufacturing that use a supervisor agent small manufacturing system simulation
to resolve conflicts, e.g., [6], [35]. One of the major benefits is Resource Program Time (ticks)
that it is not necessary to combine multiple, distinct, models Robot 1 Move to Buffer 2 15
from different PAs and RAs during cooperation. These various Robot 1 Move to Buffer 2 (Fast) 5
models may be difficult to combine since they may have been Robot 2 Move to Machine 1 17
gathered at different times and may represent different sets of Robot 2 Move to Machine 2 17
resources in the manufacturing system [14]. A second benefit Machine 1 Process 1 100
of this approach is that the optimization will scale reasonably Machine 2 Process 1 150
with more PAs in the system, as shown in Section V-D. Finally,
FSM class described in [13] was developed. This custom PTA
the DA-PA retains a lot of the information internally since
class was used to encode the DA-PA environment model, the
only violated constraints are shared with other agents. If a
DA-PA decision making model, and the capability model used
manufacturing system contains a large number of different PAs
by the RAs when providing capability and scheduling infor-
that represent various parts from different customers, it may be
mation during exploration (see [14] for a detailed explanation
highly beneficial and preferable for a customer to ensure that
of the exploration process). The communication framework
all of their process plans, models, and performance weights
native to Repast Simphony was used to encode messages and
are not shared with other agents.
enable cooperation between agents.
A Cost Optimal Reachability Analysis (CORA) solver was
V. C ASE S TUDIES developed to solve the optimization in (7). CORA has been
This section presents how direct, active cooperation can be previously used to find cost-optimal paths for PTA models.
used to improve the performance of manufacturing systems Two software tools that have been used to solve CORA for
using simulations of the manufacturing environment. The PTAs are UPPAAL CORA [51], [66] and satisfiability modulo
first case study describes how the cooperation framework is theories (SMT) solvers [67], [68]. A custom implementation
applied to the system introduced in Section III-C. The second of the z3 SMT solver was used to encode soft constraints
case study shows how the DA-PA can be used to improve and find paths in the DA-PA’s decision making model. A
the flexibility of a larger manufacturing environment. The detailed overview and description of the solver implementation
scalability study analyzes how the developed optimization is found in [63]. For this implementation, the dynamics and
methodology can be scaled to larger manufacturing systems. constraints of the decision making model were transformed
The case studies provided in this section overview how into first-order logic expressions. The solver output was the
the system controlled by a multi-agent system responds to a optimal path for the DA-PA and a set of constraint violations
new, unexpected high priority order. The benefit of using DA- used in the cooperation framework. JavaScript Object Notation
PAs for this specific unexpected disturbance is then discussed. (JSON) was as an interface between the Repast Simphony
Note that it is possible for the DA-PA to respond to other simulation and the z3 SMT solver.
unexpected disturbances and faults in the system, as detailed
in the supplementary material [64]. B. Case Study 1: Small Manufacturing System
A simulation of the manufacturing system introduced in
A. Simulation setup Section III-C is used to analyze the behaviour of DA-PAs in a
The Repast Simphony agent-based simulation platform [65] small manufacturing system environment. The system contains
was used to test and analyze the behavior of a DA-PA in a two robots (R1 and R2), two machines (M1 and M2), and two
manufacturing environment. This agent-based simulation soft- buffers (B1 and B2). New parts entering the system start at
ware has been previously used to test agent-based controllers buffer 1 (B1) and, once the part enters the system, a new DA-
for manufacturing systems [13], [34]. Repast Simphony is a PA is initialized. During initialization, the DA-PA is provided
discrete-time simulation platform, as the simulation is updated the process plan and deadlines, the exit plan, the performance
every time step (also known as every tick). weights, and the priority for the physical properties in the
The developed manufacturing simulation contains high- process plan.
level discrete-event dynamics for material handling robots and In this example, the process plan contains one physical
machines used for manufacturing. The proposed cooperation property: complete process 1 (P1). The exit plan is an agent
framework extended the simulation of the model-based PAs that can take the part associated with the DA-PA out of the
described in [13]. In this updated simulation, each PA builds system. For this example, the DA-PAs attempts to minimize
and uses a finite state machine (FSM) model of a local both the time and energy expenditure in the system with
neighborhood of the manufacturing system to make decisions. αt = 1 and αe = 0.5, respectively. To test the DA-PA
The PAs build the FSM by exploring the manufacturing system cooperation framework, various priority values and deadlines
through queries to the RAs in the system, as described in [14]. were provided to the DA-PAs during initialization, as detailed
Each RA provides an FSM that corresponds to its individual in the following examples.
capabilities and the DA-PA combines this information into a DA-PAs use the exploration methodology described in Sec-
PTA that represents the capabilities of the local manufacturing tion V-A to send requests to RAs in the system to start the co-
environment [13], [14]. A custom PTA class that extends the operation process described in Section IV. For example, a PTA
13
TABLE V: Process time limits for case study 2: semiconductor

manufacturing.
Manufacturing Process Time (ticks per lot)
Process 1 (Diffusion) 150
Process 2 (Ion Implementation) 60
Process 3 (Lithography) 110
Process 4 (Ion Implementation) 100
Process 5 (Diffusion) 170
Process 6 (Lithography) 20
2
an importance value of 2. The optimal path for agpa does not
2
have any constraint violations. Thus, agpa schedules resource
actions to take the associated part to the slower machine, M2,
1
to complete its process plan. The flow times for the agpa and
Fig. 5: Set-up for case study 2: semiconductor manufacturing. 2
agpa to complete their process plans are 141 and 197 ticks,
that is created by a DA-PA for a part in the system without any respectively.
work-in-progress is shown in Figure 2 and described in detail 2) DA-PA to DA-PA cooperation with order prioritization:
in Section III-C. The time limits for the resource actions in the For this example, the manufacturer prioritizes the completion
2
simulation are shown in Table IV. The time limits in Table IV of the part associated with agpa over other parts in the system.
1 2
are used to build the constraints in the decision making model Both agpa and agpa are set-up in the same way as in the
2
shown in Table II from Section III-C. Note that for robot 1, previous example. However, agpa is given an importance value
1
the energy expenditure is much greater if the soft constraint is of 1, making it higher priority than agpa . As in the previous
1
violated, i.e. if robot 1 has to move the part to B2 faster than example, agpa initially chooses to go to M1 to complete the
1
15 ticks. Thus, cnm = 10 kWh and csf t = 100 kwH are the process plan. After 5 ticks (when the part associated with agpa
nrg
costs for staying at state x2 , Knm (x2 ) = cnm · val(cxl 2 ) and is using Robot 1), the more important part enters the system
nrg x2 2
Ksf t (x2 ) = csf t · val(γl ). To conform with the simulation and agpa is initialized. By following the proposed cooperation
2 1 1
environment, we require the part to stay in every state for at framework, agpa requests agpa to reschedule its plan. agpa
least one tick. Therefore, an additional constraint, val(cxl ) ≥ 1, authorizes the request after finding a feasible path that takes
is added for each state in the decision making model, x ∈ X. the associated part to M2 to complete P1. In this scenario,
1 2
The following three examples illustrate the benefits of direct, the flow times for the agpa and agpa are 221 and 145 ticks,
active cooperation. respectively. Compared to the first example, the process plan
2
1) Non-cooperative PAs: This example illustrates the sce- of agpa was completed faster, even though the total flow time
nario when two DA-PAs do not need to cooperate with each of the two parts is longer than the first example. Note that, even
other to complete their individual goals and satisfy the system with the increased flow time, both agents still accomplished
objectives. In this example, a part comes into the system with their process plan successfully.
1 3) DA-PA to DA-PA cooperation to meet deadlines: Both
no deadline provided for P1. A DA-PA, agpa , is created with
1 2
an importance value of 2. The DA-builds the decision making agpa and agpa are set-up in the same way as Example 1.
model shown in Figure 2 and calculates its optimal path by Unlike example 2, the priority of both parts is the same for this
2
solving (7). The solver described in the previous section is example. However, agpa is given a deadline, or hard constraint,
1
used to find a cost-optimal path in the system. Then, after of 180 ticks to complete P1. As in examples 1 and 2, agpa
1 initially chooses to go to M1 to complete the process plan.
solving the optimization problem, agpa schedules the path
2
that consists of delay and discrete transitions to complete However, when the part associated with agpa comes into the
2 1
P1 at M1. Examples of delay transitions that a DA-PA can system, agpa requests the utilization of M1 from agpa . This
1
request include “stay at B1 for 1 tick” from buffer 1 agent, request is authorized by agpa after it finds another feasible
“stay moving to B2 for 15 ticks“ from the robot 1 agent, path in the system (complete P1 at M2). The flow times for
1 2
or “work on process 1 for 100 ticks” from the machine 1 agpa and agpa are 221 and 145 ticks, respectively. The process
agent. Examples of discrete transitions that a DA-PA can plans for both DA-PAs, including the deadline constraint of
2
request include “start moving to B2/M1/M2“ from the robot 180 ticks for agpa , are accomplished successfully due to the
agents or “start p1 at M1/M2” from the machine agents. These cooperation between the two DA-PAs.
discrete, delay transitions allow the DA-PA to communicate 4) Example 4 - DA-PA to RA cooperation to meet a dead-
with RAs, enabling the DA-PA to make dynamic decisions line: Only one DA-PA is considered for this example. In this
for the associated part and accomplish the desired objectives example, a part comes into the system and DA-PA is initialized
1
in the manufacturing system. with a deadline of 135 ticks. To accomplish this task, agpa
A second part comes into the system 10 ticks after the first cooperates with the robot 1 (R1) RA and requests to move to
part. This is an unexpected disturbance for the system that B2 fast (in 5 ticks), violating R1’s soft constraint. Since this
1 1
was not captured by agpa during its initial model creation will ensure that agpa meets its deadline, the RA confirms this
1
and, therefore, not considered by agpa during path planning. violation at the expense of a higher energy expenditure. In this
2 1
The second DA-PA, agpa , does not have deadlines and has scenario, the flow time for agpa is 135 ticks and the DA-PA
14
is able to successfully accomplish its process plan due to the

cooperation between the DA-PA and the RA.
C. Case Study 2: Semiconductor Manufacturing Facility

A simulation of a semiconductor manufacturing facility was
used to study the behavior of direct and active cooperation in
a larger manufacturing system. The set-up for the simulation
is shown in Figure 5. This simulated system contains 20
manufacturing stations that perform different manufacturing
operations (diffusion, ion implementation, and lithography).
The process times for each of these manufacturing operations
are shown in Table V. Each machine has a local buffer where
parts stays before starting a requested operation and after the
operation is finished.
These stations are connected by a network of material Fig. 6: An illustration of the initial PTA created by the DA-PA
handling robots that move products, known as lots, around associated with the first part in Order A. This PTA is created
the system. The robots take 10-20 ticks to move the lots by exploring the system through queries to RAs for process 1
between the various buffers, depending on the distance in and combining the information provided through these queries,
the simulation. The process plan for the lots is provided as described in [14].
in Figure 5. This simulation has been previously used to this desired path, as described in Section IV. This path contains
test the behavior of multi-agent controllers for manufacturing discrete and delay transitions that represent the various buffer-
systems [13], [55]. ing, handling, and processing operations in the manufacturing
A customer requests an order (order A) of 10 lots with the system, e.g., “hold at enter buffer” is a delay transition, “start
process plan shown in Figure 5. To complete this order, 10 moving to bay 1” is a discrete transition, “start the diffusion
parts, each with their own DA-PA, enter the system with the operation“ is a discrete transition. These DA-PAs continue to
same priority every 50 ticks. A higher priority order (order B) build PTAs and use these PTAs until all of the desired physical
of 10 lots with the same process plan is requested 500 ticks processes have been completed.
after the first order. Similar to the first order, 10 new parts During the simulation, multiple DA-PAs will sometimes
controlled by 10 DA-PAs enter into the system every 50 ticks. request the utilization of the same resource because they
The order B DA-PAs have a higher importance value than are performing the scheduling task at the same time. In the
the order A DA-PAs. Neither order has a specific deadline, simulation, the RA that is associated with the resource agent
but the manufacturer would prefer to complete order B parts will utilize a first-come first-serve strategy toward accepting
faster than order A parts. The introduction of new parts and these requests. The RA will then reject all of the other DA-PA
their DA-PAs into the system is an unexpected disturbance for requests. If a DA-PA is rejected, it will re-explore the system,
the work-in-progress parts as the DA-PAs for these work-in- re-create the environment model, and proceed with the steps in
progress parts did not capture the presence of new parts during the cooperation framework. Note that the rejected DA-PA will
model creation and path planning. The number of allowable end up either coordinating with the DA-PA that scheduled the
constraint violations was set to 2, i.e., ζ = 2, when the DA-PA initial resource or finding a new resource to utilize. If there are
solved the optimization problem in (7). multiple unsuccessful cooperation requests, the DA-PA will
Similar to the previous case study, DA-PAs use the explo- call the exit plan, as described in Section III-A.
ration methodology described in Section V-A to send requests The process times shown in Table V are deterministic and
to RAs in the system to start the cooperation process described represent the maximum possible time that the machines will
in Section IV. For example, the PTA that is created by the take to finish the operation, i.e., the operation is guaranteed
first part from Order A that required process 1 is illustrated to be finished in the number of ticks in Table V. These time
in Figure 6. Initially, the part is at the Enter buffer and the limits are provided by the RAs to the PAs as the limits for hard
PTA contains states that represent buffering, moving, and bound constraints. If the machine finishes the operation before
processing operations. The marked states represent how the the time limit, the part will be moved to the local machine
DA-PA can accomplish process 1 (diffusion) in the local buffer and the machine remains idle or accepts another job
manufacturing environment. Each of these states contains an if one is requested. Future work will look to encode more
invariant that represents the time required to stay at state x, as stochastic uncertainties in the PTA constraints and test these
a function of val(cxl ). As new parts come into the system and uncertainties in this simulation. While the processing times are
DA-PAs complete their desired processes, the PTAs that are deterministic, there is some stochasticity in the simulation due
built by new DA-PAs will contain more interval gap constraints to the choices made by the DA-PAs. Sometimes, the DA-PAs
that represent the other DA-PA schedules in the system. will have multiple optimal paths in the PTA, i.e., there are
The DA-PAs use the PTAs from model creation to make several paths that take the same amount of time to reach a
decisions in the manufacturing system. After finding the desired state. For this case, the DA-PAs randomly select one
desired path on the PTA, the DA-PA schedules and executes of these paths when making their decision.
15
Fig. 9: Computational scale study for the optimization problem

in the case study using the PTA model in Fig. 6. Mean and
one standard deviation over ten repetitions are shown.
the flow time in the manufacturing system by 8.5%.
In this case study, order A DA-PAs always accepted an order
B DA-PA violation request. Thus, if an order B DA-PA found
Fig. 7: Distribution of order A and order B average flow times
multiple optimal paths with different constraint violations,
for the semiconductor manufacturing case study.
order B DA-PAs would randomly choose one of the paths
and request the corresponding violation. Hence, the order A
DA-PAs would sometimes have to significantly change their
schedules due to these requests. Therefore, (1) including hard
deadlines for the DA-PAs and/or (2) accurate estimation of
soft constraint violation costs could reduce the variability of
the flow times and decrease average flow times for order A
DA-PAs. Note that even with this behavior, all of the DA-
PAs successfully accomplished their process plan since neither
order A nor order B DA-PAs had any process plan deadlines.
D. Scalability Study
We perform a computational study to illustrate the scala-
bility of the proposed optimization approach for solving (7).
Fig. 8: Distribution of individual part flow times for order B In this study, we utilize the PTA shown in Figure 6, which
in the semiconductor manufacturing case study. is created by the DA-PA associated with the first part in
Twenty five trials were conducted to test the effect of direct Order A. We study the case with a fixed number of RAs and
and active PA to PA cooperation. Each set of five trials had an increasing number of DA-PAs in the system. To simulate
a different number of order B DA-PAs that would request the effect of increasing DA-PAs, we implement an increasing
violations from order A DA-PAs. The number of order B DA- number of interval gap constraints (see (2)) and solve the
PAs that request violations was 0, 2, 4, 6, 8, and 10 for each optimization problem outlined in (7).
set of five trials, respectively. The results of the scaling study are presented in Figure 9.
For every trial, the average flow time of the 10 parts of The simulation runs are repeated 10 times for each number
order A and order B was calculated. Figure 7 shows the of constraints shown on the horizontal axis of Figure 9. All
distribution for the average flow time of orders A and B based simulations are performed on a personal laptop computer with
on the number of cooperative products in order B. Each data Intel Core i7-7700HQ CPU. The longest path of the PTA
point in Figure 7 represents the minimum, maximum, and in Figure 6 includes 14 discrete transitions and a horizon
mean of the average flow time for 5 trials. Note that as more length corresponding to 16 discrete transitions is used in all
cooperation is allowed for order B DA-PAs, the average flow simulations. The horizon of 16 discrete transitions corresponds
time of these parts decreases. However, as order B DA-PAs to N = 32 for (7). The nominal optimization problem without
utilize the resources required by order A DA-PAs, the average the injected interval gap constraints has 52 bound constraints
flow time of order A parts increases. Thus, direct and active in total, which takes 0.73 ± 0.22 seconds to compute. The
cooperation will allow higher priority parts to be completed results show that the number of constraints has a pronounced
quicker, but slow down the production of lower priority parts. effect on the computation time only after approximately 5000
Figure 8 shows the resulting flow times for order B parts for interval gap constraints, which results in over 3 seconds of
all of the trials. Note that Figure 8 represents the distribution computation time to solve (7). The computation time remains
of flow time for individual order B parts, while Figure 7 well below 30 seconds for more than 105 constraints. Each
represents the distribution of average flow times for orders DA-PA typically requests 10-50 constraints when scheduling
A and B in the system. On average, order B DA-PAs with events in the semiconductor manufacturing case study de-
cooperation capabilities had a flow time of 1238 ticks. Order scribed in Section V-C. Therefore, 105 interval constraints
B DA-PAs that did not cooperate had a flow time of 1354 would correspond to somewhere between 103 to 104 DA-
ticks. Thus, the DA-PAs that were able to cooperate reduced PAs in the system. At these scales, both communication
16
and physical bottlenecks may be more pronounced than the [5] A. M. Farid and L. Ribeiro, “An Axiomatic Design of a Multiagent
computational bottleneck caused by solving (7). Reconfigurable Mechatronic System Architecture,” IEEE Transactions
on Industrial Informatics, vol. 11, no. 5, pp. 1142–1155, 2015.
[6] N. K. C. C. Krothapalli and A. V. V. Deshmukh, “Design of negotiation
protocols for multi-agent manufacturing systems,” International Journal
VI. C ONCLUSIONS AND F UTURE W ORK of Production Research, vol. 37, no. 7, pp. 1601–1624, 1999.
[7] T. N. Wong, C. W. Leung, K. L. Mak, and R. Y. Fung, “Dynamic
Multi-agent control has been proposed to improve the flexi- shopfloor scheduling in multi-agent manufacturing systems,” Expert
bility of manufacturing systems. One of the primary agents in Systems with Applications, vol. 31, no. 3, pp. 486–494, 2006.
this control strategy is the product agent. The product agent [8] P. Leitao, A. W. Colombo, and F. Restivo, “An Approach to the Formal
Specification of Holonic Control Systems,” in First International Con-
must cooperate with other product agents and resource agents ference on Industrial Applications of Holonic and Multi-Agent Systems,
in the system when making scheduling and planning decisions. 2003, pp. 59–70.
However, existing product agent cooperation strategies have [9] J. Barbosa, P. Leitão, E. Adam, and D. Trentesaux, “Dynamic self-
organization in holonic multi-agent manufacturing systems: The ADA-
been primarily passive, reducing the flexibility and adaptability COR evolution,” Computers in Industry, vol. 66, pp. 99–111, 2015.
of the product agent and, in turn, of this control strategy. [10] C. Legat, D. Schütz, and B. Vogel-Heuser, “Automatic generation of
A model-based decision making framework to enable direct, field control strategies for supporting (re-)engineering of manufacturing
systems,” Journal of Intelligent Manufacturing, vol. 25, no. 5, pp. 1101–
actively cooperative product agents is introduced. For this 1111, 2014.
framework, a new environment model that leverages the priced [11] M. K. Lim, Z. Zhang, and W. Goh, “An iterative agent bidding
timed automata modeling formalism is proposed. This model mechanism for responsive manufacturing,” Engineering Applications of
Artificial Intelligence, vol. 22, no. 7, pp. 1068–1079, 2009.
explicitly represents the scheduling constraints in the system [12] S. Rehberger, L. Spreiter, and B. Vogel-Heuser, “An agent-based ap-
and separates the desirable, but flexible, time limits (soft proach for dependable planning of production sequences in automated
constraints) from the physical/safety time limits (hard con- production systems,” At-Automatisierungstechnik, vol. 65, no. 11, pp.
766–778, 2017.
straints). An optimization problem is formulated that allows [13] I. Kovalenko, D. Tilbury, and K. Barton, “The model-based product
the direct and active product agent to identify any conflicting agent: A control oriented architecture for intelligent products in multi-
scheduling constraints. A strategy for product agent coordina- agent manufacturing systems,” Control Engineering Practice, vol. 86,
no. March, pp. 105–117, may 2019.
tion and negotiation with other agents in the system is also [14] I. Kovalenko, D. Ryashentseva, B. Vogel-Heuser, D. Tilbury, and
presented. This strategy resolves the conflicts in the system, K. Barton, “Dynamic Resource Task Negotiation to Enable Product
allowing the agents to effectively control and coordinate the Agent Exploration in Multi-Agent Manufacturing Systems,” Robotics
and Automation Letters, vol. 4, no. 3, pp. 2854–2861, jun 2019.
components on the factory floor. This framework is tested in [15] K. Barton, F. Maturana, and D. Tilbury, “Closing the Loop in IoT-
a simulated manufacturing environment, showing how direct, enabled Manufacturing Systems: Challenges and Opportunities,” in
active cooperation can improve the flexibility and performance ACC. IEEE, 2018, pp. 5503–5509.
[16] F. Lopez, Y. Shao, Z. M. Mao, J. Moyne, K. Barton, and D. Tilbury,
of manufacturing systems. “A software-defined framework for the integrated management of smart
Future work will include the following: developing and manufacturing systems,” Manufacturing Letters, vol. 15, pp. 18–21,
incorporating new process plan formalisms to capture a larger 2018.
[17] J. Ota, “Multi-agent robot systems as distributed autonomous systems,”
variety of production orders; creating methods to effectively Advanced engineering informatics, vol. 20, no. 1, pp. 59–70, 2006.
estimate soft constraint violations costs for product and re- [18] J. A. Fax and R. M. Murray, “Information flow and cooperative control
source agents; developing algorithms for agent violation de- of vehicle formations,” IEEE transactions on automatic control, vol. 49,
no. 9, pp. 1465–1476, 2004.
liberation after a violation request; and integrating further [19] S. D. J. McArthur, E. M. Davidson, V. M. Catterson, A. L. Dimeas,
time-based specifications in the product agent process plan. N. D. Hatziargyriou, F. Ponci, and T. Funabashi, “Multi-Agent Systems
Further simulation testing of the proposed methodology will for Power Engineering Applications-Part I: Concepts, Approaches, and
Technical Challenges,” IEEE Transactions on Power Systems, vol. 22,
include incorporating hard constraints for larger manufacturing no. 4, pp. 1743–1752, 2007.
systems, varying the number of allowable constraint violations, [20] A. Kantamneni, L. E. Brown, G. Parker, and W. W. Weaver, “Survey of
and increasing the number of different priority orders in the multi-agent systems for microgrid control,” Engineering applications of
artificial intelligence, vol. 45, pp. 192–203, 2015.
system. The direct, actively cooperating product agent will [21] J. Du, V. Sugumaran, and B. Gao, “Rfid and multi-agent based architec-
also be integrated into a physical manufacturing testbed that ture for information sharing in prefabricated component supply chain,”
run agent-based controllers to test the proposed framework IEEE Access, vol. 5, pp. 4132–4139, 2017.
[22] F. Hsieh, “Dynamic configuration and collaborative scheduling in supply
with real industrial hardware and software. chains based on scalable multi-agent architecture,” Journal of Industrial
Engineering International, vol. 15, pp. 249–269, 2019.
[23] N. Mishra, A. Singh, S. Kumari, K. Govindan, and S. I. Ali, “Cloud-
R EFERENCES based multi-agent architecture for effective planning and scheduling
of distributed manufacturing,” International Journal of Production Re-
[1] D. M. Tilbury, “Cyber-Physical Manufacturing Systems,” Annual Review search, vol. 54, no. 23, pp. 7115–7128, 2016.
of Control, Robotics, and Autonomous Systems, vol. 2, no. 1, 2019. [24] P. Leitao, S. Karnouskos, L. Ribeiro, J. Lee, T. Strasser, and A. W.
[2] P. Leitão, “Agent-based distributed manufacturing control: A state- Colombo, “Smart agents in industrial cyber–physical systems,” Proceed-
of-the-art survey,” Engineering Applications of Artificial Intelligence, ings of the IEEE, vol. 104, no. 5, pp. 1086–1101, 2016.
vol. 22, no. 7, pp. 979–991, 2009. [25] W. Shen, L. Wang, and Q. Hao, “Agent-based distributed manufactur-
[3] B. Vogel-Heuser, J. Lee, and P. Leitão, “Agents enabling cyber-physical ing process planning and scheduling: A state-of-the-art survey,” IEEE
production systems,” At-Automatisierungstechnik, vol. 63, no. 10, pp. Transactions on Systems, Man and Cybernetics Part C: Applications and
777–789, 2015. Reviews, vol. 36, no. 4, pp. 563–577, 2006.
[4] P. Vrba, P. Tichý, V. Mařı́k, K. H. Hall, R. J. Staron, F. P. Maturana, [26] P. Leitão, V. Mařı́k, and P. Vrba, “Past, present, and future of industrial
and P. Kadera, “Rockwell automation’s holonic and multiagent control agent applications,” IEEE Transactions on Industrial Informatics, vol. 9,
systems compendium,” IEEE Transactions on Systems, Man and Cy- no. 4, pp. 2360–2372, 2012.
bernetics Part C: Applications and Reviews, vol. 41, no. 1, pp. 14–30, [27] W. Lepuschitz, A. Zoitl, M. Vallee, and M. Merdan, “Toward self-
2010. reconfiguration of manufacturing systems using automation agents,”
17
IEEE Transactions on Systems, Man and Cybernetics Part C: Appli- Journal of Manufacturing Technology and Management, vol. 8, no. 1-3,
cations and Reviews, vol. 41, no. 1, pp. 52–69, 2011. pp. 37–57, 2006.
[28] D. McFarlane, V. Giannikas, A. C. Y. Wong, and M. Harrison, “Product [50] K. Larsen, G. Behrmann, E. Brinksma, A. Fehnker, T. Hune, P. Pet-
intelligence in industrial control: Theory and practice,” Annual Reviews tersson, and J. Romijn, “As cheap as possible: Efficient cost-optimal
in Control, vol. 37, no. 1, pp. 69–88, 2013. reachability for priced timed automata,” Lecture Notes in Computer
[29] W. Dai, H. Nishi, V. Vyatkin, V. Huang, Y. Shi, and X. Guan, “Industrial Science (including subseries Lecture Notes in Artificial Intelligence and
edge computing: Enabling embedded intelligence,” IEEE Industrial Lecture Notes in Bioinformatics), vol. 2102, pp. 493–505, 2001.
Electronics Magazine, vol. 13, no. 4, pp. 48–56, 2019. [51] G. Behrmann, K. G. Larsen, and J. I. Rasmussen, “Priced timed
[30] L. Panait and S. Luke, “Cooperative Multi-Agent Learning: The State of automata: Algorithms and applications,” in International Symposium on
the Art,” Autonomous Agents and Multi-Agent Systems, vol. 3, no. 11, Formal Methods for Components and Objects. Springer, 2004, pp.
pp. 387–434, 2005. 162–182.
[31] M. Owliya, M. Saadat, G. G. Jules, M. Goharian, and R. Anane, [52] D. L. Pepyne and C. G. Cassandras, “Control of hybrid systems in
“Agent-based interaction protocols and topologies for manufacturing manufacturing,” Proceedings of the IEEE, vol. 88, no. 7, pp. 1108–1122,
task allocation,” IEEE Transactions on Systems, Man, and Cybernetics: 2000.
Systems, vol. 43, no. 1, pp. 38–52, 2012. [53] F. Ocker, I. Kovalenko, K. Barton, D. Tilbury, and B. Vogel-Heuser,
[32] M. M. Polycarpou, Y. Yang, and K. M. Passino, “Cooperative control “A Framework for Automatic Initialization of Multi-Agent Production
of distributed multi-agent systems,” IEEE Control Systems Magazine, Systems Using Semantic Web Technologies,” IEEE Robotics and Au-
vol. 21, no. June 2001, pp. 1–27, 2001. tomation Letters, vol. 4, no. 4, pp. 4330–4337, jul 2019.
[33] J. Leitner, “Multi-robot cooperation in space: A survey,” Proceedings - [54] G. Zheng, I. Kovalenko, K. Barton, and D. Tilbury, “Integrating Hu-
2009 Advanced Technologies for Enhanced Quality of Life, AT-EQUAL man Operators into Agent-based Manufacturing Systems: A Table-top
2009, pp. 144–151, 2009. Demonstration,” Procedia manufacturing, vol. 17, pp. 326–333, 2018.
[34] G. Jules and M. Saadat, “Agent cooperation mechanism for decentralized [55] I. Kovalenko, D. Tilbury, and K. Barton, “Priced Timed Automata
manufacturing scheduling,” IEEE Transactions on Systems, Man, and Models for Control of Intelligent Product Agents in Manufacturing
Cybernetics: Systems, vol. 47, no. 12, pp. 3351–3362, 2016. Systems,” in 15th Workshop on Discrete Event Systems (WODES 2020),
[35] S. Wang, J. Wan, D. Zhang, D. Li, and C. Zhang, “Towards smart factory 2020.
for industry 4.0: a self-organized multi-agent system with big data based [56] C. J. Myers and T.-Y. Meng, “Synthesis of timed asynchronous circuits,”
feedback and coordination,” Computer networks, vol. 101, pp. 158–168, IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
2016. vol. 1, no. 2, pp. 106–119, 1993.
[36] Y. Sallez, T. Berger, and D. Trentesaux, “A stigmergic approach for [57] B. R. Ferrer, B. Ahmad, A. Lobov, D. A. Vera, J. L. M. Lastra,
dynamic routing of active products in FMS,” Computers in Industry, and R. Harrison, “An approach for knowledge-driven product, process
vol. 60, no. 3, pp. 204–216, 2009. and resource mappings for assembly automation,” IEEE International
Conference on Automation Science and Engineering, vol. 2015-Octob,
[37] K. Hadeli, P. Valckenaers, M. Kollingbaum, and H. Van Brussel, “Multi-
pp. 1104–1109, 2015.
agent coordination and control using stigmergy,” Computers in Industry,
[58] A. Bemporad and M. Morari, “Control of systems integrating logic,
vol. 53, no. 1, pp. 75–96, 2004.
dynamics, and constraints,” Automatica, vol. 35, no. 3, pp. 407–427,
[38] C. Hofmann, C. Krahe, N. Stricker, and G. Lanza, “Autonomous
1999.
production control for matrix production based on deep q-learning,”
[59] J. F. Pétin, D. Gouyon, and G. Morel, “Supervisory synthesis for
Procedia CIRP, vol. 88, pp. 25–30, 2020.
product-driven automation and its application to a flexible assembly
[39] A. Lüder, A. Klostermeyer, J. Peschke, A. Bratoukhine, and T. Sauter,
cell,” Control Engineering Practice, vol. 15, no. 5, pp. 595–614, 2007.
“Distributed automation: PABADIS vs. HMS,” IEEE Transactions on
[60] S. Rehberger, L. Spreiter, and B. Vogel-Heuser, “An Agent Approach to
Industrial Informatics, vol. 1, no. 1, pp. 31–38, 2005.
Flexible Automated Production Systems Based on Discrete and Contin-
[40] M. G. Marchetta, F. Mayer, and R. Q. Forradellas, “A reference uous Reasoning,” 2016 IEEE International Conference on Automation
framework following a proactive approach for Product Lifecycle Man- Science and Engineering (CASE), pp. 1249–1256, 2016.
agement,” Computers in Industry, vol. 62, no. 7, pp. 672–683, 2011. [61] A. Liu, H. Liu, B. Yao, W. Xu, and M. Yang, “Energy consumption
[41] H. Tang, D. Li, S. Wang, and Z. Dong, “CASOA: An Architecture for modeling of industrial robot based on simulated power data and param-
Agent-Based Manufacturing System in the Context of Industry 4.0,” eter identification,” Advances in Mechanical Engineering, vol. 10, no. 5,
IEEE Access, vol. 6, pp. 12 746–12 754, 2017. p. 1687814018773852, 2018.
[42] P. G. Ansola, A. Garcı́a, J. de las Morenas, and J. de las Morenas, [62] J. Lv, R. Tang, W. Tang, Y. Liu, Y. Zhang, and S. Jia, “An investigation
“Aligning Decision Support with Shop Floor Operations: A Proposal into reducing the spindle acceleration energy consumption of machine
of Intelligent Product Based on BDI Physical Agents,” in Service tools,” Journal of Cleaner Production, vol. 143, pp. 794–803, 2017.
Orientation in Holonic and Multi-agent Manufacturing, T. Borangiu, [63] E. C. Balta, I. Kovalenko, I. A. Spiegel, D. M. Tilbury, and K. Barton,
A. Thomas, and D. Trentesaux, Eds. Cham: Springer International “Model predictive control of priced timed automata encoded with first-
Publishing, 2015, pp. 91–100. order logic,” IEEE Transactions on Control Systems Technology, 2021.
[43] J. De Las Morenas, A. Garcia-Higuera, and P. Garcia-Ansola, “Shop [64] I. Kovalenko, E. C. Balta, D. M. Tilbury, and K. Barton,
Floor Control: A Physical Agents Approach for PLC-Controlled Sys- “Details About Agent Response in “Cooperative Product Agents to
tems,” IEEE Transactions on Industrial Informatics, vol. 13, no. 5, pp. Improve Manufacturing System Flexibility: A Model-Based Decision
2417–2427, 2017. Framework”,” University of Michigan Barton Research Group,
[44] H. Kaindl, M. Vallée, and E. Arnautovic, “Self-representation for self- Tech. Rep., 2021. [Online]. Available: https://brg.engin.umich.edu/wp-
configuration and monitoring in agent-based flexible automation sys- content/uploads/sites/131/2022/02/DA-PA-agentResponse.pdf
tems,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, [65] M. J. North, N. T. Collier, J. Ozik, E. R. Tatara, C. M. Macal, M. Bragen,
vol. 43, no. 1, pp. 164–175, 2012. and P. Sydelko, “Complex adaptive systems modeling with Repast
[45] C. G. Cassandras and S. Lafortune, Introduction to discrete event Simphony,” Complex Adaptive Systems Modeling, vol. 1, no. 1, p. 3,
systems, 2nd ed. Springer Science & Business Media, 2009. 2013.
[46] I. Kovalenko, K. Barton, and D. Tilbury, “Design and Implementation of [66] G. Behrmann, “UPPAAL CORA,” 2014. [Online]. Available:
an Intelligent Product Agent Architecture in Manufacturing Systems,” in http://people.cs.aau.dk/ adavid/cora/index.html
IEEE Emerging Technology and Factory Automation (ETFA), jan 2017, [67] D. Bhave, S. N. Krishna, and A. Trivedi, “On Nonlinear Prices in Timed
pp. 1–8. Automata,” Electronic Proceedings in Theoretical Computer Science,
[47] E. C. Balta, Y. Lin, K. Barton, D. M. Tilbury, and Z. M. Mao, vol. 232, p. 65, 2016.
“Production as a Service: A Digital Manufacturing Framework for [68] M. Hekmatnejad, G. Pedrielli, and G. Fainekos, “Task scheduling with
Optimizing Utilization,” IEEE Transactions on Automation Science and nonlinear costs using SMT solvers,” IEEE International Conference on
Engineering, pp. 1–11, 2018. Automation Science and Engineering, vol. 2019-August, pp. 183–188,
[48] S. Baer, J. Bakakeu, R. Meyes, and T. Meisen, “Multi-agent reinforce- 2019.
ment learning for job shop scheduling in flexible manufacturing sys-
tems,” in 2019 Second International Conference on Artificial Intelligence
for Industries (AI4I). IEEE, 2019, pp. 22–25.
[49] P. Leitão, A. W. Colombo, and F. Restivo, “A formal specification ap-
proach for holonic control systems: The ADACOR case,” International
18
Ilya Kovalenko (Member, IEEE) received the B.S. Kira Barton (Senior Member, IEEE) received the
degree in mechanical engineering from the Georgia B.Sc. degree in mechanical engineering from the
Institute of Technology, Atlanta, GA, USA, in 2015, University of Colorado at Boulder, in 2001, and
and the M.S. and Ph.D degrees in mechanical engi- the M.Sc. and Ph.D. degrees in mechanical engi-
neering from the University of Michigan, Ann Arbor, neering from the University of Illinois at Urbana-
MI, USA, in 2018 and 2020, respectively. He held Champaign, in 2006 and 2010, respectively. She held
a postdoctoral research position at the University a postdoctoral research position at the University of
of Michigan, from Fall 2020 until Fall 2021. In Illinois, from Fall 2010 until Fall 2011, at which
January 2022, he joined the Department of Mechan- point she joined the Mechanical Engineering De-
ical Engineering and the Department of Industrial partment at the University of Michigan, Ann Arbor.
& Manufacturing Engineering at the Pennsylvania She is currently an Associate Professor with the
State University, where he currently holds the position of Assistant Professor. Department of Mechanical Engineering and a Core Robotics Faculty Member
His research interests include control theory and application, with a focus on at the University of Michigan. She conducts research in modeling, sensing,
multi-agent systems, advanced manufacturing, cyber-physical systems, and and control for applications in advanced manufacturing and robotics, with
robotics. a specialization in cooperative learning and smart manufacturing. She was
selected as a Miller Faculty Scholar from 2018-2021, and was a recipient
of an NSF CAREER Award, in 2014, the 2015 SME Outstanding Young
Manufacturing Engineer Award, the 2015 University of Illinois, Department
of Mechanical Science and Engineering Outstanding Young Alumni Award,
the 2016 University of Michigan, Department of Mechanical Engineering
Department Achievement Award, and the 2017 ASME Dynamic Systems and
Control Young Investigator Award.
Efe C. Balta (Member, IEEE) received the B.S.

degree in Manufacturing Engineering from the Me-
chanical Engineering Faculty of Istanbul Technical
University, Turkey, in 2016, and the M.S. and Ph.D.
degrees in Mechanical Engineering from the Univer-
sity of Michigan, Ann Arbor, in 2018 and 2021, re-
spectively. Since June 2021, he is a postdoctoral re-
searcher at the Automatic Control Laboratory (IfA)
at ETH Zurich, Switzerland. His research interests
include control theory, learning-based control, and
optimization methods with applications to additive
manufacturing, cyber-physical systems, robotics, and advanced manufacturing
processes.
Dawn Tilbury (Fellow Member, IEEE) received

the B.S. degree (summa cum laude) in electrical
engineering from the University of Minnesota, in
1989, and the M.S. and Ph.D. degrees in electrical
engineering and computer sciences from the Uni-
versity of California, Berkeley, in 1992 and 1994,
respectively. In 1995, she joined the Mechanical
Engineering Department, University of Michigan,
Ann Arbor, where she is currently a Professor, with
a joint appointment as a Professor of EECS. In
June of 2017, she became the Assistant Director
for Engineering at the National Science Foundation, while maintaining her
position at the University of Michigan. She has published more than 150
articles in refereed journals and conference proceedings. Her research interests
include control systems, including applications to robotics and manufacturing
systems.

PA Cooperation TASE Submission Final Submission

Uploaded by

Copyright:

Available Formats

PA Cooperation TASE Submission Final Submission

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

PA Cooperation TASE Submission Final Submission

Uploaded by

Copyright:

Available Formats

1

Cooperative Product Agents to Improve Manufacturing System

Abstract—Due to the advancements in manufacturing system TABLE I: Nomenclature Table

proach are denoted as direct, active cooperation product agents

Machining Station Buffer State Edge

as the solution sequence, where ξ1 = (x1 , c1 , γ1 ) is the C. Coordination

Once the plan is calculated by a DA-PA and confirmed by

TABLE V: Process time limits for case study 2: semiconductor

is able to successfully accomplish its process plan due to the

C. Case Study 2: Semiconductor Manufacturing Facility

Fig. 9: Computational scale study for the optimization problem

Efe C. Balta (Member, IEEE) received the B.S.

Dawn Tilbury (Fellow Member, IEEE) received

You might also like