PA Cooperation TASE Submission Final Submission
PA Cooperation TASE Submission Final Submission
PA Cooperation TASE Submission Final Submission
multi-agent architectures leverage the local decision-making of requirements and not come into conflict with other agents in
agents to improve the response of the manufacturing system the system. To put existing PA cooperation strategies into
if there are unplanned disturbances in the system [9], [14] or context, the following definitions are provided for direct,
to increase the customization capabilities of the system [13]. indirect, passive, and active cooperation.
Most of these architectures identify the roles and respon- Using the definitions in [30], [31], we refer to direct coop-
sibilities of the different manufacturing system agents and eration between two agents as the utilization of a one-to-one
develop the communication requirements for this control communication link between the agents for the exchange of
strategy. Therefore, as part of these architectures, a number information. Indirect cooperation between two agents implies
of agents have been proposed to represent customer orders, that agents pass information and cooperate with each other
parts in the manufacturing system, resources (e.g. robots through other agents or through another controller. Using the
and machines) on the shop floor, and other components [4]. general agent cooperation descriptions found in [32], [33],
However, for real-time system level control during production, we define that a passively cooperating agent can only request
a large majority of the architectures rely on the cooperation actions that do not conflict with decisions from other agents.
and decision making of two types of agents: product agents On the other hand, actively cooperating agents can identify
(PAs) and resource agents (RAs) [1], [4]. conflicting actions and, through negotiation, find resolutions
A resource agent (RA) is a high-level controller for a for these conflicts during decision making.
resource on the shop floor (see [5], [9], [27] for several Product agent to resource agent (PA-RA) cooperation is
examples of RAs). The goal of the RA is to safely and often direct and passive. PAs directly communicate to the RAs
efficiently schedule and complete logistic or manufacturing in the system. However, during this direct communication, PAs
tasks in the system. The RA captures the capabilities and can only request resource actions that fulfill the scheduling
current status of the associated resource by gathering data constraints provided by the RA. Examples of this type of
from the physical resource on the factory floor. The RA also cooperation can be found in most multi-agent frameworks for
sends high-level commands to the associated physical resource manufacturing systems [5], [8]–[10], [12]–[14], [34], [35].
(e.g. pick up a specific part, start a particular manufacturing Most of the product agent to product agent (PA-PA) coop-
operation, etc.), communicates information about the resource eration is indirect and passive [8], [9], [13], [14], [35]–[38].
to other agents, and cooperates with different agents to achieve In these examples, a PA captures the plans of other PAs in
its goals. the system through information provided by another agent.
A product agent (PA) makes decisions for a single part in For example, when queried by a PA, RAs provide other PA
the manufacturing system [5]. The goal for the product agent schedules as scheduling constraints that cannot be violated.
(PA) is to schedule and request various operations from the These types of multi-agent architectures are usually first-come-
system resources based on customer specifications. The PA first-serve for the PAs in the system. Therefore, there is never
receives information about the status of the physical part and direct communication between PAs in the system and there is
the capabilities of the manufacturing system from the RAs. no negotiation over the scheduling constraints.
Based on the information received from the RAs and a set Direct and active cooperation would allow a PA to negotiate
of production requirements [25], the PA makes a decision to: with other agents when the PA identifies any conflicting events.
(1) obtain the capabilities and plans of other PAs and RAs This type of cooperation provides increased flexibility for the
in the system, (2) plan and schedule future resource actions, PA, allowing the agent to more effectively achieve its goals
or (3) request the execution of a resource action. Figure 1 and react to unexpected disturbances.
shows the general flow of information for product agents,
resource agents, and the factory floor for a multi-agent control C. Model-Based Product Agents
architecture framework. To enable direct and active PA cooperation, it is necessary
Integrating intelligent PAs into manufacturing systems leads to develop and improve the intelligence of the product agent.
to several benefits to customers and manufacturers. From In a number of multi-agent architectures, PAs use rule-based
the perspective of the customer, the integration of intelligent reasoning to make planning and scheduling decisions in a
PAs creates a more customer-centric approach to manufac- manufacturing system [5], [39]–[41]. However, it is difficult
turing [28]. Customers are able to tune the parameters of to scale these rule-based reasoners to more complex manu-
individual PAs to ensure that the produced parts meet all facturing systems, where a PA has to cooperate with various
of their desired specifications. From the perspective of the different PAs and RAs in the system.
manufacturer, intelligent PAs can enable more efficient recon- Recent work has looked at developing model-based reason-
figuration and improve system flexibility as new manufacturing ing to develop the decision making of PAs [12], [13], [38],
system schedules can be developed and implemented faster by [42], [43]. Model-based PAs use a model of the system and
enabling data analysis and control at the product level [29]. optimization techniques to drive the decision making of the
PA [13]. These model-based PAs store and update a model
of the manufacturing environment in their knowledge base
B. Product Agent Cooperation [27], [44]. Using this model of the manufacturing environment
When planning and scheduling future resource actions, a PA and a set of goals provided during initialization, the PAs
must cooperate with both PAs and RAs to find a sequence of autonomously run optimization algorithms to decide which
resource actions to complete its associated part’s production actions to schedule with the RAs in the system.
4
Finite state machines (FSMs) [45] are frequently used to when the edge is traversed, while the states have cost rates,
encode the capabilities of the manufacturing systems in model- and steadily accumulate cost over time.
based PAs [12], [13], [42], [43], [46]. For the FSMs used in
model-based PAs, the states represent potential locations and Definition 1 (Priced Timed Automata). A PTA A is defined
completed manufacturing operations (i.e. new features [47]) as an 8-tuple A = (X, C, Σ, E, I, R, P, x0 ), where
for the part in the manufacturing system. The events in the • X = {x0 , x1 , ..., xnx } is a finite set of states of the PTA
FSMs represent resource actions (e.g. logistic or manufactur- • C = c0 × c1 × ... × cnc = Rn≥0 c
is the clock state space
ing operations) that allow the part to transition between the • Σ = {σ0 , σ1 , ..., σnσ } is a finite set of events
states in the system. • E ⊆ X × B × Σ × X is a finite set of edges
Other models that have been used to encode manufacturing • I : X → B is the invariant operator
system capabilities include Markov Decision Process (MDP)
• R : E × C → C is a reset operator
models [38], [42], [43], [48] and Petri nets [9], [49]. The MDP
• K : X ∪ E → [0, ∞) maps the locations and edges to
models used by PAs are extensions to PA FSM models that
their costs
also capture some of the stochastic elements of manufacturing
systems, e.g., processing time [38], [42], [43], [48]. Petri nets • x0 ∈ X is the initial state
models have also been used for decision making by PAs [9], The guards are embedded in the edge definition, with B
[49]. The tokens in these Petri net models were used to track denoting constraints on the clocks for each edge, i.e., the guard
the availability of raw material and to ensure that another agent conditions. A transition is a formal description of a change in
(a task agent) completes a requested task [49]. Note that it is the system state and the transition type. There are two types
possible to create FSM models that would be equivalent to the of transitions, discrete transitions (changes in location) and
Petri net models used by the PAs in [9], [49]. delay transitions (changes in clock values). Let ty,z denote
All of these models encode the scheduling constraints of the a transition with y ∈ {s, d} denoting if the transition is a
PA as hard constraints, i.e. these constraints cannot be violated discrete (s) or a delay (d) transition, and z = {e, τ } where
at any point by the PA. However, these scheduling constraints ty,e
e ∈ E and τ ∈ R. We notate transitions with x −−→ x′ ,
in a manufacturing system are flexible in certain cases and
where x, x′ ∈ X. We provide further definitions on the clock
can be relaxed through negotiation with other PAs and RAs
structure before formally defining the two types of transitions.
in the system. Current model-based PAs do not capture the
We utilize a two-clock structure in this work. The two-clock
negotiable actions, i.e. soft constraints, in their planning and
state space is given as C = cl × cg through the rest of the
scheduling models.
paper, noting that additional clocks may be added for various
D. Priced Timed Automata applications. The local clock, cl , represents the time spent at
a single state and, thus, is always reset to 0 when entering
This work extends existing model-based PAs for capturing
a state via a transition in the system. The global clock, cg ,
the constraints that can be negotiated by the PAs, i.e., the soft
represents the absolute time for the system and is never reset in
constraints. Since most of the models used by existing PAs
the system. Additionally we use the valuation operator val(·)
are either FSMs or can be stated as FSMs, we will extend the
to denote the value of a clock. The valuation operator captures
FSMs modeling formalism to capture the soft constraints in
the values of clocks for a state x ∈ X at the time when a
the system. Specifically, we propose to leverage the priced
discrete transition corresponding to an edge e ∈ E is taken
timed automaton (PTA) modeling formalism [50], [51] to
from the state x to a new state x′ . For example val(cxg ) denotes
extend existing PA models and capture the PA’s scheduling
the value of the global clock at state x at the time of transition
constraints.
to a new state x′ by taking an edge. Then, the two types of
Here we provide a formal definition of PTA, based on [51].
transitions for PTA are defined as follows.
A PTA is composed of a set of states, X, connected by a set of
ts,e
edges, E. Each edge is labeled with an event, σ ∈ Σ. A PTA Definition 2 (Discrete Transition). A transition x −−→ x′
also has a set of clocks, continuous variables c ∈ C = Rn≥0 c
, is a discrete transition if e = (x, σ, x′ ) ∈ E, the guard is
where nc is the number of clocks in the system. All clocks satisfied ci |= g, and clock valuations are incremented as
′ ′
have the same constant, positive growth rate and an initial val(cxg ) := val(cxg ) and val(cxl ) := 0. Note that we say a
value of zero. discrete transition is “taken” when an edge in the model is
A finite set of Boolean indicator functions of clock values, traversed.
B, represents the constraints. For example, for a clock value
c ∈ C, the Boolean indicator function Ici ≤a (c) ∈ B evaluates td,τ
Definition 3 (Delay Transitions). A transition x −−→ x′ is
to true if and only if the ith element of c is less than or equal to a delay transition if x = x′ , the invariant I(x) is true, and
a. Each element of B is either an invariant for a location, which ′
clock valuations are incremented as val(cxg ) := val(cxg ) + τ
must evaluate to true for the system to occupy the location, or ′
and val(cxl ) := τ .
a guard on an edge, which must evaluate to true to traverse an
edge. Reset maps on edges set clocks to predetermined values Note that by explicitly defining the two transition types with
when the edge is traversed. the specific clock valuation updates, we implicitly defined a
Additionally, a PTA has costs on states and edges. Costs on reset operator, R, for the PTA. We omit R in later definitions
edges are discrete increments added to the total running cost for brevity.
5
III. DA-PA K NOWLEDGE BASE are stored in the environment model of the DA-PA. A formal
This section provides a description of the direct and actively description of their encoding in the environment model is
cooperating product agent’s (DA-PA’s) knowledge base (i.e., provided in Section III-B.
the goals, environment model, and decision making model). A set of performance weights is provided during the DA-
PA’s initialization. These performance weights are computed
offline based on the customer order or the manufacturer
A. Goals requirements. Each performance weight corresponds to a met-
During initialization, a direct, actively cooperative product ric. Formally, the set of performance metrics is defined as
agent (DA-PA) is provided a process plan, exit plan, perfor- {αpi ∈ R≥0 | pi ∈ P M }. The magnitude of the weight
mance weights, and the priority of the associated part with represents the relative importance of the corresponding metric
respect to other parts in the system. for the DA-PA. For example, if a DA-PA favors minimizing
1) Process Plan: The process plan is formulated based on material cost over energy usage, the following relation will
a customer order and consists of an ordered list of desired hold αmaterial > αenergy . The DA-PA uses the performance
physical properties for the associated part with deadlines [25]. weights and metrics during its decision making to identify
This ordered list representation of the process plan captures the desirable resources and resource actions in the system.
precedence constraints of product features and is often used to 4) Agent Priority: The DA-PA is provided an importance
specify the required processes for a part in the manufacturing value, pr ∈ N, which represents the priority of the associated
system [47]. Physical properties include part locations or part physical part when compared to other parts in the system. An
features, e.g. “part at exit loading dock” or “part has hole importance value of 1 signifies that the part is of the highest
with taper” [13], [47], [52]. Formally, the process plan is priority. Therefore, a part i has a higher priority than part j if
defined as: P lan = (P Pd , Dl), where P Pd = (pp1 , pp2 , ...) pri < prj .
is the ordered list of desired physical properties ppi , and Note that while the process plan, performance weights, and
Dl : P Pd → R≥0 is the latest allowable finish time for a agent priority are all part of the DA-PA’s individual goal, they
property. are linked to the overall performance of the manufacturing
The process plan is developed offline and provided to the system. The process plan is created based on customer and
DA-PA (e.g. [53]). Since the plan is computed offline and, manufacturer specification (including order of desired physical
additionally, to ensure that communication between agents is properties and deadline) [25], [53]. The performance weights
kept in a local neighborhood [14], we assume that the DA-PA can be set by the manufacturer to prioritize certain actions
accomplishes the process plan one physical property at a time. for the DA-PA (e.g. use less energy or material). Finally, the
Therefore, the DA-PA will always look to complete the next agent priority can be set by the needs of the manufacturer and
unfinished property in the process plan. Note that the DA- based on the cost-benefit of the customer order. For example,
PA keeps track of the completed physical properties and can if a customer sends in a highly profitable order with harsh
identify the next unfinished physical property when required. penalties for production delays, the DA-PAs associated with
Future work will focus on encoding the process plan in the order would be given a high priority.
other modeling formalisms, such as finite state machines Future work will look at developing effective methods to
and priced timed automata. Encoding the process plan as integrate higher-level controllers, such as a Manufacturing Ex-
an automata will improve the expressiveness of the process ecution System (MES), with DA-PA performance weights and
plan, allow greater product personalization, and enhance the agent priority parameters. Since these high-level controllers
DA-PA decision making flexibility. For example, an automata have access to a global view of the system, they would be able
representation would allow for encoding partial ordering of to supervise the DA-PAs and tune the performance weights
production steps. Consider three processes p1, p2, and p3. and priorities to formally guarantee that the behavior of the
By defining appropriate guards and invariants, we can specify DA-PAs meet the system-level manufacturing requirements.
that both p1 and p2 to be completed before p3, but no strict
specification on the ordering between p1 and p2. B. Environment Model
2) Exit Plan: The DA-PA is initialized with an exit plan The capabilities and scheduling constraints of the local
that is used if the DA-PA cannot find a sequence of resource manufacturing environment are captured by the DA-PA’s envi-
actions to fulfill the process plan [13]. The DA-PA can exit ronment model [13], [55]. The DA-PA updates its environment
the manufacturing system by calling an exit agent in the exit model as other agents (RAs and PAs) provide the capabilities
plan. For example, an exit agent can be a human agent [54] and scheduling constraints, as described in Section IV-A. The
who can retrieve the part from the manufacturing system. The environment model is defined as the following tuple:
decision of the DA-PA to exit the system is discussed in detail AEM = (X, P rp, x0 , Σ, E, C, Γ, I, Knm , Ksf t , Ag):
in Section IV. X = {x1 , x2 , ...} is the set of states for the associated
3) Performance weights: There are various, measurable physical part
performance metrics, P M = {pm1 , pm2 , ..., pmn }, for re- P rp : X → P P maps states to physical properties
sources on the plant floor, e.g. energy and material cost. These x0 ∈ X is the current state of the part
metrics are tracked and stored by the associated RAs in the Σ = {σ0 , σ1 , ...} is a finite set of events, where each event
system. The RAs share these performance metrics when they represents the start or completion of a resource action (e.g.,
communicate their capabilities with the DA-PA. These metrics a logistic or manufacturing operation)
6
E ⊆ Q × Σ × Q is a finite set of edges each state. Therefore, all of the constraints in the model are
C = {cl , cg } are the local and the global clocks stored in the invariant mapping, I. This mapping links a state
x
Γ : {..., γ1xi , γ2xi , ..., γ1 i+n , ...} is the set of slack variables, to all of its constraints.
xi
where γj ∈ R is the j th slack variable for state xi . All time constraints, B, are logic expressions [58]. There are
I : X → B[val(C), val(Γ)] maps states to their constraints two types of constraints used in the environment model: bound
as a function of the clock and slack variables constraint and interval gap constraint. The bound constraint is
Ag : X ∪ E ∪ B[val(C), val(Γ)] → Agents maps states, a time limit that represents an absolute limit when the part can
edges, and constraints to corresponding agents enter a state or leave a state. The gap constraint represents an
Knm : X ×P M ×val(C) → R≥0 maps states, performance interval when the part cannot be at a certain state. Formally,
metrics, and valuations of clocks to nominal cost values the two constraints are defined as follows:
Ksf t : X × P M × val(Γ) → R≥0 maps states, performance
Definition 4 (Bound Constraint). A bound constraint, Bb , for
metrics, and valuations of slacks to cost-of-constraint-
state, x, is satisfied (i.e. evaluates to true) if and only if a
violation values
clock valuation (either local or global) at the state, val(cxb ),
where X, P rp, x0 , E are the discrete event dynamics, C, Γ, I for cb ∈ C satisfies:
encode the time-based state constraints, Ag is the agent
val(cxb ) ▷◁ tb ± aγ val(γbx
association function, and Knm , Ksf t are the state costs for (1)
the DA-PA. The vectors val(C) and val(Γ) are valuations
of the variables C and Γ for each state in the system. where tb ∈ R≥0 is the time limit for the clock, aγ ∈ [0, 1]
These valuations capture the values of the clocks and slack is the hardness coefficient, γbx is a slack variable associated
constraints for a state x ∈ X at the time when a discrete with the constraints on state x, and ▷◁ is one of the following
transition corresponding to an edge e ∈ E is taken from state logical operators: <, ≤, =, ≥, >.
x to another state. To ensure unique clock and slack valuations, Note that a bound constraint is a hard constraint if aγ = 0.
we enforce the AEM to be acyclic, while noting that cyclic
graphs can be “flattened” into acyclic graphs efficiently [56]. Definition 5 (Interval Gap Constraint). An interval gap con-
AEM is an extension of the PTA modeling formalism pre- straint, Bg , for state x and interval (tl , tu ) is satisfied if and
sented in Section II-D. A description of how the manufacturing only if the local and global clock valuations at the state,
environment is mapped to each component is discussed in the val(cxl ) and val(cxg ), satisfy:
rest of this section. Note that there are several differences when Z1 ∨ Z2 (2a)
AEM is compared to the PTA from Definition 1. As previously
Z1 := val(cxg ) ≤ tl + aγ,l val(γ1x ) (2b)
discussed, the reset operator, R, is left out for brevity. Only
states (not edges) of AEM have constraints and these state Z2 := val(cxg ) − val(cxl ) > tu + aγ,u val(γux ) (2c)
constraints are encoded in the invariant, I. Finally, there are where tl , tu ∈ R≥0 and tu > tl , aγl , aγu ∈ [0, 1] are
a number of extensions to the constraints and costs of AEM , the hardness coefficients, γlx , γux are unique slack variables
which are discussed here. associated with the constraint, and ¬, ∧, ∨ are logic operators
1) Discrete event dynamics: X, P rp, x0 , E represent the
NOT, AND, OR, respectively.
capabilities of the manufacturing system. X are the states of
the physical part in the system. Each state is a combination As defined previously in Section III-B, val(cxg ) indicates the
of several physical properties of the part. Physical properties global clock time when a transition is taken from state x to
are locations and physical composition of the part (e.g. “part another state. Z1 is satisfied when the transition from state x
at machine” or “part feature completed”) [52] or operations is taken before the lower bound, tl . Additionally, cxl is reset to
performed on the part (e.g. “moving the part” or “working 0 after each transition (defined in Section II-D). Thus, val(cxl )
on a manufacturing process”). P rp maps each state of the is the time between the transition to state x and the transition
environment model to one or more of these physical properties. from state x, i.e., the amount of time the part is in state x.
A current state, x0 , is updated when an RA informs the DA-PA Therefore, val(cxg ) − val(cxl ) represents the global clock time
that it has started or finished a resource action. Each event, Σ, when a transition is taken to state x from another state. Z2 is
represents an instantaneous (with delay 0) start or completion satisfied when the transition to state x is taken after the upper
of a manufacturing or logistic operation [57]. bound, tu . Z1 ∨ Z2 is satisfied when the transition is taken
2) Time-based Constraints: The state constraints, B, limit from x before the interval or a transition is taken to x after
the amount of time that the associated part can be in the the interval, i.e., the part is not in the state at that interval.
state (e.g. how long the part can stay at a location). These Note that the constraints should encode some of the stochas-
constraints are split into hard and soft constraints, B = tic properties of the system. For example, for manufacturing
H[val(C)] ∪ S[val(C), val(Γ)]. Hard constraints, H[val(C)], operations, the maximum operating time (including any uncer-
cannot be violated by the DA-PA (e.g. minimum time required tainties) can used to create the bound constraint for the system.
to complete a manufacturing operation). Soft constraints, An example is provided in Section III-C.
S[val(C), val(Γ)], contain slack variables, Γ. As described 3) Transitions: The objective of the DA-PA’s decision
in Section IV-C, these soft constraints may be violated by the making is to find clock and slack variable valuations that
DA-PA through negotiation with other agents in the system. satisfy all of the state constraints in the system. To satisfy these
Note that there is no limit for the number of constraints at constraints, the DA-PA must choose a sequence of discrete
7
transitions from E that move the associated part from one state obtaining the nominal cost parameters for each state. While
x to a new one x′ and delay transitions that keep the part at the methods to develop and update the necessary RA models
the same state x but take a finite time τ ∈ R. The definitions are out of the scope of this paper, we envision that existing
of these two transitions were provided in Section II. Detail modeling frameworks, e.g., [14], [60], can be extended to
about how a DA-PA chooses these transitions is provided in calculate Anm , bnm . For the soft constraint violation cost, the
Section IV-B. agent associated with each soft constraint, agsc , provides the
aksf t , bksf t to the DA-PA. aksf t , bksf t represent how undesirable
Remark 1. It is always possible to denote delay and discrete
the constraint violation would be to agsc , in terms of a per-
transitions in alternating sequence since delay transitions are
formance metric k. Currently, the cost of constraint violations
additive and we can define a zero time delay transition between
is determined using feedback from an operator or a subject
two consecutive discrete transitions.
matter expert. Future work will develop methods to effectively
4) Agent association function: The DA-PA builds the en- estimate these soft constraint violation costs for both PAs and
vironment model by requesting capabilities from the RAs in RAs. To ensure desirable performance of other agents, the DA-
the system [10], [13]. If the RAs provide their capabilities as PA will minimize the cost of constraint violations ensuring that
PTA-based models, the DA-PA can put together these models constraints are only violated if necessary (see Section IV-B).
to create the proposed environment model [59]. However, to
enable the proposed cooperation framework, the DA-PA must
C. Decision Making Model
keep track of the agents associated with the components in the
environment model. A decision making model is the following tuple:
The agent association function, Ag, maps states, edges, and ADM = (X, P rp, x0 , Σ, E, C, Γ, I, Ag, Xm , K), where
constraints to an associated agent. The states can be mapped (X, P rp, x0 , Σ, E, C, Γ, I, Ag) are obtained from the envi-
to one or more agents. If a state is mapped to more than one ronment model, Xm ∈ X is a set of marked states, and
agent, it is a shared state [14]. The presence of these shared K : X × val(C) × val(Γ) → R≥0 denotes the state costs.
states allows for the modular composition of the environment The set of marked states, Xm , represents desired states for
model. The edges are mapped to a single agent so that the the DA-PA [45]. A discussion of how the decision making
DA-PA can identify which agent to contact when requesting model is put together is presented in Section IV-A.
a discrete transition. Similarly, constraints are mapped to a Figure 2 shows an example of a decision making model for
single agent, representing which agent is responsible for the a DA-PA in a manufacturing system composed of 2 robots
scheduling constraint in the system. (R1 and R2), 2 machines (M1 and M2), and 2 buffers (B1
5) State costs: The nominal cost, Knm pi
(xj ), for perfor- and B2). There are 4 RAs in the system, one for each robot
mance metric pi is the cost for a part to stay at a state with and machine. The buffers are used by the robots to pick and
respect to the corresponding metric (pi ). Formally, the nominal place parts in the system and do not have an associated agent
cost for performance metric pi for a single state xj is: in this example. B1 is used by R1 and B2 is used by both
R1 and R2. The machines complete a manufacturing process
pi
Knm (xj ) = Anm · val(C x ) + bnm (3) (P1) required as part of the DA-PA’s process plan. This initial
state, x0 , and marked states, Xm are shown in Figure 2b.
where Anm ∈ R2≥0 , bnm ∈ R≥0 , and val(C x ) =
Note that in Figure 2b the initial state represents the current
[val(cxg ), val(cxl )]T are the valuations of the clocks for state x. location of the part associated with DA-PA and the marked
pi
Similarly, the soft constraint violation cost, Ksf t (xj ), is the states represent the next desired process based on the DA-
cost for the part to stay at a state if there is a constraint PA’s process plan. The dashed rectangles outline which states
violation. The soft constraint violation cost for performance and events are associated with a specific agent in the agent
metric pi for a state is: association function, e.g. states x1 , x2 , x3 and events σ1 , σ2
m are associated with the R1 agent.
x x
pi
X
Ksf t (xj ) = aksf t val(γk j ) + bksf t δ(val(γk j )) (4) Table II shows the physical properties, constraints, and costs
k=1 for each of the states in the example decision making model.
where m = |I(xj )| is the number of constraints for the state, All of the constraints (i.e., invariants) in Table II are bound
x
γk j is the slack variable associated with the k th constraint on constraints (see Definition 4). These constraints represent both
state xj , aksf t , bksf t ∈ R≥0 , ∀1 ≤ k ≤ m, and δ(·) denotes (1) desirable, but flexible, operating limits (soft constraints)
an indicator operator with 1 if the argument is nonzero and 0 and (2) physical/safety time-based limits (hard constraints)
otherwise. Note that if aksf t = 0 and bksf t = 0, then there is no for the resources in this example. For example, state x2
penalty for violating constraint k. Similarly, if the associated contains both a soft and a hard constraint. The soft constraint,
valuation of the slack variable is 0, then there is no violation val(cxl ) > tsf t x
R1,B2 − val(γl ), represents the desired operating
x x sf t
cost added, i.e. if val(γk j ) = 0, then δ(val(γk j )) = 0 and time limit, tR1,B2 , for R1 to take the part from B1 to B2.
xj k xj
val(γk ) + bsf t δ(val(γk )) = 0. Thus, the sum (4) is a If necessary, R1 can perform the material handling operation
summation of slack costs for the nonzero slack valuations. faster (i.e., violate the soft constraint) if the value of the slack
For each state, RAs use their own internal data-driven and variable is greater than 0 for that state, val(γlx ) > 0. However,
physics-based models to estimate the nominal cost parameters. as discussed later in this section, there is an energy penalty for
PAs obtain and update Anm , bnm after querying RAs and this constraint operation. In addition to the soft constraint, a
8
M1
B1 R1 B2 R2
M2
Material Handling Robot Robot Trajectory Marked State Initial State Indicator
(a) (b)
Fig. 2: (a) An overview of a system with 2 robots (R1, R2), 2 machines (M1, M2), 2 buffers (B1, B2). (b) A visualization
of the DA-PA’s decision making model for this system. The system has 4 resource agents representing R1, R2, M1, and M2.
R1 moves the part from B1 (x1 ) to B2 (x3 ). R2 moves the part from B2 (x3 ) to M1 (x6 ) or M2 (x9 ). M1 and M2 complete
manufacturing process 1 (P1), represented by x8 and x11 . The part is at B1 and needs P1 completed (x8 or x11 ) due to its
process plan. Note that x2 , x4 , x5 represent states when the part is moving and x7 , x10 represent states when the part is
undergoing a physical change via a manufacturing process. See Tables II and III for more information about states and events.
TABLE II: State properties, invariants, and costs for Fig. 2. TABLE III: Event descriptions for Fig. 2.
State, Properties, Event, (σ) Event Name Event, (σ) Event Name
Invariant, I(x) Cost, K(x)
(x) P rp(x)
Start moving to Finish moving to
σ1 , σ3 , σ5 σ2 , σ4 , σ6
x1 At B1 None αt Kt (x) B2/M1/M2 B1/M1/M2
sf t t Start P1 at Finish P1 at
Moving val(cx x
l ) > tR1,B2 − val(γl ) αt K (x)+ σ7 , σ9 σ8 , σ10
x2 nrg nrg
M1/M2 M1/M2
to B2 val(cxl ) > t nm
R1,B2 α e K nm (x) +Ksf t (x)
x3 / At B2/
None αt Kt (x)
but does not necessarily hold for all parameters and tasks.
x6 /x9 M1/M2 Assuming that the speed of operation and the energy cost are
x4 /
Moving
nrg
inversely proportional for Robot 2, the energy cost for moving
to val(cx nm
l ) > tR2,M 1/M 2 αt Kt (x) + αe Knm (x) to B2 is evaluated to the following: Knm nrg
(x) = cnm · val(cxl )
x5
M1/M2 nrg x
and Ksf t (x) = csf t · val(γl ), where cnm and csf t are energy
x7 / At M1/M2 nrg
val(cx nm
l ) > tM 1/M 2,P 1 αt Kt (x) + αe Knm
e (x)
usage coefficients. The penalty term, Ksf t (x), represents the
x10 P1 Working
extra energy required by the RA to accomplish this operation
x8 / At M1/M2
val(cx
g ) < td αt Kt (x) if it has to operate faster than its desired operating time limit.
x11 P1 Finished
Note that csf t > cnm to ensure that there is a penalty for
hard constraint for state x2 , val(cxl ) > tnm violating the soft constraint. Future work will test the proposed
R1,B2 , represents the
absolute minimum time, tnm , required by R1 to take the approach when there are more complex relationships between
R1,B2
part from B1 to B2. Note that tnm the speed and the energy cost of accomplishing a task, such
R1,B2 is the upper bound of the
time that it takes for R1 to perform this action. In this example, as described in [61], [62].
tnm Table III provides information about the events in the
R1,B2 is a guaranteed time limit for the robot to finish moving
a part from B1 to B2, even with stochastic uncertainties for system. Note that each event is associated with a single agent
the action. The constraints for states x4 , x5 , x7 , and x10 also and the DA-PA has to (1) schedule time at the resource, and
represent similar physical limits for corresponding resource (2) requests these events at a scheduled time. For example, the
operations. Furthermore, there is a hard constraint on x8 and DA-PA can send a request to the M1 agent to schedule the use
x11 , val(cxg ) < td , that represents a customer deadline of time of M1 at a specific time. Then, once the associated part arrives
t − d to complete the manufacturing process. at M1 (state x6 ), the DA-PA can send a request to start the
An overview of the costs is also provided in Table II. The P1 manufacturing process (event σ7 , “Start P1 at M1”) to the
state cost is a combination of two metrics: processing time and machine, thus transition the part to the next state, x7 . Further
energy expenditure. The processing time cost is the amount details on this process are provided in Section IV-D.
of time spent in that state: K t (x) = val(cxl ). The energy
expenditure cost is provided by the RAs and represents the IV. C OOPERATION F RAMEWORK
energy used by the RA for the part to stay at the state. For the The DA-PA uses the goals, environment model, and decision
purpose of simplification, we assume that RAs can save energy making model to find a new path, or sequence of resource ac-
when they take a longer time to accomplish tasks. Note that tions, in the system after receiving an update to its environment
this relationship can be true in specific regions of operation model. An update to the DA-PA’s environment model occurs
for both industrial robots [61] and manufacturing tools [62], either (1) when the DA-PA is provided information about
9
Receive update/request a recommendation or help the part exit the system [13].
from PA or RA
A. Model Creation
Update the environment model The DA-PA goes into the model creation phase after receiv-
Model ing an update for its environment model. In this phase, the
Creation Create decision making model DA-PA updates the environment model to capture the current
capabilities and constraints of the environment. Then, the DA-
PA creates the decision making model used to find the next
Solve optimization problem
sequence of resource actions.
1) Update the environment model: The DA-PA’s initial step
No in the model creation phase is to update the environment
Path found?
model using the information provided to the DA-PA by
Path Planning Yes another PA or RA in the system. Other agents can send
a message to the DA-PA to update any component of the
Soft Constraint No
Violation(s)?
AEM , i.e. (X, P rp, x0 , E, T r, C, Γ, I, Knm , Ksf t , Ag). These
updates represent changes in the manufacturing environment,
Yes e.g. new resources, updated scheduling constraints, changes
to the associated part’s current state, etc. While the commu-
Send constraint violation
request to agent(s)
nication and messaging protocols are out of scope of this
paper, examples of the modular composition of an environment
model can be found in [10], [14], [59].
No All violation(s) 2) Create the decision making model: The DA-PA com-
authorized?
bines AEM and its goals to create the decision making model,
Coordination
Yes ADM , by taking the following steps. To create ADM , the DA-
Send constraint violation PA computes a set of marked states and updates the constraints
confirmation to agent(s) and costs of AEM .
The DA-PA computes the set of marked states, Xm by
Scheduling obtaining the next unfinished physical property, ppd , from
Schedule Path not
its process plan. The DA-PA marks a state (i.e. adds the
path found state to the set of marked states in ADM ) if ppd matches
the physical property of a state in the environment model:
Fig. 3: An overview of the direct, active cooperation frame- ppd ∈ P rp(xi ), ∀xi ∈ Xm . Deadlines are incorporated in
work. The framework includes 4 steps: model creation, path ADM by adding a hard constraint, val(cg ) < Dl(ppd ), to the
planning, coordination, and scheduling. The gray boxes indi- constraints of the marked states, I(xi ), ∀xi ∈ Xm .
cate when the DA-PA communicates with other agents. To account for the priority of other the parts in the system,
the DA-PA updates the soft constraint violation costs from
the environment during initialization (2) another agent con- AEM before incorporating these costs into ADM . Let the
tacts the DA-PA with new information due to an unexpected DA-PA be denoted as ac and its priority as prac . Since
disturbance or (3) when RAs reply to a PA query for new each constraint is mapped to an agent with Ag, the DA-
information about the environment (e.g. [13]). Direct, active PA obtains a constraint’s associated agent and the agent’s
cooperation is used by the DA-PA to find paths in the system priority, ai , prai = Ag(Bi ). Soft constraints with a higher
that allow all of the agents to meet their goals, if possible. priority DA-PA (i.e. prai < prac ) are hardened by setting
Once a path is found, the DA-PA will start to schedule events the corresponding aγ = 0 (see Eq. 1). For soft constraints
with the RAs in the system. If a path is not found, the DA-PA with a lower priority DA-PA (i.e. prai > prac ), the penalty
will contact an RA to exit the system [13]. for constraint violation is removed by setting aksf t = 0 and
After the DA-PA receives an update to the environment bksf t = 0 (see Eq. 4). Soft constraints with an equal priority
model, the DA-PA goes through the 4 steps in the cooper- DA-PA (i.e. prai = prac ) are not changed.
ation framework: (1) model creation, (2) path planning, (3) The state cost, K(xj ), ∀xj ∈ X is a function that depends
coordination, and (4) scheduling. A high-level overview of the on the valuation of soft constraint violations and the clock
cooperation framework and the steps are shown in Figure 3 valuations. The DA-PA computes the state cost by weighing
and described in detail in this section. Once cooperation is the state cost for each metric in AEM , p1 , p2 , ..., pn , with its
finished, the DA-PA executes the plan developed during the performance weights, αp1 , αpi , ..., αpn . Formally, the cost for
scheduling phase through requests to the corresponding RAs. each state is defined as:
Note that if the DA-PA cannot find, schedule, or execute a Xn
pi pi
path, the DA-PA will try to re-build the environment model and K(xj ) = αpi Knm (xj ) + Ksf t (xj ) , (5)
retry each of these steps. If a DA-PA does not find a solution i=1
pi pi
within a certain number of iterations or within a bounded time, where Knm (xj ) and Ksf t (xj ) are defined in Section III-B.
pi
it will take the exit plan described in Section III-A to provide Note that Ksf t (xj ) is a summation of slack costs for nonzero
10
slack valuations. Thus, the state cost is dependant on the soft sequence defined by constraint (7b). Constraint (7c) ensures
constraint violation cost only if there is a constraint violation, that the path is in the language of ADM , ensuring that all
i.e., nonzero slack variable valuation. the constraints in ADM are satisfied and all the transitions
are well-defined. Constraint (7d) defines an upper bound on
B. Path Planning the number of constraint violations so the solution sequence
s∗ can violate at most ζ constraints. Note that we recover
After building the decision making model, the DA-PA solves
the total number of constraint violations in the optimal so-
an optimization problem to find the path on the decision
lution using #(ΠΓ (ϕ(s∗ ))). Constraint (7e) ensures that at
making model. A path on the decision making model PTA
least one of the marked states is in the solution path, i.e.,
is a sequence of transitions that will take the associated part
x ∈ ΠX (ϕ(s∗ )), x ∈ Xm ⊂ ADM . The choice of N in (7)
from its current state to a marked state in the decision making
depends on the computational time requirements of the DA-
model, while satisfying the constraints in the system.
PA. If there are no specific computation time requirements, the
Definition 6 (Path). A path s on a PTA is an ordered set DA-PA can use longest path length of the PTA to guarantee
of discrete and delay transitions in which the post-transition feasibility of the problem (see [63] for different choices of N
state of each transition is equal to the pre-transition state of in a receding horizon setting). In Section V-D, we show that
the subsequent transition. Thus a path can be notated as solving (7) by using the longest path of the PTA for N is
D E justified for a PTA with 50 states and up to 104 constraints.
s = td,τ1 ts,e1 td,τ2 ts,e2 ... td,τn ts,em (6) If the number of constraint violations, #(ΠΓ (ϕ(s∗ ))), is
and the path’s total cost is given by the sum of the costs for zero, then the DA-PA can schedule the obtained sequence of
all discrete and delay transitions in the path. actions without the need to negotiate over conflicting events,
as described in Section IV-D. If a solution contains a nonzero
As stated in Remark 1, a path will always start with a delay number of constraint violations, the DA-PA has to coordinate
transition, alternate between discrete and delay transitions, and with other agents to find a solution to the conflicting events
end with a discrete transition so that n = m in (6). prior to scheduling actions. More detail about coordination
Let ξi = (xi , ci , γi ) denote a shorthand for the state, clock is discussed in the following section. Finally, if a solution
values, and slack values respectively at the ith transition of to the above problem cannot be obtained or this optimization
the path s, s(i). Note that s(i) can be either a delay or a problem is not solved without a bounded time, the DA-PA will
discrete transition. To pose the path planning problem as an move to the scheduling step without a new plan. As described
optimization problem, we additionally define in Section IV-D, the lack of a new plan will force the DA-PA
D s(1) s(2) s(N )
E to either find a new plan or call an exit agent.
ϕ(s, ξ1 ) = ξ1 −−→ ξ2 , ξ2 −−→ ξ3 , . . . , ξN −−−→ ξN +1
𝒂𝒈𝑫𝑨#𝑷𝑨 𝒂𝒈𝒗 a solution was not found in a bounded time, the DA-PA can
try to find a sub-optimal optimal path for the decision-making
Model model. The methodology to find this sub-optimal path is not in
Creation the scope of this paper and will be considered in future work.
Path Another method that the DA-PA can use to find a new plan is
Planning
alt to try to capture some changes in the manufacturing system by
Soft Violation Request Violation re-exploring and re-building the environment model [14]. This
Constraint Deliberation method allows the DA-PA to address various disturbances,
Violations? Coordination
Respond to Request
faults, and failures in the manufacutring system [64]. Finally,
if a DA-PA is not able to find a solution within a bounded
alt
time or a bounded number of iterations, it will have to exit
Confirm (Adjust) the system via the exit plan by calling an exit agent provided
Authorized?
Violations Schedule
in its exit plan (described in Section III-A).
To:
Schedule E. Execution
RAs
to common conflict resolution approaches in existing multi- TABLE IV: Time limits for resource actions in case study 1:
agent systems for manufacturing that use a supervisor agent small manufacturing system simulation
to resolve conflicts, e.g., [6], [35]. One of the major benefits is Resource Program Time (ticks)
that it is not necessary to combine multiple, distinct, models Robot 1 Move to Buffer 2 15
from different PAs and RAs during cooperation. These various Robot 1 Move to Buffer 2 (Fast) 5
models may be difficult to combine since they may have been Robot 2 Move to Machine 1 17
gathered at different times and may represent different sets of Robot 2 Move to Machine 2 17
resources in the manufacturing system [14]. A second benefit Machine 1 Process 1 100
of this approach is that the optimization will scale reasonably Machine 2 Process 1 150
with more PAs in the system, as shown in Section V-D. Finally,
FSM class described in [13] was developed. This custom PTA
the DA-PA retains a lot of the information internally since
class was used to encode the DA-PA environment model, the
only violated constraints are shared with other agents. If a
DA-PA decision making model, and the capability model used
manufacturing system contains a large number of different PAs
by the RAs when providing capability and scheduling infor-
that represent various parts from different customers, it may be
mation during exploration (see [14] for a detailed explanation
highly beneficial and preferable for a customer to ensure that
of the exploration process). The communication framework
all of their process plans, models, and performance weights
native to Repast Simphony was used to encode messages and
are not shared with other agents.
enable cooperation between agents.
A Cost Optimal Reachability Analysis (CORA) solver was
V. C ASE S TUDIES developed to solve the optimization in (7). CORA has been
This section presents how direct, active cooperation can be previously used to find cost-optimal paths for PTA models.
used to improve the performance of manufacturing systems Two software tools that have been used to solve CORA for
using simulations of the manufacturing environment. The PTAs are UPPAAL CORA [51], [66] and satisfiability modulo
first case study describes how the cooperation framework is theories (SMT) solvers [67], [68]. A custom implementation
applied to the system introduced in Section III-C. The second of the z3 SMT solver was used to encode soft constraints
case study shows how the DA-PA can be used to improve and find paths in the DA-PA’s decision making model. A
the flexibility of a larger manufacturing environment. The detailed overview and description of the solver implementation
scalability study analyzes how the developed optimization is found in [63]. For this implementation, the dynamics and
methodology can be scaled to larger manufacturing systems. constraints of the decision making model were transformed
The case studies provided in this section overview how into first-order logic expressions. The solver output was the
the system controlled by a multi-agent system responds to a optimal path for the DA-PA and a set of constraint violations
new, unexpected high priority order. The benefit of using DA- used in the cooperation framework. JavaScript Object Notation
PAs for this specific unexpected disturbance is then discussed. (JSON) was as an interface between the Repast Simphony
Note that it is possible for the DA-PA to respond to other simulation and the z3 SMT solver.
unexpected disturbances and faults in the system, as detailed
in the supplementary material [64]. B. Case Study 1: Small Manufacturing System
A simulation of the manufacturing system introduced in
A. Simulation setup Section III-C is used to analyze the behaviour of DA-PAs in a
The Repast Simphony agent-based simulation platform [65] small manufacturing system environment. The system contains
was used to test and analyze the behavior of a DA-PA in a two robots (R1 and R2), two machines (M1 and M2), and two
manufacturing environment. This agent-based simulation soft- buffers (B1 and B2). New parts entering the system start at
ware has been previously used to test agent-based controllers buffer 1 (B1) and, once the part enters the system, a new DA-
for manufacturing systems [13], [34]. Repast Simphony is a PA is initialized. During initialization, the DA-PA is provided
discrete-time simulation platform, as the simulation is updated the process plan and deadlines, the exit plan, the performance
every time step (also known as every tick). weights, and the priority for the physical properties in the
The developed manufacturing simulation contains high- process plan.
level discrete-event dynamics for material handling robots and In this example, the process plan contains one physical
machines used for manufacturing. The proposed cooperation property: complete process 1 (P1). The exit plan is an agent
framework extended the simulation of the model-based PAs that can take the part associated with the DA-PA out of the
described in [13]. In this updated simulation, each PA builds system. For this example, the DA-PAs attempts to minimize
and uses a finite state machine (FSM) model of a local both the time and energy expenditure in the system with
neighborhood of the manufacturing system to make decisions. αt = 1 and αe = 0.5, respectively. To test the DA-PA
The PAs build the FSM by exploring the manufacturing system cooperation framework, various priority values and deadlines
through queries to the RAs in the system, as described in [14]. were provided to the DA-PAs during initialization, as detailed
Each RA provides an FSM that corresponds to its individual in the following examples.
capabilities and the DA-PA combines this information into a DA-PAs use the exploration methodology described in Sec-
PTA that represents the capabilities of the local manufacturing tion V-A to send requests to RAs in the system to start the co-
environment [13], [14]. A custom PTA class that extends the operation process described in Section IV. For example, a PTA
13
2
an importance value of 2. The optimal path for agpa does not
2
have any constraint violations. Thus, agpa schedules resource
actions to take the associated part to the slower machine, M2,
1
to complete its process plan. The flow times for the agpa and
Fig. 5: Set-up for case study 2: semiconductor manufacturing. 2
agpa to complete their process plans are 141 and 197 ticks,
that is created by a DA-PA for a part in the system without any respectively.
work-in-progress is shown in Figure 2 and described in detail 2) DA-PA to DA-PA cooperation with order prioritization:
in Section III-C. The time limits for the resource actions in the For this example, the manufacturer prioritizes the completion
2
simulation are shown in Table IV. The time limits in Table IV of the part associated with agpa over other parts in the system.
1 2
are used to build the constraints in the decision making model Both agpa and agpa are set-up in the same way as in the
2
shown in Table II from Section III-C. Note that for robot 1, previous example. However, agpa is given an importance value
1
the energy expenditure is much greater if the soft constraint is of 1, making it higher priority than agpa . As in the previous
1
violated, i.e. if robot 1 has to move the part to B2 faster than example, agpa initially chooses to go to M1 to complete the
1
15 ticks. Thus, cnm = 10 kWh and csf t = 100 kwH are the process plan. After 5 ticks (when the part associated with agpa
nrg
costs for staying at state x2 , Knm (x2 ) = cnm · val(cxl 2 ) and is using Robot 1), the more important part enters the system
nrg x2 2
Ksf t (x2 ) = csf t · val(γl ). To conform with the simulation and agpa is initialized. By following the proposed cooperation
2 1 1
environment, we require the part to stay in every state for at framework, agpa requests agpa to reschedule its plan. agpa
least one tick. Therefore, an additional constraint, val(cxl ) ≥ 1, authorizes the request after finding a feasible path that takes
is added for each state in the decision making model, x ∈ X. the associated part to M2 to complete P1. In this scenario,
1 2
The following three examples illustrate the benefits of direct, the flow times for the agpa and agpa are 221 and 145 ticks,
active cooperation. respectively. Compared to the first example, the process plan
2
1) Non-cooperative PAs: This example illustrates the sce- of agpa was completed faster, even though the total flow time
nario when two DA-PAs do not need to cooperate with each of the two parts is longer than the first example. Note that, even
other to complete their individual goals and satisfy the system with the increased flow time, both agents still accomplished
objectives. In this example, a part comes into the system with their process plan successfully.
1 3) DA-PA to DA-PA cooperation to meet deadlines: Both
no deadline provided for P1. A DA-PA, agpa , is created with
1 2
an importance value of 2. The DA-builds the decision making agpa and agpa are set-up in the same way as Example 1.
model shown in Figure 2 and calculates its optimal path by Unlike example 2, the priority of both parts is the same for this
2
solving (7). The solver described in the previous section is example. However, agpa is given a deadline, or hard constraint,
1
used to find a cost-optimal path in the system. Then, after of 180 ticks to complete P1. As in examples 1 and 2, agpa
1 initially chooses to go to M1 to complete the process plan.
solving the optimization problem, agpa schedules the path
2
that consists of delay and discrete transitions to complete However, when the part associated with agpa comes into the
2 1
P1 at M1. Examples of delay transitions that a DA-PA can system, agpa requests the utilization of M1 from agpa . This
1
request include “stay at B1 for 1 tick” from buffer 1 agent, request is authorized by agpa after it finds another feasible
“stay moving to B2 for 15 ticks“ from the robot 1 agent, path in the system (complete P1 at M2). The flow times for
1 2
or “work on process 1 for 100 ticks” from the machine 1 agpa and agpa are 221 and 145 ticks, respectively. The process
agent. Examples of discrete transitions that a DA-PA can plans for both DA-PAs, including the deadline constraint of
2
request include “start moving to B2/M1/M2“ from the robot 180 ticks for agpa , are accomplished successfully due to the
agents or “start p1 at M1/M2” from the machine agents. These cooperation between the two DA-PAs.
discrete, delay transitions allow the DA-PA to communicate 4) Example 4 - DA-PA to RA cooperation to meet a dead-
with RAs, enabling the DA-PA to make dynamic decisions line: Only one DA-PA is considered for this example. In this
for the associated part and accomplish the desired objectives example, a part comes into the system and DA-PA is initialized
1
in the manufacturing system. with a deadline of 135 ticks. To accomplish this task, agpa
A second part comes into the system 10 ticks after the first cooperates with the robot 1 (R1) RA and requests to move to
part. This is an unexpected disturbance for the system that B2 fast (in 5 ticks), violating R1’s soft constraint. Since this
1 1
was not captured by agpa during its initial model creation will ensure that agpa meets its deadline, the RA confirms this
1
and, therefore, not considered by agpa during path planning. violation at the expense of a higher energy expenditure. In this
2 1
The second DA-PA, agpa , does not have deadlines and has scenario, the flow time for agpa is 135 ticks and the DA-PA
14
D. Scalability Study
We perform a computational study to illustrate the scala-
bility of the proposed optimization approach for solving (7).
Fig. 8: Distribution of individual part flow times for order B In this study, we utilize the PTA shown in Figure 6, which
in the semiconductor manufacturing case study. is created by the DA-PA associated with the first part in
Twenty five trials were conducted to test the effect of direct Order A. We study the case with a fixed number of RAs and
and active PA to PA cooperation. Each set of five trials had an increasing number of DA-PAs in the system. To simulate
a different number of order B DA-PAs that would request the effect of increasing DA-PAs, we implement an increasing
violations from order A DA-PAs. The number of order B DA- number of interval gap constraints (see (2)) and solve the
PAs that request violations was 0, 2, 4, 6, 8, and 10 for each optimization problem outlined in (7).
set of five trials, respectively. The results of the scaling study are presented in Figure 9.
For every trial, the average flow time of the 10 parts of The simulation runs are repeated 10 times for each number
order A and order B was calculated. Figure 7 shows the of constraints shown on the horizontal axis of Figure 9. All
distribution for the average flow time of orders A and B based simulations are performed on a personal laptop computer with
on the number of cooperative products in order B. Each data Intel Core i7-7700HQ CPU. The longest path of the PTA
point in Figure 7 represents the minimum, maximum, and in Figure 6 includes 14 discrete transitions and a horizon
mean of the average flow time for 5 trials. Note that as more length corresponding to 16 discrete transitions is used in all
cooperation is allowed for order B DA-PAs, the average flow simulations. The horizon of 16 discrete transitions corresponds
time of these parts decreases. However, as order B DA-PAs to N = 32 for (7). The nominal optimization problem without
utilize the resources required by order A DA-PAs, the average the injected interval gap constraints has 52 bound constraints
flow time of order A parts increases. Thus, direct and active in total, which takes 0.73 ± 0.22 seconds to compute. The
cooperation will allow higher priority parts to be completed results show that the number of constraints has a pronounced
quicker, but slow down the production of lower priority parts. effect on the computation time only after approximately 5000
Figure 8 shows the resulting flow times for order B parts for interval gap constraints, which results in over 3 seconds of
all of the trials. Note that Figure 8 represents the distribution computation time to solve (7). The computation time remains
of flow time for individual order B parts, while Figure 7 well below 30 seconds for more than 105 constraints. Each
represents the distribution of average flow times for orders DA-PA typically requests 10-50 constraints when scheduling
A and B in the system. On average, order B DA-PAs with events in the semiconductor manufacturing case study de-
cooperation capabilities had a flow time of 1238 ticks. Order scribed in Section V-C. Therefore, 105 interval constraints
B DA-PAs that did not cooperate had a flow time of 1354 would correspond to somewhere between 103 to 104 DA-
ticks. Thus, the DA-PAs that were able to cooperate reduced PAs in the system. At these scales, both communication
16
and physical bottlenecks may be more pronounced than the [5] A. M. Farid and L. Ribeiro, “An Axiomatic Design of a Multiagent
computational bottleneck caused by solving (7). Reconfigurable Mechatronic System Architecture,” IEEE Transactions
on Industrial Informatics, vol. 11, no. 5, pp. 1142–1155, 2015.
[6] N. K. C. C. Krothapalli and A. V. V. Deshmukh, “Design of negotiation
protocols for multi-agent manufacturing systems,” International Journal
VI. C ONCLUSIONS AND F UTURE W ORK of Production Research, vol. 37, no. 7, pp. 1601–1624, 1999.
[7] T. N. Wong, C. W. Leung, K. L. Mak, and R. Y. Fung, “Dynamic
Multi-agent control has been proposed to improve the flexi- shopfloor scheduling in multi-agent manufacturing systems,” Expert
bility of manufacturing systems. One of the primary agents in Systems with Applications, vol. 31, no. 3, pp. 486–494, 2006.
this control strategy is the product agent. The product agent [8] P. Leitao, A. W. Colombo, and F. Restivo, “An Approach to the Formal
Specification of Holonic Control Systems,” in First International Con-
must cooperate with other product agents and resource agents ference on Industrial Applications of Holonic and Multi-Agent Systems,
in the system when making scheduling and planning decisions. 2003, pp. 59–70.
However, existing product agent cooperation strategies have [9] J. Barbosa, P. Leitão, E. Adam, and D. Trentesaux, “Dynamic self-
organization in holonic multi-agent manufacturing systems: The ADA-
been primarily passive, reducing the flexibility and adaptability COR evolution,” Computers in Industry, vol. 66, pp. 99–111, 2015.
of the product agent and, in turn, of this control strategy. [10] C. Legat, D. Schütz, and B. Vogel-Heuser, “Automatic generation of
A model-based decision making framework to enable direct, field control strategies for supporting (re-)engineering of manufacturing
systems,” Journal of Intelligent Manufacturing, vol. 25, no. 5, pp. 1101–
actively cooperative product agents is introduced. For this 1111, 2014.
framework, a new environment model that leverages the priced [11] M. K. Lim, Z. Zhang, and W. Goh, “An iterative agent bidding
timed automata modeling formalism is proposed. This model mechanism for responsive manufacturing,” Engineering Applications of
Artificial Intelligence, vol. 22, no. 7, pp. 1068–1079, 2009.
explicitly represents the scheduling constraints in the system [12] S. Rehberger, L. Spreiter, and B. Vogel-Heuser, “An agent-based ap-
and separates the desirable, but flexible, time limits (soft proach for dependable planning of production sequences in automated
constraints) from the physical/safety time limits (hard con- production systems,” At-Automatisierungstechnik, vol. 65, no. 11, pp.
766–778, 2017.
straints). An optimization problem is formulated that allows [13] I. Kovalenko, D. Tilbury, and K. Barton, “The model-based product
the direct and active product agent to identify any conflicting agent: A control oriented architecture for intelligent products in multi-
scheduling constraints. A strategy for product agent coordina- agent manufacturing systems,” Control Engineering Practice, vol. 86,
no. March, pp. 105–117, may 2019.
tion and negotiation with other agents in the system is also [14] I. Kovalenko, D. Ryashentseva, B. Vogel-Heuser, D. Tilbury, and
presented. This strategy resolves the conflicts in the system, K. Barton, “Dynamic Resource Task Negotiation to Enable Product
allowing the agents to effectively control and coordinate the Agent Exploration in Multi-Agent Manufacturing Systems,” Robotics
and Automation Letters, vol. 4, no. 3, pp. 2854–2861, jun 2019.
components on the factory floor. This framework is tested in [15] K. Barton, F. Maturana, and D. Tilbury, “Closing the Loop in IoT-
a simulated manufacturing environment, showing how direct, enabled Manufacturing Systems: Challenges and Opportunities,” in
active cooperation can improve the flexibility and performance ACC. IEEE, 2018, pp. 5503–5509.
[16] F. Lopez, Y. Shao, Z. M. Mao, J. Moyne, K. Barton, and D. Tilbury,
of manufacturing systems. “A software-defined framework for the integrated management of smart
Future work will include the following: developing and manufacturing systems,” Manufacturing Letters, vol. 15, pp. 18–21,
incorporating new process plan formalisms to capture a larger 2018.
[17] J. Ota, “Multi-agent robot systems as distributed autonomous systems,”
variety of production orders; creating methods to effectively Advanced engineering informatics, vol. 20, no. 1, pp. 59–70, 2006.
estimate soft constraint violations costs for product and re- [18] J. A. Fax and R. M. Murray, “Information flow and cooperative control
source agents; developing algorithms for agent violation de- of vehicle formations,” IEEE transactions on automatic control, vol. 49,
no. 9, pp. 1465–1476, 2004.
liberation after a violation request; and integrating further [19] S. D. J. McArthur, E. M. Davidson, V. M. Catterson, A. L. Dimeas,
time-based specifications in the product agent process plan. N. D. Hatziargyriou, F. Ponci, and T. Funabashi, “Multi-Agent Systems
Further simulation testing of the proposed methodology will for Power Engineering Applications-Part I: Concepts, Approaches, and
Technical Challenges,” IEEE Transactions on Power Systems, vol. 22,
include incorporating hard constraints for larger manufacturing no. 4, pp. 1743–1752, 2007.
systems, varying the number of allowable constraint violations, [20] A. Kantamneni, L. E. Brown, G. Parker, and W. W. Weaver, “Survey of
and increasing the number of different priority orders in the multi-agent systems for microgrid control,” Engineering applications of
artificial intelligence, vol. 45, pp. 192–203, 2015.
system. The direct, actively cooperating product agent will [21] J. Du, V. Sugumaran, and B. Gao, “Rfid and multi-agent based architec-
also be integrated into a physical manufacturing testbed that ture for information sharing in prefabricated component supply chain,”
run agent-based controllers to test the proposed framework IEEE Access, vol. 5, pp. 4132–4139, 2017.
[22] F. Hsieh, “Dynamic configuration and collaborative scheduling in supply
with real industrial hardware and software. chains based on scalable multi-agent architecture,” Journal of Industrial
Engineering International, vol. 15, pp. 249–269, 2019.
[23] N. Mishra, A. Singh, S. Kumari, K. Govindan, and S. I. Ali, “Cloud-
R EFERENCES based multi-agent architecture for effective planning and scheduling
of distributed manufacturing,” International Journal of Production Re-
[1] D. M. Tilbury, “Cyber-Physical Manufacturing Systems,” Annual Review search, vol. 54, no. 23, pp. 7115–7128, 2016.
of Control, Robotics, and Autonomous Systems, vol. 2, no. 1, 2019. [24] P. Leitao, S. Karnouskos, L. Ribeiro, J. Lee, T. Strasser, and A. W.
[2] P. Leitão, “Agent-based distributed manufacturing control: A state- Colombo, “Smart agents in industrial cyber–physical systems,” Proceed-
of-the-art survey,” Engineering Applications of Artificial Intelligence, ings of the IEEE, vol. 104, no. 5, pp. 1086–1101, 2016.
vol. 22, no. 7, pp. 979–991, 2009. [25] W. Shen, L. Wang, and Q. Hao, “Agent-based distributed manufactur-
[3] B. Vogel-Heuser, J. Lee, and P. Leitão, “Agents enabling cyber-physical ing process planning and scheduling: A state-of-the-art survey,” IEEE
production systems,” At-Automatisierungstechnik, vol. 63, no. 10, pp. Transactions on Systems, Man and Cybernetics Part C: Applications and
777–789, 2015. Reviews, vol. 36, no. 4, pp. 563–577, 2006.
[4] P. Vrba, P. Tichý, V. Mařı́k, K. H. Hall, R. J. Staron, F. P. Maturana, [26] P. Leitão, V. Mařı́k, and P. Vrba, “Past, present, and future of industrial
and P. Kadera, “Rockwell automation’s holonic and multiagent control agent applications,” IEEE Transactions on Industrial Informatics, vol. 9,
systems compendium,” IEEE Transactions on Systems, Man and Cy- no. 4, pp. 2360–2372, 2012.
bernetics Part C: Applications and Reviews, vol. 41, no. 1, pp. 14–30, [27] W. Lepuschitz, A. Zoitl, M. Vallee, and M. Merdan, “Toward self-
2010. reconfiguration of manufacturing systems using automation agents,”
17
IEEE Transactions on Systems, Man and Cybernetics Part C: Appli- Journal of Manufacturing Technology and Management, vol. 8, no. 1-3,
cations and Reviews, vol. 41, no. 1, pp. 52–69, 2011. pp. 37–57, 2006.
[28] D. McFarlane, V. Giannikas, A. C. Y. Wong, and M. Harrison, “Product [50] K. Larsen, G. Behrmann, E. Brinksma, A. Fehnker, T. Hune, P. Pet-
intelligence in industrial control: Theory and practice,” Annual Reviews tersson, and J. Romijn, “As cheap as possible: Efficient cost-optimal
in Control, vol. 37, no. 1, pp. 69–88, 2013. reachability for priced timed automata,” Lecture Notes in Computer
[29] W. Dai, H. Nishi, V. Vyatkin, V. Huang, Y. Shi, and X. Guan, “Industrial Science (including subseries Lecture Notes in Artificial Intelligence and
edge computing: Enabling embedded intelligence,” IEEE Industrial Lecture Notes in Bioinformatics), vol. 2102, pp. 493–505, 2001.
Electronics Magazine, vol. 13, no. 4, pp. 48–56, 2019. [51] G. Behrmann, K. G. Larsen, and J. I. Rasmussen, “Priced timed
[30] L. Panait and S. Luke, “Cooperative Multi-Agent Learning: The State of automata: Algorithms and applications,” in International Symposium on
the Art,” Autonomous Agents and Multi-Agent Systems, vol. 3, no. 11, Formal Methods for Components and Objects. Springer, 2004, pp.
pp. 387–434, 2005. 162–182.
[31] M. Owliya, M. Saadat, G. G. Jules, M. Goharian, and R. Anane, [52] D. L. Pepyne and C. G. Cassandras, “Control of hybrid systems in
“Agent-based interaction protocols and topologies for manufacturing manufacturing,” Proceedings of the IEEE, vol. 88, no. 7, pp. 1108–1122,
task allocation,” IEEE Transactions on Systems, Man, and Cybernetics: 2000.
Systems, vol. 43, no. 1, pp. 38–52, 2012. [53] F. Ocker, I. Kovalenko, K. Barton, D. Tilbury, and B. Vogel-Heuser,
[32] M. M. Polycarpou, Y. Yang, and K. M. Passino, “Cooperative control “A Framework for Automatic Initialization of Multi-Agent Production
of distributed multi-agent systems,” IEEE Control Systems Magazine, Systems Using Semantic Web Technologies,” IEEE Robotics and Au-
vol. 21, no. June 2001, pp. 1–27, 2001. tomation Letters, vol. 4, no. 4, pp. 4330–4337, jul 2019.
[33] J. Leitner, “Multi-robot cooperation in space: A survey,” Proceedings - [54] G. Zheng, I. Kovalenko, K. Barton, and D. Tilbury, “Integrating Hu-
2009 Advanced Technologies for Enhanced Quality of Life, AT-EQUAL man Operators into Agent-based Manufacturing Systems: A Table-top
2009, pp. 144–151, 2009. Demonstration,” Procedia manufacturing, vol. 17, pp. 326–333, 2018.
[34] G. Jules and M. Saadat, “Agent cooperation mechanism for decentralized [55] I. Kovalenko, D. Tilbury, and K. Barton, “Priced Timed Automata
manufacturing scheduling,” IEEE Transactions on Systems, Man, and Models for Control of Intelligent Product Agents in Manufacturing
Cybernetics: Systems, vol. 47, no. 12, pp. 3351–3362, 2016. Systems,” in 15th Workshop on Discrete Event Systems (WODES 2020),
[35] S. Wang, J. Wan, D. Zhang, D. Li, and C. Zhang, “Towards smart factory 2020.
for industry 4.0: a self-organized multi-agent system with big data based [56] C. J. Myers and T.-Y. Meng, “Synthesis of timed asynchronous circuits,”
feedback and coordination,” Computer networks, vol. 101, pp. 158–168, IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
2016. vol. 1, no. 2, pp. 106–119, 1993.
[36] Y. Sallez, T. Berger, and D. Trentesaux, “A stigmergic approach for [57] B. R. Ferrer, B. Ahmad, A. Lobov, D. A. Vera, J. L. M. Lastra,
dynamic routing of active products in FMS,” Computers in Industry, and R. Harrison, “An approach for knowledge-driven product, process
vol. 60, no. 3, pp. 204–216, 2009. and resource mappings for assembly automation,” IEEE International
Conference on Automation Science and Engineering, vol. 2015-Octob,
[37] K. Hadeli, P. Valckenaers, M. Kollingbaum, and H. Van Brussel, “Multi-
pp. 1104–1109, 2015.
agent coordination and control using stigmergy,” Computers in Industry,
[58] A. Bemporad and M. Morari, “Control of systems integrating logic,
vol. 53, no. 1, pp. 75–96, 2004.
dynamics, and constraints,” Automatica, vol. 35, no. 3, pp. 407–427,
[38] C. Hofmann, C. Krahe, N. Stricker, and G. Lanza, “Autonomous
1999.
production control for matrix production based on deep q-learning,”
[59] J. F. Pétin, D. Gouyon, and G. Morel, “Supervisory synthesis for
Procedia CIRP, vol. 88, pp. 25–30, 2020.
product-driven automation and its application to a flexible assembly
[39] A. Lüder, A. Klostermeyer, J. Peschke, A. Bratoukhine, and T. Sauter,
cell,” Control Engineering Practice, vol. 15, no. 5, pp. 595–614, 2007.
“Distributed automation: PABADIS vs. HMS,” IEEE Transactions on
[60] S. Rehberger, L. Spreiter, and B. Vogel-Heuser, “An Agent Approach to
Industrial Informatics, vol. 1, no. 1, pp. 31–38, 2005.
Flexible Automated Production Systems Based on Discrete and Contin-
[40] M. G. Marchetta, F. Mayer, and R. Q. Forradellas, “A reference uous Reasoning,” 2016 IEEE International Conference on Automation
framework following a proactive approach for Product Lifecycle Man- Science and Engineering (CASE), pp. 1249–1256, 2016.
agement,” Computers in Industry, vol. 62, no. 7, pp. 672–683, 2011. [61] A. Liu, H. Liu, B. Yao, W. Xu, and M. Yang, “Energy consumption
[41] H. Tang, D. Li, S. Wang, and Z. Dong, “CASOA: An Architecture for modeling of industrial robot based on simulated power data and param-
Agent-Based Manufacturing System in the Context of Industry 4.0,” eter identification,” Advances in Mechanical Engineering, vol. 10, no. 5,
IEEE Access, vol. 6, pp. 12 746–12 754, 2017. p. 1687814018773852, 2018.
[42] P. G. Ansola, A. Garcı́a, J. de las Morenas, and J. de las Morenas, [62] J. Lv, R. Tang, W. Tang, Y. Liu, Y. Zhang, and S. Jia, “An investigation
“Aligning Decision Support with Shop Floor Operations: A Proposal into reducing the spindle acceleration energy consumption of machine
of Intelligent Product Based on BDI Physical Agents,” in Service tools,” Journal of Cleaner Production, vol. 143, pp. 794–803, 2017.
Orientation in Holonic and Multi-agent Manufacturing, T. Borangiu, [63] E. C. Balta, I. Kovalenko, I. A. Spiegel, D. M. Tilbury, and K. Barton,
A. Thomas, and D. Trentesaux, Eds. Cham: Springer International “Model predictive control of priced timed automata encoded with first-
Publishing, 2015, pp. 91–100. order logic,” IEEE Transactions on Control Systems Technology, 2021.
[43] J. De Las Morenas, A. Garcia-Higuera, and P. Garcia-Ansola, “Shop [64] I. Kovalenko, E. C. Balta, D. M. Tilbury, and K. Barton,
Floor Control: A Physical Agents Approach for PLC-Controlled Sys- “Details About Agent Response in “Cooperative Product Agents to
tems,” IEEE Transactions on Industrial Informatics, vol. 13, no. 5, pp. Improve Manufacturing System Flexibility: A Model-Based Decision
2417–2427, 2017. Framework”,” University of Michigan Barton Research Group,
[44] H. Kaindl, M. Vallée, and E. Arnautovic, “Self-representation for self- Tech. Rep., 2021. [Online]. Available: https://brg.engin.umich.edu/wp-
configuration and monitoring in agent-based flexible automation sys- content/uploads/sites/131/2022/02/DA-PA-agentResponse.pdf
tems,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, [65] M. J. North, N. T. Collier, J. Ozik, E. R. Tatara, C. M. Macal, M. Bragen,
vol. 43, no. 1, pp. 164–175, 2012. and P. Sydelko, “Complex adaptive systems modeling with Repast
[45] C. G. Cassandras and S. Lafortune, Introduction to discrete event Simphony,” Complex Adaptive Systems Modeling, vol. 1, no. 1, p. 3,
systems, 2nd ed. Springer Science & Business Media, 2009. 2013.
[46] I. Kovalenko, K. Barton, and D. Tilbury, “Design and Implementation of [66] G. Behrmann, “UPPAAL CORA,” 2014. [Online]. Available:
an Intelligent Product Agent Architecture in Manufacturing Systems,” in http://people.cs.aau.dk/ adavid/cora/index.html
IEEE Emerging Technology and Factory Automation (ETFA), jan 2017, [67] D. Bhave, S. N. Krishna, and A. Trivedi, “On Nonlinear Prices in Timed
pp. 1–8. Automata,” Electronic Proceedings in Theoretical Computer Science,
[47] E. C. Balta, Y. Lin, K. Barton, D. M. Tilbury, and Z. M. Mao, vol. 232, p. 65, 2016.
“Production as a Service: A Digital Manufacturing Framework for [68] M. Hekmatnejad, G. Pedrielli, and G. Fainekos, “Task scheduling with
Optimizing Utilization,” IEEE Transactions on Automation Science and nonlinear costs using SMT solvers,” IEEE International Conference on
Engineering, pp. 1–11, 2018. Automation Science and Engineering, vol. 2019-August, pp. 183–188,
[48] S. Baer, J. Bakakeu, R. Meyes, and T. Meisen, “Multi-agent reinforce- 2019.
ment learning for job shop scheduling in flexible manufacturing sys-
tems,” in 2019 Second International Conference on Artificial Intelligence
for Industries (AI4I). IEEE, 2019, pp. 22–25.
[49] P. Leitão, A. W. Colombo, and F. Restivo, “A formal specification ap-
proach for holonic control systems: The ADACOR case,” International
18
Ilya Kovalenko (Member, IEEE) received the B.S. Kira Barton (Senior Member, IEEE) received the
degree in mechanical engineering from the Georgia B.Sc. degree in mechanical engineering from the
Institute of Technology, Atlanta, GA, USA, in 2015, University of Colorado at Boulder, in 2001, and
and the M.S. and Ph.D degrees in mechanical engi- the M.Sc. and Ph.D. degrees in mechanical engi-
neering from the University of Michigan, Ann Arbor, neering from the University of Illinois at Urbana-
MI, USA, in 2018 and 2020, respectively. He held Champaign, in 2006 and 2010, respectively. She held
a postdoctoral research position at the University a postdoctoral research position at the University of
of Michigan, from Fall 2020 until Fall 2021. In Illinois, from Fall 2010 until Fall 2011, at which
January 2022, he joined the Department of Mechan- point she joined the Mechanical Engineering De-
ical Engineering and the Department of Industrial partment at the University of Michigan, Ann Arbor.
& Manufacturing Engineering at the Pennsylvania She is currently an Associate Professor with the
State University, where he currently holds the position of Assistant Professor. Department of Mechanical Engineering and a Core Robotics Faculty Member
His research interests include control theory and application, with a focus on at the University of Michigan. She conducts research in modeling, sensing,
multi-agent systems, advanced manufacturing, cyber-physical systems, and and control for applications in advanced manufacturing and robotics, with
robotics. a specialization in cooperative learning and smart manufacturing. She was
selected as a Miller Faculty Scholar from 2018-2021, and was a recipient
of an NSF CAREER Award, in 2014, the 2015 SME Outstanding Young
Manufacturing Engineer Award, the 2015 University of Illinois, Department
of Mechanical Science and Engineering Outstanding Young Alumni Award,
the 2016 University of Michigan, Department of Mechanical Engineering
Department Achievement Award, and the 2017 ASME Dynamic Systems and
Control Young Investigator Award.