Network Fusion: Pascal Fradet ST Ephane Hong Tuan Ha
Network Fusion: Pascal Fradet ST Ephane Hong Tuan Ha
Pascal Fradet
INRIA Rhne-Alpes o 655, av. de lEurope, 38330 Montbonnot, France
Pascal.Fradet@inria.fr
ABSTRACT
Modular programming enjoys many well-known advantages: readability, maintainability, separate development and compilation. However, the composition of modular units (components) suers eciency problems. In this paper, we propose an invasive composition method which strives to reconcile modularity and eciency. Our technique, network fusion, automatically merges networks of interacting components into equivalent sequential programs. We provide the user with an expressive language to specify scheduling constraints which can be taken into account during network fusion. Fusion allows to replace internal communications by assignments and alleviates most time overhead. We present our approach in a generic and unied framework based on labeled transition systems, static analysis and transformation techniques.
of Kpns and scheduling constraints are both formalized as guarded labeled transition systems (Lts). Network fusion is an automatic process which takes a Kpn and scheduling constraints and yields a sequential program respecting the constraints. Note that constraints may introduce articial deadlocks, in which case the user will be warned. The resulting program must be equivalent to the Kpn modulo the possible deadlocks introduced by constraints. Fusion alleviates most time overhead by allowing the suppression of context switches, the replacement of internal communications by assignments to local variables and optimizations of the resulting sequential code using standard compiling techniques. The four main steps of the fusion process are represented in Figure 1. The rst step is the abstraction of the network into a nite model called an Abstract Execution Graph (Aeg). An Aeg over-approximates the set of possible executions traces. We do not present this step in details since it relies on very standard analysis techniques (e.g., abstract interpretation) and many dierent abstractions are possible depending on the desired level of precision. Instead, we focus on the properties that an Aeg must satisfy. The second step consists in enforcing constraints. This is expressed as a synchronized product between guarded Lts (the Aeg and the constraints). In general, this step does not sequentialize completely the execution and leaves scheduling choices. The third step completes the scheduling of the constrained Aeg. Several strategies can be used as long as they are fair. Again, these strategies can be expressed as guarded Lts and scheduling as a synchronized product. The fourth step, concretization, maps the scheduled (serialized) Aeg to a single sequential program. Further transformations (e.g., standard optimizations) can then be carried out on the resulting program. The paper is organized as follows. Section 2 presents the syntax and semantics of Kpns. Section 3 describes Aegs and denes the abstraction and concretization steps which both relate Aeg to concrete models (programs and Kpns). Section 4 presents the language of constraints and the two main transformation steps of fusion: constraints enforcement and
1.
INTRODUCTION
Modular programming enjoys many well-known advantages: readability, maintainability, separate development and compilation. However, the composition of modular units (components) suers eciency problems. Sequential composition poses space problems: the producer delivers its complete output before the consumer starts. Parallel composition relies on threads, synchronization and context switches which introduce time overhead. In this paper, we propose an invasive composition method, network fusion, which strives to reconcile modularity and eciency. Our technique automatically merges networks of interacting components into equivalent sequential programs. Our approach takes two source inputs: a network of components and user-dened scheduling constraints. Networks are formalized as Kahn Process Networks (Kpns) a simple formal, yet expressive, model of parallelism [7]. Kpns have straightforward and determinate operational and denotational semantics. Scheduling constraints allow the user to make more precise the scheduling strategy by specifying a set of desired executions. The operational semantics
scheduling. We propose three extensions of the basic technique in Section 5 and, nally, we review related work and conclude in Section 6. We have chosen to present fusion in an intuitive and mostly informal way. In particular, we do not provide any correctness proofs. They would require a complete description of the operational semantics of Kpn too long to t the space limits of this article. We are currently working on a companion paper which will include a complete formalization and proofs.
The components we consider in this paper represent valid, sequential and deterministic programs. They have the following restrictions: A component has a unique entry point denoted by the label l0 . All the labels used in p are dened in the lhs of commands. Two commands having the same label have also mutually exclusive guards. The program P in Figure 2 sends the set N in increasing order on channel f . Program C assigns a with the value read on channel f if a < b or assigns b with the value read on the channel otherwise. Then, it sends a + b on the channel o and loops. Note that we omit guards when they are true. The semantics of a component p can be expressed as a Lts (p , (l0 , s0 ), Ep , p ) where p is an (innite) set of states (l, s) where l a label and s a store mapping variables to their values. (l0 , s0 ) is the initial state made of the initial label l0 and store s0 . We assume that the initial label is always indexed by 0 and that the initial store initializes integer variables by the value 0, Ep is the set of commands of p, p is the transition relation (actually, a function) on states labeled with the current command. The initial labels of program P and C (Figure 2) are p0 and c0 respectively and the variables x, a and b are initialized to 0. To simplify the presentation, we consider only programs which never terminate. Termination could always be represented by a nal looping command of the form lend : skip ; lend In the remaining, we use c|g and c|a to denote the guard and the action of the command c respectively.
#"
#"
Constrained Aeg
Scheduling Sec. 4.3
Scheduled Aeg
#" !
: source
: intermediate result
Concretization Sec. 3
: transformation step
Sequential Program
2.2
Networks of Components
2.
NETWORKS
We start by providing the syntax of components and networks. We outline its semantics and provide some intuition using an example. A complete structural operational semantics for Kpns can be found in [6].
2.1
Basic Components
l1 : g | a ; l2
A Kpn k is made of a set of processes {p1 , . . . , pn } executed concurrently. Networks are build by connecting output channels to input channels of components. Such channels are called internal channels whereas the remaining (unconnected) channels are the input and output channels of the network. The communication on internal channels is asynchronous (non blocking writes, blocking reads) and is modeled using unbounded fos. In order to guarantee a deterministic behavior, Kpns require the following conditions [7]: An internal channel is written by exactly one process and read from exactly one process. An input channel is read from exactly one component (and written by none). An output channel is written by exactly one component (and read from none). A component cannot test the absence of values on channels.
where l1 and l2 denote labels, g a guard and a an action. An action is either a read operation on an input channel f ?x, a write operation on an output channel f !x, or an internal action i (left unspecied). A component (or process) p is a set of commands {c1 , . . . , cn }. If the current program point of a component p is l1 , if l1 : g | a ; l2 is a command of p and the guard g is true, then the action a can be performed and the program point becomes l2 .
Semantics
(p0 , c0 ) x 0 Assuming inputs: a 0 1 : ... b 0
c0 c0 C= c1
x:=x+1
3 3 3
ab|?b 3
3 3
(p0 , c1 ) x 0 a 0 b 1
Network
G C
o
(p0 , c0 ) x 1 a 0
f !x
3 3 3
ab|?b 3
3 3
x:=x+1
3 3 3
o!(a+b) 3
3 3
(p1 , c1 ) x 1 a 0 b 1 f
(p0 , c0 ) x 0 a 0 b 1 f
b 0 f 1 :
Figure 2: A Simple Kpn and its trace semantics In order to simplify technical developments, we assume that networks have a single input and output channels denoted by and o respectively and that the input channel never remains empty. The global execution state of a Kpn is called a conguration. It is made of the local state of each component and the internal channel states i.e., nite sequences of values v1 : . . . : vn : . The operational semantics of Kpn can be expressed as a Lts (k , 0 , Ek , k ) where k is a (innite) set of congurations, the initial conguration 0 is such that each component is in its initial state and each internal channel is empty, Ek is the union of the sets of commands of components; these sets are supposed disjoint, the transition relation k is dened as performing (non deterministically) any enabled command of any process. A command is enabled when the current program point is its lhs label, its guard is true in the current conguration/state and it is not a blocking read (i.e., a read on an empty channel). The transition relation gives rise to an innite graph representing all the possible execution traces. A small part of the transition relation p for our example is depicted in Figure 2. Here, no global deadlock is possible and all traces are innite. An innite execution trace is said to be fair if any enabled action at any point in the trace is eventually executed. The denotational semantics of a Kpn is given by the function from the input values (the input channel) to the output values (the output channel) generated by fair executions. We will write Traces(k) and IO(k) to denote the set of traces and the denotational semantics of the Kpn k respectively. Kpns of deterministic components are deterministic [7]. Also, all fair executions with the same input yield the same output [6]. An important corollary for us is that Kpns are serializable: they can always be implemented sequentially.
3.
Network fusion necessitates to nd statically a safe and sequential scheduling. This step relies upon an abstract execution graph (Aeg), a nite model upper-approximating all the possible executions of the Kpn. We present in this section the key properties than an Aeg should satisfy and present an example. An Aeg k is a nite Lts (k , 0 , Ek , k ) with: k a nite set of abstract congurations, 0 is the initial abstract conguration, Ek a (nite) set of commands, k a labeled transition relation. The idea behind abstraction is to summarize in an abstract conguration a (potentially innite) set of concrete congurations [9]. This set is given by the function conc : k P(k ) dened as: conc( ) = { | }
where is a safety relation relating k and k (and we write k k ). There can be many possible abstractions according to their size and accuracy. Network fusion is generic w.r.t. abstraction as long as the Aeg respect two key properties: safety and faithfulness. To be safe, the initial abstract conguration of an Aeg must safely approximate the initial concrete conguration. Furthermore, if a conguration 1 is safely approximated by 1 and the network evolves in the conguration 2 , then there exists a transition from 1 to 2 in the Aeg such that 2 is safely approximated by 2 . These two points ensure that any execution trace of the Kpn is safely simulated by one in the Aeg. Formally: Definition 3.1. [Safety] approximation of k i Let k k , then k is a safe
I
f !x
(p0 , c0 ) f [0, +[
G G G ab|?b G G G G G
a<b|f ?a G
x:=x+1
(p1 , c0 ) f [0, +[
o!(a+b)
q
f !x
G G G G G G G5
G G G ab|?b G G G G G
a<b|f ?a G o!(a+b)
(p0 , c1 ) f [0, +[
G G G G G G G5
(p1 , c1 ) f [0, +[
x:=x+1
0 0 1 1 1 k 2 2 . 2 2 1 k 2 A key property of safe abstractions is that they preserve fairness. Of course, since they are upper approximations they include false paths (abstract traces whose concretization is empty). However, for abstract traces representing feasible concrete traces, fair abstract traces represent fair concrete traces. Safety also implies that all fair concrete traces are represented by fair abstract traces. An Aeg is said to be faithful if each abstract transition corresponds to a concrete transition modulo the nonsatisability of guards or a blocking read. In other words, faithfulness connes approximations to values. A false path can only be an abstract trace with a transition whose concrete image would be a transition with a false guard or a blocking read. Formally: Definition 3.2. [Faithfulness] a faithful approximation of k i 1 k 2 1 1
c c c
Figure 3: Example of an Aeg information). More precise Aegs could be designed (see Section 5.2). An Aeg bears enough information to be translated back into a program. Commands (guards and actions) label edges and nodes represent labels. The concretization of nite Lts k = (k , 0 , Ek , k ) into a program is formalized by the following straightforward translation: Concretization(k ) = {l : c ; l | 1 k 2 }
1 2
An important property of safe and faithful abstractions is that their concretization has the same semantics as the network they approximate. Property 3.3. If k is a safe and faithful approximation of k then Traces(k) = Traces(Concretization(k ))
Let k k , then k is
c
4.
FUSION
Faithfulness rules out, for instance, the (highly imprecise but safe) abstraction made of a unique abstract state representing all concrete states. In practice, the same abstract state cannot represent dierent program points (label congurations). Example. In order to provide some intuition we give here a crude but straightforward abstraction: Each process state is abstracted into the program point it is associated to. So, variables are not taken into account and process stores are completely abstracted away, Each internal channel state is represented by an interval approximating the length of its le. It is the control ow graph of the Kpn where each node holds a collection of intervals approximating the lengths of internal channels at the conguration of program points the node represents. The Aeg for our running example is given in Figure 3. In this particular example, the state of f is always approximated by the interval [0, +[ (the most imprecise
The user can specify scheduling constraints dening a subset of execution traces. Constraints impose implementation choices; they serve to guide and to optimize the fusion process. Constraints respect the black box nature of components. They are expressed w.r.t. IO operations, liveness properties or sizes of les. When constraints completely sequentialize the execution (no choice remains), they specify a scheduler. In general, however, constraints are incomplete and leave implementation choices.
4.1
Scheduling Constraints
We specify constraints by nite state Lts labeled with guarded actions. Of course, a more user-friendly language for declaring constraints should be studied but this is not the purpose of this article. The formalism used in Sections 2 and 3 is also well-suited to expressing constraints. We enrich the language of guards with two additional constructs dedicated to the expression of scheduling strategies: gc ::= f k | Bp | g where is any comparison operator
The size of a channel can be compared against an integer. For instance, f < 5 is true if the le f has less than 5 elements. The guard Bp is true if the process p is blocked
f!
G 0 t h
qpih defg
@
f?
f!
qpih defg
G 0 t h
qpih defg
@
f?
qpih defg
(f !)P
(f ?)C
f ! f ?
f ! f ?
qpih defg
(f ?)C
f =k1|f !
G 0 t h
qpih defg
@
f =1|f ?
BC |f !
qpih defg
BC |(f !)P
(f !)P f < k 1 | f!
(f ?)C f > 1 | f?
f?
qpih defg
qpih defg
1 i
f!
(f !)P
Figure 4: Examples of Scheduling Constraints (by a read on an empty channel or by other scheduling constraints). Constraints are more easily specied using sets of actions. We use the following notations: A ::= where represents any action of the network, | [f ]? | [f ]! | A | A1 A2 | Ap The constraint FB is a generalization of FF to a le f with k places (i.e., P writes k times before the control is passed to C). This is the formalization of the extension of lter fusion proposed in [3] A demand driven strategy is specied by DD . The consumer C is executed until it blocks i.e., is about to read the empty channel f . Then, P is executed until it produces a value in f . The control is passed to C which immediately reads f and continues. These constraints can be applied to any network as long as it has two components P and C connected at least with a channel f . Of course, constraints can be specied for any number of components and channels.
? (resp. !) represents any read (resp. write) and f ? (resp. f !) any read (resp. write) on le f , A is the complementary set of A, A1 A2 is the intersection of the sets A1 and A2 , Ap represents the projection of the set of actions A onto the commands of the component p. For instance, (?)p represents all non-read actions of component p. Using sets is just more convenient; constraints can be automatically translated into a standard Lts labeled by standard commands afterward. Figure 4 gathers a few examples of constraints for a network with (at least) two components P (writing a le f ) and C (reading the le f ). The constraint FF summarizes in a small automaton the strategy used by Filter Fusion [11]. The producer P starts until it writes on f ? The control is passed to the consumer C until it reads f and so on. This strategy bounds the size of the fo f to be at most 1 and therefore it may introduce articial deadlocks from some networks. FF sequentializes completely the execution of P and C (no scheduling choice remains). The constraint IS is similar to FF except that both P and C can be executed between writes and reads on f . IS leaves some scheduling choices.
4.2
Enforcing Constraints
Enforcing a constraint = ( , 0 , E , ) to an Aeg k = (k , 0 , Ek , k ) can be expressed as a parallel composition (k ). This operation can be dened formally as follows. We assume that all shorthands (like (?)p ) used in constraints are replaced by the actions of the Aeg they represent. k with k ( , )
gg |a k g|a
= (k , (0 , 0 ), Ek , k )
g |a
( , )
k
c
a k \
( , ) k ( , )
If an action a is taken into account by the constraints, the execution can proceed only if both Lts can execute a (i.e., they can both execute commands made of a and a true guard). The actions not taken into account by the constraints can be executed independently whenever possible. Constraints do not introduce new actions (E Ek ). To simplify the
presentation, we assumed in the above inference rules that the guards did not use the condition Bp . We now present the rule corresponding to this condition in isolation. The Bp construct serves to pass the control to another component when one is blocked. The condition Bp is easily dened w.r.t. Kpns: p is blocked in conguration if there is no outgoing transition labeled with a command of p. However, Aegs are approximations with false paths; a component p can be blocked even if the corresponding abstract state has outgoing transitions labeled with commands of p. Actually, p is blocked in an abstract state if any outgoing p transition has either a false guard or is a read on an empty channel (i.e., is not enabled). Formally, let c1 , . . . , cn all the ci commands of p such that k i (ci|g ) f = 0 if ci|a = f ?x and gi = (ci|g ) otherwise then the necessary and sucient condition for p to be blocked in is bp ( ) =
i=1,...,n
(0 , p0 , c0 ) f [0, 0]
x:=x+1
(0 , p1 , c0 ) f [0, 0]
f !x
(1 , p0 , c0 ) f [1, 1]
l
o!(a+b)
a<b|f ?a
(0 , p0 , c1 ) f [0, 0]
ab|?b
MM MM MM
x:=x+1 MM
gi
MM MM 8
q q q q q q
f !x
q q q q
D (1 , p0 , c1 ) f [1, 1] qV q
(0 , p1 , c1 ) f [0, 0]
( , )
Bp |a
gbp ( )|a k
4.3
Scheduling
Figure 5 represents the product of the Aeg of Figure 3 with FF . The component P is executed until it produces a value on f then C is executed until it reads a value on f . Note that if a b remains always true then P will never be executed. So, the execution is not fair but it is nevertheless correct and yields the same output as the network (P is never executed only when its production is not needed). The strategy does not use guards, so no new test appears in the constrained Aeg. The result is completely sequentialized. After the constrained Aeg is produced, the size of les is reestimated using standard static analysis techniques. We have indicated in Figure 5 the new approximations for f . They show that the Aeg is now bounded (the size of f is at most 1). It is easy to check that the Aeg can be translated by Concretization (see Sec. 3) into a sequential program. As already mentioned, one goal of fusion is to suppress internal communications. For unbounded Aeg, internal reads and writes are replaced by assignments to lists or fos. Here, the channel f can be implemented by a single variable vf and writes f !x and reads f ?a by assignments vf := x and x := vf . These assignments can then be suppressed using standard optimization techniques [1]. Finally, after a renaming of labels, we get: ; pc1 ; pc0 : x := x + 1 pc1 : a < b | a := x ; pc2 ; pc1 : a b | ?b ; pc3 ; PC = pc2 : x := x + 1 ; pc3 ; pc3 : o!a + b ; pc1 ; In this section, we have presented the parallel composition as a fairly standard automata product. Depending on the size of the Lts, this may cause an unacceptable state explosion. We present a solution to this problem in Section 5.3.
In general, constraints enforcement leaves implementation choices which must be taken to produce a sequential program. The fusion process makes these choices automatically by scheduling the execution of components. A valid schedule must be fair (all enabled components must be eventually executed) and sequential (the scheduled execution must correspond to a sequential program). We choose here a simple and fair policy: round-robin scheduling. Components are ordered in a circular queue and the scheduler activates them in turn. Either the current active component is blocked (by a read or a user dened constraint) either one of its command is executed. In both cases, the control is passed to the next component. Figure 6 formalizes round-robin for networks with two components P and C as a guarded Lts. It would be easy to generalize such a round-robin Lts for any network with a xed number of processes.
BP BC | C
VWXY cba`
BP | P
0 h
BC | C
VWXY cba`
BC BP | P
BC BP |deadlock
BC BP |deadlock
VWXY cba`
Figure 6: Round-Robin Scheduling The schedule is fair and ensures a complete serialization
of the execution. It starts by enforcing the execution of one instruction of P , then one instruction of C and so on. If one of the two processes is blocked at its turn, then an instruction of the other process is executed instead. When both processes are blocked then it is a global deadlock denoted by the special instruction deadlock. Constrained Aegs are composed in parallel with the automaton of gure 6 to obtain sequential programs. The composition is the same as before except that the deadlock action does not belong to the set of actions of components. The product will therefore introduce a new deadlock transition along with a new state in the Aeg. This new transition, which detects a global deadlock, will be implemented by printing an error message and terminating the program. When such a transition appears in the result of fusion, the user is warned of a possibility of deadlock.
scheduling strategy. More sophisticated policies (e.g., using several queues, based on static or dynamic priorities, etc.) could be considered as well. As in Section 4.2, we have presented scheduling as an Lts product; scheduling could also be implemented by the technique outlined in Section 5.3.
4.4
Semantic Issues
User-dened constraints can change the semantics of the Kpn. For example, a constraint which bounds communication channels would cause an articial deadlock into an unbounded Kpn. The user may want to enforce properties even at the price of deadlocks. We consider that a change of semantics is acceptable as long as it depends on the user and remains under her or his control. However, this requires to restrict the class of acceptable constraints. Considet the following constraint:
(0 , p0 , c0 ) f [0, +[
o!(a+b)
(1 , p1 , c0 ) f [0, +[
qpih defg
x:=x+1
f !x
| a | |
~||
B B b B B B
(1 , p1 , c0 ) f [0, +[
ab|?b
1
(1 , p1 , c0 ) f [0, +[
a<b|f !x
(1 , p0 , c0 ) f [0, +[
f !x
(0 , p1 , c0 ) f [0, +[
a<b|f ?a
ab|?b
o!(a+b)
(0 , p0 , c1 ) f [0, +[
x:=x+1
(1 , p1 , c1 ) f [0, +[
Figure 7: Sequentialization with Round-Robin Let us consider the scheduling of the original Aeg of Figure 3. This situation would arise if the user does not provide any constraint. The Aeg obtained after product (and simplications) is given in Fig. 7. Simplications are needed since the product of transitions guarded by BX produces many dummy transitions (i.e., with false guards). The process P is never blocked (it never reads), so the execution can start by x := x + 1. The execution must proceed by C if it is not blocked (BC ). There are two cases: either a b and C is not blocked and its action (?b) can be executed, either a < b and C is blocked by a read of the empty le f . In the latter case, round-robin scheduling passes the control to P and executes the action f !x. The transition a < b | f !x corresponds to if C is blocked then execute the next P s command. We do not describe any further the product which proceeds similarly. Contrary to the product with FF , the result is fair but unbounded: the data produced by P may accumulate in the channel f without bounds. The correctness of the scheduling process comes from the fact that the product with RR yields a sequential, fair and faithful Aeg. Note that network fusion is generic w.r.t. the
where a and b are two non-mutually exclusive commands and 1 and 2 represents distinct constraints. Since a and b can be executed indierently, a choice will be made by scheduling. However, depending on this choice, the constraints that are enforced afterwards (1 or 2 ) are different. For example, 1 may imply an articial deadlock in some (non statically determined) cases and 2 in some other cases. In other words, the semantics of the resulting program will depend on a blind choice made by fusion. This semantic change is not acceptable since it would be out of the control of the user. Our solution is to restrict the class of acceptable constraints. Namely, if a constraint leaves a non deterministic choice such as a or b above then the constraints must ensure that all processes can still evolve in the same way (an articial deadlock in one side implies that we have the same articial deadlock on the other side). Each choice a or b correspond to set (language) of acceptable traces (a.L(1 ) and b.L(2 )). A constraint is acceptable if for each choice, the projection of the corresponding languages to the commands of any process are equivalent. For the above example we must enforce that p. a.L(1 ) p = b.L(2 ) p With this condition, the choices made by the scheduling step do not have any semantic impact. All the constraints of Figure 4 are acceptable. It is obvious for FF , FB , DD since they do not leave any non deterministic choice. In IS , the two transitions labeled by f ! f ? leave the choice between executing P or C. However, in both states, they lead to the same state (and therefore accept the same language).
5.
EXTENSIONS
The preceding sections have presented the main ideas of network fusion. We hint here at three ways of extending the basic technique: providing more linguistic support to the
user, working on more precise abstractions, avoiding products between Lts. These three extensions all aim at getting more ecient fused programs.
ab|?b
(p0 , c0 ) f [0, 0]
(p0 , c1 ) f [0, 0]
5.1
Linguistic support
x:=x+1 o!(a+b)
(p1 , c0 ) f [0, 0]
Scheduling constraints allow users to control network fusion. Other linguistic support could be provided to users as well. We focus here on special commands allowing to alleviate the false path problem. False paths arise when data depending controls are abstracted by non deterministic choices [2]. This standard approximation makes fusion consider infeasible paths and spurious deadlocks. The left part of gure 8 shows a simple but characteristic example of the problem. The process P begins by sending on channel ct the number of items it will produce on channel dt. Then, P and C respectively writes and reads the same number of items on dt (M = N ). However, this information is lost in the Aeg which abstracts away values. The fusion process must therefore consider the case where P produces not enough values and C is blocked and also the case where P produces unconsumed values and the size of dt cannot be bounded. This problem can be alleviated using commands making synchronization or termination explicit. The languages of actions and guards are extended with the following constructs: a ::= g ::= wait(f ) | proceed(f ) | . . . waiting?(f ) | . . .
(p1 , c1 ) f [0, 0]
o!(a+b) f !x ab|?b
f !x
(p0 , c1 ) f [0, 0]
a<b|f ?a
(p0 , c0 ) f [1, 1]
(p0 , c1 ) f [1, 1]
o!(a+b) x:=x+1
x:=x+1
x:=x+1
Figure 9: Aeg with irrelevance criterion (excerpt) present here a new abstraction aimed at nding bounded schedules when they exist. A bounded schedule ensures that the size of fo les remain bounded throughout the execution. In this case, fo can implemented (after fusion) by local variables (instead of dynamically allocated data structures). Furthermore, when the precise size of a channel is known there is no need to test for its emptiness before reading it. In some contexts, such as embedded systems, it is crucial to nd bounded schedules and lot of work has been devoted to this issue ([10], [5], [12]). In the context of Petri Nets, Cortadella et al. presented a new criterion which limits the search state for schedules [5] . They conjectured that if a bounded schedule exists then it will be found in the delimited search space. We can adapt the criterion to our context to produce abstractions suited to the discovery of bounded schedule. The idea is to have precise abstract states associating an integer to each size (not an interval anymore). The same set of control points (e.g., (p0 , c1 )) may appear several times in the Aeg with dierent sizes of les. The niteness of the Aeg is ensured by the irrelevance criterion. A state s2 is said irrelevant if the Aeg contains another state s1 such that: s2 is reachable from s1 . all fo have their size in s2 greater or equal than their size in s1 each fo whose size is greater in s2 than in s1 , has a non-zero size in s1 . The idea behind this criterion is that a irrelevant state cannot enable any new action (e.g., it does not enable a blocked read). It is no use to continue unfolding the graph. It can be closed using a state where the size of channels are approximated by [0; +[. A small part of the Aeg obtained using the irrelevance criterion on our running example is shown on gure 9. A bounded schedule can be found in the
The commands wait(f ) and proceed(f ) permit to express a rendez-vous between the producer and the consumer of a le f . The producer blocks on wait(f ) until the consumer emits proceed(f ). The predicate waiting?(f ) evaluates to true if the producer is waiting on wait(f ) and all data has been consumed on f . It evaluates to false if there is some available data. It blocks if there is no available data and the producer is not waiting. These instructions are just syntactic sugar and waiting?(f ) do not aect the determinism of Kpns. They could be implemented by writing/reading a special value on an additional channel. On the other hand, they do provide more information to the fusion process and permit to avoid false paths. The right part of Figure 8 shows how to take prot of these instructions on the previous example. Instead of communicating via ct the number of items written on dt, P nishes its emission by waiting to C. The consumer reads until it has consumed all data produced by P ; it then releases P and both processes may proceed. Such explicit rendez-vous can be taken into account by the abstraction step to avoid the problematic false paths mentioned above. Others instructions could be considered as well. For example, an instruction close(f ), indicating that a process will not write or read on f anymore, would also be useful.
5.2
In section 2, we presented an abstraction representing the control ow graph of the Kpn. This abstraction gives a very imprecise approximation for the size of le f at each state ([0; +[). As long as they respect the safety and faithfulness properties, many other abstractions could be used. We
Synchronization using data values l0 l1 l2 P = l3 l 3 l4 l0 l1 l2 l3 C= l3 l 4 l5 : : : : : : : : : : : : : ?N ct!N i := 0 i<N |i=i+1 i N | skip dt!x o!0 ct?M j := 0 j <M |j =j+1 j M | skip dt?y o!y ; ; ; ; ; ; ; ; ; ; ; ; ; l1 ; l2 ; l3 ; l4 ; l0 ; l3 l1 ; l2 ; l3 ; l4 ; l0 ; l5 ; l3
Synchronisation using linguistic extensions l0 l1 l2 P = l3 l 3 l4 l0 l1 C= l1 l 2 : : : : : : : : : : ?N ct!N i := 0 i<N |i=i+1 i N | wait(dt) dt!x o!0 waiting?(dt) | dt?y waiting?(dt) | proceed(dt) o!y ; ; ; ; ; ; ; ; ; ; l1 ; l2 ; l3 ; l4 ; l0 ; l3 l1 ; l3 ; l0 ; l1
Figure 8: A false path problem (left) and its solution (right) part shown. This improved precision allows to nd bounded schedules automatically; it also involves larger Aegs. each if and while command. This technique is easily extended to guarded Lts. Figure 10 represents the result of the instrumented product between the Aeg of Figure 3 and FF . It has the same number of states as the original Aeg. On the other hand, instructions (l := {0, 1}) and tests (l = {0, 1}) have been inserted to encode the state transitions of FF . Compared to the Lts of Figure 5 which represents the standard synchronized product, some states like (0 , p0 , c0 ) and (1 , p0 , c0 ) are now merged into a single state (p0 , c0 ). Transitions from this state must now test whether underlying Lts FF is in the state 0 or 1 . On this example, the smaller number of states is certainly not worth the overhead. In general, however, instrumented product is at most linear in size whereas synchronized product may entail a quadratic blowup. A small time overhead is preferable to a space explosion. In any case, the user should be given the opportunity to specify on which Lts (or on which parts of a Lts) using standard or instrumented product.
5.3
Instrumented Product
S
f !x ; l:=1
l = 0 | x := x + 1
(p0 , c0 ) f [0, 1]
GG GG l GG G
= 1a b | ?b
l = 1a < b | f ?a ; l := 0 G l = 1 | o!(a + b)
(p1 , c0 ) f [0, 0]
GG GG GG 5
S
f !x ; l:=1
l = 0 | x := x + 1
(p0 , c1 ) f [0, 1]
(p1 , c1 ) f [0, 0]
6.
Figure 10: Instrumented product with F F Two steps of network fusion are described as a synchronized product between guarded Lts. Obviously, in some cases, this could cause a state explosion and produce too large programs. A solution to avoid this space problem is to implement the product by instrumenting the Aeg. The Lts representing the constraints or the schedule is taken into account by the Aeg by introducing a variable (to represent the state of the Lts) and new instructions (to represent state transitions). We have used such a technique in [4] to enforce safety properties (expressed as nite state automata) on programs. We have shown that the instrumentation can be made very ecient using simple techniques (specialization, minimization and reachability analysis). This instrumented product introduces at worst an assignment (a state transition) at
CONCLUSION
One of our starting points was Filter Fusion [11], a simple algorithm to merge a producer connected by a single channel to a consumer. Filter Fusion is restricted to very specic networks (pipelines) and to a xed strategy. Our work can be seen as a formalization of lter fusion using synchronized product and as well as a generalization to arbitrary networks and user-dened strategies. The application of our technique on pipelined lters with the constraint FF (Figure 4) is equivalent to lter fusion. The extension of lter fusion with a more sophisticated scheduling strategy proposed in [3] is formalized in our framework by the scheduling constraints FB (Figure 4). Some functional program transformations bear similarities with fusion. As fusion aims at removing values produced on channels by the composition of components, Listlessness [14] and deforestation [15] aim at removing the intermediate data structures produced by the composition of functions. As lter fusion, these transformations consider producerconsumer pairs and have a xed fusion strategy.
The area of embedded/reactive systems has produced a large body of work on static scheduling. Lin [8] studies the static scheduling of synchronously communicating processes. Cortadella et al. [5] and Strehl et al. [12] consider scheduling of asynchronous process networks. They all use petri nets as their underlying formalism. Like Parks [10], they focus on bounded scheduling and do not consider user-dened constraints even if some integrate a form of fusion. Strehl et al. [13] propose a design model that permits the specication of components and scheduling constraints. They derive a scheduler but do not consider fusion. We have presented a generic and exible framework for merging networks of interacting components. It is based on guarded labeled transition systems, synchronized product, static analysis and transformation techniques. Fusion can be applied to a large class of networks (Kpns) and can take into account user-dened scheduling constraints. The technique can be parameterized by dierent abstractions, constraints and scheduling strategies. Still, a lot of work remains to be done. We are currently working on an extended version including a complete formalization and correctness proofs. A prototype needs to be implemented in order to validate the approach experimentally. We expect that large programs can be abstracted into small automata since fusion focuses on I/O instructions (blocks of internal instructions can be represented by a single action). Along with the use of instrumented product in problematic cases, we are condent that ecient and reasonable sized programs can be produced. More generally we see network fusion as part of a more general framework to assemble and fuse components. A rst feature of such a framework would consist in an architecture description language to specify the assembly (i.e., the ports and their connections). Another useful feature would be the ability to specify the synchronization instructions. They do not have to be IO instructions as supposed previously. By considering some actions of two components as IO operations on a (conceptual) channel f (i.e., f !x and f ?x), it becomes possible to impose constraints on their interleaving.
[6] M. Geilen and T. Basten. Requirements on the execution of Kahn process networks. In European Symposium on Programming (ESOP03), 2003. [7] G. Kahn. The semantics of a simple language for parallel programming. In Proceedings of the IFIP Congress (Information Processing74), pages 471475, 1974. [8] B. Lin. Software synthesis of process-based concurrent programs. In Proceedings of the 1998 Conference on Design Automation (DAC-98), pages 502505, 1998. [9] F. Nielson, H. R. Nielson, and C. L. Hankin. Principles of Program Analysis. Springer-Verlag, 1999. [10] T. M. Parks. Bounded scheduling of process networks. PhD thesis, University of California, Berkeley, 1995. [11] T. A. Proebsting and S. A. Watterson. Filter fusion. In Symposium on Principles of Programming Languages (POPL96), pages 119130, 1996. [12] K. Strehl, L. Thiele, D. Ziegenbein, and R. Ernst. Scheduling hardware/software systems using symbolic techniques. In Proceedings of the seventh international workshop on Hardware/software codesign (CODES 99), pages 173177, 1999. [13] L. Thiele, K. Strehl, D. Ziegenbein, R. Ernst, and J. Teich. Funstate - an internal design representation for codesign. In International Conference on Computer-Aided Design (ICCAD 99), pages 558565, 1999. [14] P. Wadler. Listlessness is better than laziness. In Conference Record of the 1984 ACM Symposium on Lisp and Functional Programming, pages 4552, 1984. [15] P. Wadler. Deforestation: transforming programs to eliminate trees. Theoretical Computer Science, 73(2):231248, 1990.
7.
REFERENCES
[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers. Principles, Techniques, and Tools. Addison-Wesley, 1986. [2] Arrigoni, Duchini, and Lavagno. False path elimination in quasi-static scheduling. In Automation and Test in Europe Conference and Exhibition (DATE02), pages 964970, 2002. [3] R. Clayton and K. Calvert. Augmenting the proebsting-watterson lter fusion algorithm, 1997. http://citeseer.ist.psu.edu/186288.html. [4] T. Colcombet and P. Fradet. Enforcing trace properties by program transformation. In Symposium on Principles of Programming Languages (POPL00), pages 5466, 2000. [5] J. Cortadella, A. Kondratyev, L. Lavagno, M. Massot, S. Moral, C. Passerone, Y. Watanabe, and A. L. Sangiovanni-Vincentelli. Task generation and compile-time scheduling for mixed data-control embedded software. Technical Report LSI-99-47-R, Dept. of Software, Universitat Politecnica de Catalunya, 1999.