
Example: Testing whether a graph is acyclic

by Robert M. Keller

We want to do two things here:

Get an idea of an informal algorithm for testing whether a graph has cycles.

Show how to use our information structure list representations and functions to determine whether a graph is acyclic.

Given some representation of a directed graph, we might like to know whether there are any cycles (loops from a node back to itself, possibly through other nodes).

A graph that has at least one such loop is called cyclic, and one which doesn't is called acyclic. Acyclic directed graphs are also called DAGs.

An acyclic graph:

A similar-appearing cyclic graph:

Idea:

If a graph is acyclic, then it must have at least one node with no targets (called a leaf).

For example, in the acyclic graph above, node 3 is such a node. In general there may be other such nodes, but in this case it is the only one.

This condition (having a leaf) is necessary for the graph to be acyclic, but it isn't sufficient. If it were, the problem would be trivial.

For example, the preceding cyclic graph had a leaf (3):

Continuation of the idea:

If we "peel off" a leaf node in an acyclic graph, then we are always left with an acyclic graph.

If we keep peeling off leaf nodes, one of two things will happen:

We will eventually peel off all nodes: The graph is acyclic.

OR

We will get to a point where there is no leaf, yet the graph is not empty: The graph is cyclic.

An informal statement of the algorithm is as follows:

To test a graph for being acyclic:

1. If the graph has no nodes, stop. The graph is acyclic.

2. If the graph has no leaf, stop. The graph is cyclic.

3. Choose a leaf of the graph. Remove this leaf and all arcs
going into the leaf to get a new graph.

4. Go to 1.

The partial correctness of the algorithm is based on the ideas which led to it.

We can see that this algorithm must terminate as follows: each time we go from step 4 to step 1, we do so with a graph which has one fewer node.

Thus, in a number of steps at most equal to the number of nodes in the original graph, the algorithm must terminate.

Partial correctness is a technical term: it means that if the algorithm terminates, it does so with the correct answer.

How can we represent this algorithm in terms of information structures?

Let's choose the list-of-arcs representation for the graph, for simplicity.

Recall that for the following graph

the representation would be:

[ [1, 2], [2, 3], [2, 4], [4, 5], [6, 3], [4, 6], [5, 6] ]

First we have to find whether there is a leaf. By definition, a leaf is a node with no arcs leaving it.

The anonymous function

(Pair) => first(Pair) == Node

returns 1 for any Pair having Node as its first element.

The reading of this expression is "the function which, with argument Pair, returns the result of first(Pair) == Node". Here == is the equality test.

We embed this anonymous function in a call of the rex function no, to give a function is_leaf which will determine whether Node is a leaf:

is_leaf(Node, Graph) =
  no((Pair) => first(Pair) == Node, Graph);

Here Graph is our list of arcs. The call to no returns 1 if, and only if, there is no arc with Node as its first element.
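
For comparison, here is a rough Python sketch of the same test (not part of the rex code; it assumes the graph is the same list of two-element arc lists):

def is_leaf(node, graph):
    # A leaf has no outgoing arcs, i.e. no pair whose first element is node.
    return not any(pair[0] == node for pair in graph)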

Now we have a leaf test, so we next need to use it.

Let's assume that we have a list of the nodes by themselves. We can test whether any of these is a leaf, and if so, return the identity of the leaf, by:

find_leaf(Graph) =
  find((Node) => is_leaf(Node, Graph), nodes(Graph));

If find_leaf returns [ ], there is no leaf. If it returns a non-empty list, then the first element in that list is a leaf.

We can also create one more descriptive function:

no_leaf(Graph) = find_leaf(Graph) == [ ];

How to get the list of nodes:

Think of the list of pairs as a (very bushy) tree.

Example: For the list

[ [1, 2], [2, 3], [2, 4], [4, 5], [6, 3], [4, 6], [5, 6] ]

the bushy tree is:

The rex function leaves, acting on this tree, will return a list of all nodes, possibly with duplicates. By applying remove_duplicates to this list, we will get a list of the nodes:

nodes(Graph) =
  remove_duplicates(leaves(Graph));

Caution: The "leaves" of this tree aren't only the leaf nodes of the original graph; they include all the nodes, as desired.
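
Continuing the Python sketch (again an informal analogue of the rex code, reusing is_leaf from the earlier sketch; here find_leaf returns the leaf itself, or None, rather than a suffix list as rex's find does):

def nodes(graph):
    # All node labels appearing anywhere in the arc list, without duplicates.
    seen = []
    for pair in graph:
        for node in pair:
            if node not in seen:
                seen.append(node)
    return seen

def find_leaf(graph):
    # Return some leaf of the graph, or None if there is no leaf.
    for node in nodes(graph):
        if is_leaf(node, graph):
            return node
    return None

def no_leaf(graph):
    return find_leaf(graph) is None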

Now we are ready to cast the steps of the algorithm in terms of functions.

To start, let Graph be the original graph (as a list of pairs).

1. If the Graph has no nodes, stop. The original graph is acyclic.

We can test this by checking whether Graph is [ ]. If it has no nodes, it has no arcs either, and vice versa.

2. If the graph has no leaf, stop. The graph is cyclic.

We can test this by computing no_leaf(Graph). If the result is 1 (true), the graph has no leaf.

3. Choose a leaf of Graph. Remove this leaf and all arcs going into the leaf to get a new graph.

We need one more function, remove_leaf, to remove a leaf from a graph.

4. Go to 1.

The spirit of functional programming is that we don't actually remove something from a list; instead we build a new list without the thing to be removed.

Suppose Leaf is the leaf to be removed. Then we need only drop all arcs with Leaf as their second element (Leaf will never be a first element; why?):

remove_leaf(Leaf, Graph) =
  drop((Pair) => second(Pair) == Leaf, Graph);

As usual,

(Pair) => second(Pair) == Leaf

is an anonymous function which is 1 for a pair whose second element is Leaf.

We can simplify this function, provided we only call it knowing there is a leaf:

remove_leaf(Graph) =
  remove_leaf(first(find_leaf(Graph)), Graph);
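
In the Python sketch we cannot overload remove_leaf on the number of arguments, so the one-argument convenience version gets its own, made-up name:

def remove_leaf(leaf, graph):
    # Keep only the arcs that do not point into the leaf.
    return [pair for pair in graph if pair[1] != leaf]

def remove_found_leaf(graph):
    # Assumes the caller has already checked that a leaf exists.
    return remove_leaf(find_leaf(graph), graph)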

Finally, we need to package the calls to these functions in such a way that iteration is achieved. This is simple if we use conditional expressions.

The form

P ? T : F

evaluates to the value of T if P is true, and to the value of F if P is false.

The overall function for testing acyclicity uses two conditional expressions:

acyclic(Graph) =
  Graph == [ ] ? 1                    // empty graph is acyclic
  : no_leaf(Graph) ? 0                // graph with no leaf is cyclic
  : acyclic(remove_leaf(Graph));      // try reduced graph
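
The Python sketch finishes the same way. The acyclic arc list below is the example graph used throughout; the cyclic one is a made-up three-node cycle, not one of the graphs in acyclic.rex:

def acyclic(graph):
    if not graph:                              # empty graph is acyclic
        return 1
    if no_leaf(graph):                         # graph with no leaf is cyclic
        return 0
    return acyclic(remove_found_leaf(graph))   # try the reduced graph

print(acyclic([[1, 2], [2, 3], [2, 4], [4, 5], [6, 3], [4, 6], [5, 6]]))  # 1
print(acyclic([[1, 2], [2, 3], [3, 1]]))                                  # 0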

A complete file acyclic.rex, with comments and two test cases, may be found as acyclic.rex. This may be run in rex by

rex acyclic.rex

Once the file is loaded, you may try any of the examples individually. The two graphs used as examples here are graph1 and graph2 in the file.

Disclaimer: We make no statement that this method is the most efficient. Other, more efficient ways to achieve the same result are known. This example served to illustrate several functional programming ideas and the progression from an informal algorithm to a working functional program.

An example of a directed acyclic graph

In mathematics and computer science, a directed acyclic graph (commonly abbreviated to DAG) is a directed graph with no directed cycles. That is, it is formed by a collection of vertices and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of edges that eventually loops back to v again.[1][2][3]

DAGs may be used to model several different kinds of structure in mathematics and
computer science. The reachability relation in a DAG forms a partial order, and any finite
partial order may be represented by a DAG using reachability. DAGs may be used to
model processes in which information flows in a consistent direction through a network
of processors. Additionally, DAGs may be used as a space-efficient representation of a
collection of sequences with overlapping subsequences.

The corresponding concept for undirected graphs is a forest, an undirected graph without
cycles. Choosing an orientation for a forest produces a special kind of directed acyclic
graph called a polytree. However, there are many other kinds of directed acyclic graph
that are not formed by orienting the edges of an undirected acyclic graph. For this reason
it may be more accurate to call directed acyclic graphs acyclic directed graphs or
acyclic digraphs.

Partial orders and topological ordering

A Hasse diagram representing the partial order ⊆ among the subsets of a three-element
set.

Each directed acyclic graph gives rise to a partial order ≤ on its vertices, where u ≤ v
exactly when there exists a directed path from u to v in the DAG. However, many
different DAGs may give rise to this same reachability relation: for example, the DAG
with two edges a → b and b → c has the same reachability as the graph with three edges
a → b, b → c, and a → c. If G is a DAG, its transitive reduction is the graph with the
fewest edges that represents the same reachability as G, and its transitive closure is the
graph with the most edges that represents the same reachability.
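
As a small illustration (a sketch, not taken from the article, written for simplicity rather than efficiency), the reachability relation of a DAG, and hence its transitive closure, can be computed directly from the edge list; the two three-vertex DAGs mentioned above then come out with the same closure:

from collections import defaultdict

def transitive_closure(edges):
    # edges: (u, v) pairs of a DAG; returns the set of all (u, v) with a
    # directed path of length >= 1 from u to v.
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
    def reach(u):
        # All vertices reachable from u by at least one edge (terminates
        # because the graph is acyclic).
        out = set()
        for v in succ[u]:
            if v not in out:
                out.add(v)
                out |= reach(v)
        return out
    closure = set()
    for u in list(succ):
        for v in reach(u):
            closure.add((u, v))
    return closure

# a->b->c and a->b->c plus a->c have the same reachability:
print(transitive_closure([("a", "b"), ("b", "c")]) ==
      transitive_closure([("a", "b"), ("b", "c"), ("a", "c")]))   # True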

The transitive closure of G has an edge u → v for every related pair u ≤ v of distinct
elements in the reachability relation of G, and may therefore be thought of as a direct
translation of the reachability relation ≤ into graph-theoretic terms: every partially
ordered set may be translated into a DAG in this way. If a DAG G represents a partial
order ≤, then the transitive reduction of G is a subgraph of G with an edge u → v for
every pair in the covering relation of ≤; transitive reductions are useful in visualizing the
partial orders they represent, because they have fewer edges than other graphs
representing the same orders and therefore lead to simpler graph drawings. A Hasse
diagram of a partial order is a drawing of the transitive reduction in which the orientation
of each edge is shown by placing the starting vertex of the edge in a lower position than
its ending vertex.

Every directed acyclic graph has a topological ordering, an ordering of the vertices such
that the starting endpoint of every edge occurs earlier in the ordering than the ending
endpoint of the edge. In general, this ordering is not unique; a DAG has a unique
topological ordering if and only if it has a directed path containing all the vertices, in
which case the ordering is the same as the order in which the vertices appear in the path.
The family of topological orderings of a DAG is the same as the family of linear
extensions of the reachability relation for the DAG, so any two graphs representing the
same partial order have the same set of topological orders. Topological sorting is the
algorithmic problem of finding topological orderings; it can be solved in linear time. It is
also possible to check whether a given directed graph is a DAG in linear time, by
attempting to find a topological ordering and then testing whether the resulting ordering
is valid.
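
A minimal sketch of such a linear-time check (Kahn's algorithm in Python; not code from the article): repeatedly remove vertices with no incoming edges, and report a cycle if some vertex is never removed.

from collections import defaultdict, deque

def topological_order(vertices, edges):
    # Kahn's algorithm: returns a topological ordering of the vertices,
    # or None if the directed graph contains a cycle.
    indegree = {v: 0 for v in vertices}
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        indegree[v] += 1
    ready = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order if len(order) == len(vertices) else None

print(topological_order("abc", [("a", "b"), ("b", "c")]))   # ['a', 'b', 'c']
print(topological_order("ab", [("a", "b"), ("b", "a")]))    # None (cyclic)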

Some algorithms become simpler when used on DAGs instead of general graphs, based
on the principle of topological ordering. For example, it is possible to find shortest paths
and longest paths from a given starting vertex in DAGs in linear time by processing the
vertices in a topological order, and calculating the path length for each vertex to be the
minimum or maximum length obtained via any of its incoming edges. In contrast, for
arbitrary graphs the shortest path may require slower algorithms such as Dijkstra's
algorithm or the Bellman-Ford algorithm, and longest paths in arbitrary graphs are NP-hard to find.
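
For instance, here is a hedged sketch of the longest-path computation (path length counted in edges, vertices processed in a topological order produced by the sketch above; the graph below is a made-up example):

def longest_path_lengths(vertices, edges, source):
    # Longest number of edges on any path from `source` to each vertex,
    # or None for vertices not reachable from `source`.
    order = topological_order(vertices, edges)   # Kahn sketch above
    dist = {v: None for v in vertices}
    dist[source] = 0
    for u in order:
        if dist[u] is None:
            continue
        for x, y in edges:
            if x == u and (dist[y] is None or dist[y] < dist[u] + 1):
                dist[y] = dist[u] + 1
    return dist

print(longest_path_lengths("abcd",
                           [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")],
                           "a"))
# {'a': 0, 'b': 1, 'c': 2, 'd': 3}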

DAG representations of partial orderings have many applications in scheduling problems for systems of tasks with ordering constraints. For instance, a DAG may be used to describe the dependencies between cells of a spreadsheet: if one cell is computed by a formula involving the value of a second cell, draw a DAG edge from the second cell to the first one. If the input values to the spreadsheet change, all of the remaining values of the spreadsheet may be recomputed with a single evaluation per cell, by topologically ordering the cells and re-evaluating each cell in this order. Similar problems of task ordering arise in makefiles for program compilation, instruction scheduling for low-level computer program optimization, and PERT scheduling for management of large human projects. Dependency graphs without circular dependencies form directed acyclic graphs.
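
A toy version of the spreadsheet case (cell names and formulas are invented for the example, and topological_order is the Kahn sketch above): evaluating cells in a topological order of the dependency DAG computes each cell exactly once.

def evaluate_cells(formulas, dependencies):
    # formulas: cell -> function of the dict of already-computed values
    # dependencies: (needed_cell, dependent_cell) edges of the dependency DAG
    order = topological_order(list(formulas), dependencies)
    values = {}
    for cell in order:
        values[cell] = formulas[cell](values)
    return values

formulas = {
    "A1": lambda v: 2,                   # constant input cell
    "B1": lambda v: 3,                   # constant input cell
    "C1": lambda v: v["A1"] + v["B1"],   # C1 = A1 + B1
    "D1": lambda v: 10 * v["C1"],        # D1 = 10 * C1
}
dependencies = [("A1", "C1"), ("B1", "C1"), ("C1", "D1")]
print(evaluate_cells(formulas, dependencies))
# {'A1': 2, 'B1': 3, 'C1': 5, 'D1': 50}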

Data processing networks


A directed graph may be used to represent a network of processing elements; in this
formulation, data enters a processing element through its incoming edges and leaves the
element through its outgoing edges. Examples of this include the following:

• In electronic circuit design, a combinational logic circuit is an acyclic system of logic gates that computes a function of an input, where the input and output of the function are represented as individual bits.
• A Bayesian network represents a system of probabilistic events as nodes in a directed acyclic graph. The likelihood of an event may be calculated from the likelihoods of its predecessors in the DAG. In this context, the moral graph of a DAG is the undirected graph created by adding an (undirected) edge between all parents of the same node (sometimes called marrying), and then replacing all directed edges by undirected edges (see the sketch after this list).
• Dataflow programming languages describe systems of values that are related to each other by a directed acyclic graph. When one value changes, its successors are recalculated; each value is evaluated as a function of its predecessors in the DAG.
• In compilers, straight line code (that is, sequences of statements without loops or conditional branches) may be represented by a DAG describing the inputs and outputs of each of the arithmetic operations performed within the code; this representation allows the compiler to perform common subexpression elimination efficiently.
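
A minimal sketch of the moral-graph construction mentioned in the Bayesian-network item above (plain Python sets, not tied to any particular library):

from itertools import combinations

def moral_graph(dag_edges):
    # dag_edges: directed (parent, child) pairs. Returns the undirected edges
    # (as frozensets) of the moral graph: every directed edge made undirected,
    # plus an edge between every two parents sharing a child ("marrying").
    undirected = {frozenset(e) for e in dag_edges}
    parents = {}
    for p, c in dag_edges:
        parents.setdefault(c, set()).add(p)
    for c, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            undirected.add(frozenset((a, b)))
    return undirected

print(moral_graph([("a", "c"), ("b", "c")]))
# three undirected edges: a-c, b-c, and the "marriage" a-b (set order may vary)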

Paths with shared structure


A third type of application of directed acyclic graphs arises in representing a set of
sequences as paths in a graph. For example, the directed acyclic word graph is a data
structure in computer science formed by a directed acyclic graph with a single source and
with edges labeled by letters or symbols; the paths from the source to the sinks in this
graph represent a set of strings, such as English words. Any set of sequences can be
represented as paths in a tree, by forming a tree node for every prefix of a sequence and
making the parent of one of these nodes represent the sequence with one fewer element;
the tree formed in this way for a set of strings is called a trie. A directed acyclic word
graph saves space over a trie by allowing paths to diverge and rejoin, so that a set of
words with the same possible suffixes can be represented by a single tree node.
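
A compact sketch of the trie construction described above (nodes as dicts of dicts; the end-of-word marker "$" is just a convention chosen for this example). A directed acyclic word graph would additionally merge the identical suffix subtrees that this trie stores twice:

def build_trie(words):
    # Each node is a dict mapping a letter to a child node; the key "$"
    # marks that the path from the root to this node spells a whole word.
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node["$"] = True
    return root

trie = build_trie(["tap", "taps", "top", "tops"])
print(sorted(trie["t"]))   # ['a', 'o']  -- paths diverge after 't'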

The same idea of using a DAG to represent a family of paths occurs in the binary
decision diagram,[4][5] a DAG-based data structure for representing binary functions. In a
binary decision diagram, each non-sink vertex is labeled by the name of a binary variable,
and each sink and each edge is labeled by a 0 or 1. The function value for any truth
assignment to the variables is the value at the sink found by following a path, starting
from the single source vertex, that at each non-sink vertex follows the outgoing edge
labeled with the value of that vertex's variable. Just as directed acyclic word graphs can
be viewed as a compressed form of tries, binary decision diagrams can be viewed as
compressed forms of decision trees that save space by allowing paths to rejoin when they
agree on the results of all remaining decisions.

In many randomized algorithms in computational geometry, the algorithm maintains a history DAG representing features of some geometric construction that have been replaced by later finer-scale features; point location queries may be answered, as for the above two data structures, by following paths in this DAG.

Relation to other kinds of graphs


A polytree is a directed graph formed by orienting the edges of a free tree. Every polytree
is a DAG. In particular, this is true of the arborescences formed by directing all edges
outwards from the root of a tree. A multitree (also called a strongly ambiguous graph or a
mangrove) is a directed graph in which there is at most one directed path (in either
direction) between any two nodes; equivalently, it is a DAG in which, for every node v,
the set of nodes reachable from v forms a tree.

Any undirected graph may be made into a DAG by choosing a total order for its vertices
and orienting every edge from the earlier endpoint in the order to the later endpoint.
However, different total orders may lead to the same acyclic orientation. The number of
acyclic orientations is equal to |χ(-1)|, where χ is the chromatic polynomial of the given
graph.[6]
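
A minimal sketch of this orientation step (the vertex order below stands in for the chosen total order; the example data is invented):

def orient_acyclically(undirected_edges, vertex_order):
    # Direct each undirected edge {u, v} from the endpoint that appears
    # earlier in vertex_order to the one that appears later.
    position = {v: i for i, v in enumerate(vertex_order)}
    return [(u, v) if position[u] < position[v] else (v, u)
            for u, v in undirected_edges]

print(orient_acyclically([("b", "a"), ("b", "c"), ("a", "c")], ["a", "b", "c"]))
# [('a', 'b'), ('b', 'c'), ('a', 'c')]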

Any directed graph may be made into a DAG by removing a feedback vertex set or a
feedback arc set. However, the smallest such set is NP-hard to find. An arbitrary directed
graph may also be transformed into a DAG, called its condensation, by contracting each
of its strongly connected components into a single supervertex.[7] When the graph is
already acyclic, its smallest feedback vertex sets and feedback arc sets are empty, and its
condensation is the graph itself.

Enumeration
The graph enumeration problem of counting directed acyclic graphs was studied by
Robinson (1973).[8] The number of DAGs on n labeled nodes, for n = 1, 2, 3, ..., is

1, 3, 25, 543, 29281, 3781503, ... (sequence A003024 in OEIS).

These numbers may be computed by the recurrence relation

a_n = \sum_{k=1}^{n} (-1)^{k+1} \binom{n}{k} 2^{k(n-k)} a_{n-k}, with a_0 = 1.[8]
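
A short script (a sketch using Python's exact integer arithmetic, taking a_0 = 1) reproduces the values listed above from this recurrence:

from math import comb

def count_dags(n):
    # a_0 = 1; a_n = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a_(n-k)
    a = [1]
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]

print([count_dags(n) for n in range(1, 7)])
# [1, 3, 25, 543, 29281, 3781503]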

Eric W. Weisstein conjectured,[9] and McKay et al. (2004) proved,[10] that the same
numbers count the (0,1) matrices in which all eigenvalues are positive real numbers. The
proof is bijective: a matrix A is an adjacency matrix of a DAG if and only if the
eigenvalues of the (0,1) matrix A + I are positive, where I denotes the identity matrix.
