Chapter 6
CONSTRAINT SATISFACTION PROBLEMS
In which we see how treating states as more than just little black boxes leads to
the invention of a range of powerful new search methods and a deeper
understanding of problem structure and complexity.
Section 6.1. Defining Constraint Satisfaction Problems

the domain {A, B}, then the constraint saying the two variables must have different values can be written as ⟨(X1, X2), [(A, B), (B, A)]⟩ or as ⟨(X1, X2), X1 ≠ X2⟩.

To solve a CSP, we need to define a state space and the notion of a solution. Each state in a CSP is defined by an assignment of values to some or all of the variables, {Xi = vi, Xj = vj, ...}. An assignment that does not violate any constraints is called a consistent or legal assignment. A complete assignment is one in which every variable is assigned, and a solution to a CSP is a consistent, complete assignment. A partial assignment is one that assigns values to only some of the variables.

6.1.1 Example problem: Map coloring

Suppose that, having tired of Romania, we are looking at a map of Australia showing each of its states and territories (Figure 6.1(a)). We are given the task of coloring each region either red, green, or blue in such a way that no neighboring regions have the same color. To formulate this as a CSP, we define the variables to be the regions

X = {WA, NT, Q, NSW, V, SA, T} .

The domain of each variable is the set Di = {red, green, blue}. The constraints require neighboring regions to have distinct colors. Since there are nine places where regions border, there are nine constraints:

C = {SA ≠ WA, SA ≠ NT, SA ≠ Q, SA ≠ NSW, SA ≠ V, WA ≠ NT, NT ≠ Q, Q ≠ NSW, NSW ≠ V} .

Here we are using abbreviations; SA ≠ WA is a shortcut for ⟨(SA, WA), SA ≠ WA⟩, where SA ≠ WA can be fully enumerated in turn as

{(red, green), (red, blue), (green, red), (green, blue), (blue, red), (blue, green)} .

There are many possible solutions to this problem, such as

{WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = red} .

It can be helpful to visualize a CSP as a constraint graph, as shown in Figure 6.1(b). The nodes of the graph correspond to variables of the problem, and a link connects any two variables that participate in a constraint.
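As a concrete sketch (our own, not from the text), the formulation above can be written down as plain Python data, with a small checker for the consistency of an assignment; the variable and function names are illustrative:

```python
# Sketch of the Australia map-coloring CSP: variables, domains, and the
# nine "different color" binary constraints listed in the text.
VARIABLES = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]
DOMAINS = {v: {"red", "green", "blue"} for v in VARIABLES}
CONSTRAINTS = [("SA", "WA"), ("SA", "NT"), ("SA", "Q"), ("SA", "NSW"),
               ("SA", "V"), ("WA", "NT"), ("NT", "Q"), ("Q", "NSW"),
               ("NSW", "V")]

def consistent(assignment):
    """True if no constraint between two assigned variables is violated."""
    return all(assignment[x] != assignment[y]
               for x, y in CONSTRAINTS
               if x in assignment and y in assignment)

solution = {"WA": "red", "NT": "green", "Q": "red", "NSW": "green",
            "V": "red", "SA": "blue", "T": "red"}
print(consistent(solution))  # True: the sample solution violates nothing
```

Note that `consistent` also works on partial assignments: it simply ignores constraints whose variables are not yet assigned.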
Why formulate a problem as a CSP? One reason is that CSPs yield a natural representation for a wide variety of problems; if you already have a CSP-solving system, it is often easier to solve a problem using it than to design a custom solution using another search technique. In addition, CSP solvers can be faster than state-space searchers because the CSP solver can quickly eliminate large swaths of the search space. For example, once we have chosen {SA = blue} in the Australia problem, we can conclude that none of the five neighboring variables can take on the value blue. Without taking advantage of constraint propagation, a search procedure would have to consider 3^5 = 243 assignments for the five neighboring variables; with constraint propagation we never have to consider blue as a value, so we have only 2^5 = 32 assignments to look at, a reduction of 87%.
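The counts quoted above are easy to verify:

```python
# Checking the arithmetic: 3^5 assignments without propagation versus
# 2^5 once blue is ruled out for SA's five neighbors.
without_propagation = 3 ** 5
with_propagation = 2 ** 5
reduction = 1 - with_propagation / without_propagation
print(without_propagation, with_propagation, round(reduction * 100))  # 243 32 87
```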
In regular state-space search we can only ask: is this specific state a goal? No? What about this one? With CSPs, once we find out that a partial assignment is not a solution, we can immediately discard further refinements of the partial assignment.
[Figure 6.1(a): A map of Australia showing its states and territories — Western Australia, Northern Territory, South Australia, Queensland, New South Wales, Victoria, and Tasmania.]
A discrete domain can be infinite, such as
the set of integers or strings. (If we didn’t put a
deadline on the job-scheduling problem, there
would be an infinite number of start times for
each variable.) With infinite domains, it is no
longer possible to describe constraints by
enumerating all allowed combinations of values.
Instead, a constraint language must be used that
understands constraints such as T1 + d1 ≤ T2
directly, without enumerating the set of pairs of
allowable values for (T1, T2). Special solution
algorithms (which we do not discuss here) exist
for linear constraints on integer variables—that
is, constraints, such as the one just given, in
which each variable appears only in linear form.
It can be shown that no algorithm exists for
solving general nonlinear constraints on integer
variables.
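A constraint such as T1 + d1 ≤ T2 can be held as a predicate over the variables' values instead of an enumerated relation; a minimal sketch (our own, with an assumed duration value):

```python
# Sketch: with an infinite domain, a constraint is represented intensionally
# as a predicate, not as an enumerated set of allowed value pairs.
d1 = 3  # assumed duration of the first job, for illustration only

def precedence(t1, t2):
    """Job 1 starting at t1 and lasting d1 must finish before job 2 starts."""
    return t1 + d1 <= t2

print(precedence(0, 3), precedence(0, 2))  # True False
```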
Constraint satisfaction problems with continuous domains are common in the real world and are widely studied in the field of operations research. For example, the scheduling of experiments on the Hubble Space Telescope requires very precise timing of observations; the start and finish of each observation and maneuver are continuous-valued variables that must obey a variety of astronomical, precedence, and power constraints. The best-known category of continuous-domain CSPs is that of linear programming problems, where constraints must be linear equalities or inequalities. Linear programming problems can be solved in time polynomial in the number of variables. Problems with different types of constraints and objective functions have also been studied—quadratic programming, second-order conic programming, and so on.

In addition to examining the types of variables that can appear in CSPs, it is useful to look at the types of constraints. The simplest type is the unary constraint, which restricts the value of a single variable. For example, in the map-coloring problem it could be the case that South Australians won't tolerate the color green; we can express that with the unary constraint ⟨(SA), SA ≠ green⟩.

A binary constraint relates two variables. For example, SA ≠ NSW is a binary constraint. A binary CSP is one with only binary constraints; it can be represented as a constraint graph, as in Figure 6.1(b).
We can also describe higher-order constraints, such as asserting that the value of Y is between X and Z, with the ternary constraint Between(X, Y, Z).

A constraint involving an arbitrary number of variables is called a global constraint. (The name is traditional but confusing because it need not involve all the variables in a problem.) One of the most common global constraints is Alldiff, which says that all of the variables involved in the constraint must have different values. In Sudoku problems (see Section 6.2.6), all variables in a row or column must satisfy an Alldiff constraint. Another example is provided by cryptarithmetic puzzles. (See Figure 6.2(a).) Each letter in a cryptarithmetic puzzle represents a different digit. For the case in Figure 6.2(a), this would be represented as the global constraint Alldiff(F, T, U, W, R, O). The addition constraints on the four columns of the puzzle can be written as the following n-ary constraints:
O + O = R + 10 · C10
C10 + W + W = U + 10 · C100
C100 + T + T = O + 10 · C1000
C1000 = F ,

where C10, C100, and C1000 are auxiliary variables representing the digit carried over into the tens, hundreds, or thousands column. These constraints can be represented in a constraint hypergraph, such as the one shown in Figure 6.2(b). A hypergraph consists of ordinary nodes (the circles in the figure) and hypernodes (the squares), which represent n-ary constraints.

Alternatively, as Exercise 6.6 asks you to prove, every finite-domain constraint can be reduced to a set of binary constraints if enough auxiliary variables are introduced, so we could transform any CSP into one with only binary constraints; this makes the algorithms simpler. Another way to convert an n-ary CSP to a binary one is the dual graph transformation: create a new graph in which there will be one variable for each constraint in the original graph, and one binary constraint for each pair of constraints in the original graph that share variables.
[Figure 6.2: (a) the puzzle TWO + TWO = FOUR; (b) its constraint hypergraph over the variables F, T, U, W, R, O and the carries C1, C2, C3.]
Figure 6.2 (a) A cryptarithmetic problem. Each letter stands for a distinct digit; the aim
is to find a substitution of digits for letters such that the resulting sum is arithmetically
correct, with the added restriction that no leading zeroes are allowed. (b) The constraint
hypergraph for the cryptarithmetic problem, showing the Alldiff constraint (square box at
the top) as well as the column addition constraints (four square boxes in the middle). The
variables C1, C2, and C3 represent the carry digits for the three columns.
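The puzzle is small enough to solve by brute force directly from its constraints; a sketch (our own code, not the book's method) that enforces Alldiff, the no-leading-zero rule, and the addition:

```python
# Brute-force sketch for TWO + TWO = FOUR: try every assignment of six
# distinct digits to F, T, U, W, R, O and test the arithmetic constraint.
from itertools import permutations

def solve_two_two_four():
    for f, t, u, w, r, o in permutations(range(10), 6):  # Alldiff(F,T,U,W,R,O)
        if t == 0 or f == 0:                             # no leading zeroes
            continue
        two = 100 * t + 10 * w + o
        four = 1000 * f + 100 * o + 10 * u + r
        if two + two == four:
            return {"F": f, "T": t, "U": u, "W": w, "R": r, "O": o}

sol = solve_two_two_four()
print(sol)
```

The puzzle has more than one solution; this sketch returns the first one found.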
In regular state-space search, an algorithm can do only one thing: search. In CSPs there is a choice: an algorithm can search (choose a new variable assignment from several possibilities) or do a specific type of inference called constraint propagation: using the constraints to reduce the number of legal values for a variable, which in turn can reduce the legal values for another variable, and so on. Constraint propagation may be intertwined with search, or it may be done as a preprocessing step, before search starts. Sometimes this preprocessing can solve the whole problem, so no search is required at all.

The key idea is local consistency. If we treat each variable as a node in a graph (see Figure 6.1(b)) and each binary constraint as an arc, then the process of enforcing local consistency in each part of the graph causes inconsistent values to be eliminated throughout the graph.

6.2.2 Arc consistency

A variable in a CSP is arc-consistent if every value in its domain satisfies the variable's binary constraints. More formally, Xi is arc-consistent with respect to another variable Xj if for every value in the current domain Di there is some value in the domain Dj that satisfies the binary constraint on the arc (Xi, Xj). A network is arc-consistent if every variable is arc-consistent with every other variable. For example, consider the constraint Y = X² where the domain of both X and Y is the set of digits. We can write this constraint explicitly as

⟨(X, Y), {(0, 0), (1, 1), (2, 4), (3, 9)}⟩ .

To make X arc-consistent with respect to Y, we reduce X's domain to {0, 1, 2, 3}. If we also make Y arc-consistent with respect to X, then Y's domain becomes {0, 1, 4, 9} and the whole CSP is arc-consistent.

On the other hand, arc consistency can do nothing for the Australia map-coloring problem. Consider the following inequality constraint on (SA, WA):

{(red, green), (red, blue), (green, red), (green, blue), (blue, red), (blue, green)} .
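The Y = X² example can be replayed mechanically; this sketch (our own) applies the domain-reduction step in each direction:

```python
# Sketch of making X and Y arc-consistent under Y = X^2, with the relation
# written out explicitly as in the text.
RELATION = {(0, 0), (1, 1), (2, 4), (3, 9)}  # allowed (X, Y) pairs
dom_x = set(range(10))
dom_y = set(range(10))

# Keep only values of one variable that some value of the other supports.
dom_x = {x for x in dom_x if any((x, y) in RELATION for y in dom_y)}
dom_y = {y for y in dom_y if any((x, y) in RELATION for x in dom_x)}
print(sorted(dom_x), sorted(dom_y))  # [0, 1, 2, 3] [0, 1, 4, 9]
```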
Section 6.2. Constraint Propagation: Inference in CSPs
function AC-3(csp) returns false if an inconsistency is found and true otherwise
  inputs: csp, a binary CSP with components (X, D, C)
  local variables: queue, a queue of arcs, initially all the arcs in csp
  while queue is not empty do
    (Xi, Xj) ← REMOVE-FIRST(queue)
    if REVISE(csp, Xi, Xj) then
      if size of Di = 0 then return false
      for each Xk in Xi.NEIGHBORS - {Xj} do
        add (Xk, Xi) to queue
  return true

function REVISE(csp, Xi, Xj) returns true iff we revise the domain of Xi
  revised ← false
  for each x in Di do
    if no value y in Dj allows (x, y) to satisfy the constraint between Xi and Xj then
      delete x from Di
      revised ← true
  return revised
Figure 6.3 The arc-consistency algorithm AC-3. After applying AC-3, either every
arc is arc-consistent, or some variable has an empty domain, indicating that the CSP
cannot be solved. The name “AC-3” was used by the algorithm’s inventor (Mackworth,
1977) because it’s the third version developed in the paper.
No matter what value you choose for SA (or for WA), there is a valid value for the other
variable. So applying arc consistency has no effect on the domains of either variable.
The most popular algorithm for arc consistency is called AC-3 (see Figure 6.3). To
make every variable arc-consistent, the AC-3 algorithm maintains a queue of arcs to
consider. (Actually, the order of consideration is not important, so the data structure is
really a set, but tradition calls it a queue.) Initially, the queue contains all the arcs in the
CSP. AC-3 then pops off an arbitrary arc (Xi, Xj) from the queue and makes Xi arc-
consistent with respect to Xj. If this leaves Di unchanged, the algorithm just moves on to the
next arc. But if this revises Di (makes the domain smaller), then we add to the queue all
arcs (Xk, Xi) where Xk is a neighbor of Xi. We need to do that because the change in Di
might enable further reductions in the domains of Dk, even if we have previously
considered Xk. If Di is revised down to nothing, then we know the whole CSP has no
consistent solution, and AC-3 can immediately return failure. Otherwise, we keep
checking, trying to remove values from the domains of variables until no more arcs are in
the queue. At that point, we are left with a CSP that is equivalent to the original CSP—they
both have the same solutions—but the arc-consistent CSP will in most cases be faster to
search because its variables have smaller domains.
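A direct Python transcription of the pseudocode above might look as follows; this is a sketch assuming constraints are given as predicates indexed by arc, with names of our own choosing:

```python
# Sketch of AC-3 for binary CSPs whose constraints are predicates.
from collections import deque

def revise(domains, constraints, xi, xj):
    """Delete values of xi that no value of xj supports; True if changed."""
    pred = constraints[(xi, xj)]
    removed = {x for x in domains[xi]
               if not any(pred(x, y) for y in domains[xj])}
    domains[xi] -= removed
    return bool(removed)

def ac3(domains, constraints, neighbors):
    queue = deque(constraints)               # initially all arcs
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, constraints, xi, xj):
            if not domains[xi]:
                return False                 # a domain became empty: no solution
            for xk in neighbors[xi] - {xj}:
                queue.append((xk, xi))       # re-examine arcs into xi
    return True

# Tiny usage example: the constraint X < Y with digits 0..9 at both ends.
domains = {"X": set(range(10)), "Y": set(range(10))}
constraints = {("X", "Y"): lambda x, y: x < y,
               ("Y", "X"): lambda y, x: x < y}
neighbors = {"X": {"Y"}, "Y": {"X"}}
print(ac3(domains, constraints, neighbors))  # True; X keeps 0..8, Y keeps 1..9
```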
The complexity of AC-3 can be analyzed as follows. Assume a CSP with n variables, each with domain size at most d, and with c binary constraints (arcs). Each arc (Xk, Xi) can be inserted in the queue only d times because Xi has at most d values to delete. Checking consistency of an arc can be done in O(d²) time, so we get O(cd³) total worst-case time.¹

It is possible to extend the notion of arc consistency to handle n-ary rather than just binary constraints; this is called generalized arc consistency or sometimes hyperarc consistency, depending on the author. A variable Xi is generalized arc consistent with respect to an n-ary constraint if for every value v in the domain of Xi there exists a tuple of values that is a member of the constraint, has all its values taken from the domains of the corresponding variables, and has its Xi component equal to v. For example, if all variables have the domain {0, 1, 2, 3}, then to make the variable X consistent with the constraint X < Y < Z, we would have to eliminate 2 and 3 from the domain of X because the constraint cannot be satisfied when X is 2 or 3.

6.2.3 Path consistency

Arc consistency can go a long way toward reducing the domains of variables, sometimes finding a solution (by reducing every domain to size 1) and sometimes finding that the CSP cannot be solved (by reducing some domain to size 0). But for other networks, arc consistency fails to make enough inferences. Consider the map-coloring problem on Australia, but with only two colors allowed, red and blue. Arc consistency can do nothing because every variable is already arc-consistent: each can be red with blue at the other end of the arc (or vice versa). But clearly there is no solution to the problem: because Western Australia, Northern Territory, and South Australia all touch each other, we need at least three colors for them alone.

Arc consistency tightens down the domains (unary constraints) using the arcs (binary constraints). To make progress on problems like map coloring, we need a stronger notion of consistency. Path consistency tightens the binary constraints by using implicit constraints that are inferred by looking at triples of variables.

A two-variable set {Xi, Xj} is path-consistent with respect to a third variable Xm if, for every assignment {Xi = a, Xj = b} consistent with the constraints on {Xi, Xj}, there is an assignment to Xm that satisfies the constraints on {Xi, Xm} and {Xm, Xj}. This is called path consistency because one can think of it as looking at a path from Xi to Xj with Xm in the middle.

Let's see how path consistency fares in coloring the Australia map with two colors. We will make the set {WA, SA} path-consistent with respect to NT. We start by enumerating the consistent assignments to the set. In this case, there are only two: {WA = red, SA = blue} and {WA = blue, SA = red}. We can see that with both of these assignments NT can be neither red nor blue (because it would conflict with either WA or SA). Because there is no valid choice for NT, we eliminate both assignments, and we end up with no valid assignments for {WA, SA}. Therefore, we know that there can be no solution to this problem. The PC-2 algorithm (Mackworth, 1977) achieves path consistency in much the same way that AC-3 achieves arc consistency. Because it is so similar, we do not show it here.
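The two-color argument above can be checked by brute force (a sketch, not the PC-2 algorithm):

```python
# Making {WA, SA} path-consistent with respect to NT under two colors:
# every consistent pairwise assignment is eliminated, proving unsolvability.
colors = {"red", "blue"}
pairs = [(wa, sa) for wa in colors for sa in colors if wa != sa]
surviving = [(wa, sa) for wa, sa in pairs
             if any(nt != wa and nt != sa for nt in colors)]
print(len(pairs), surviving)  # 2 consistent pairs to start; none survive: 2 []
```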
¹ The AC-4 algorithm (Mohr and Henderson, 1986) runs in O(cd²) worst-case time but can be slower than AC-3 on average cases. See Exercise 6.13.
6.2.4 K-consistency

Stronger forms of propagation can be defined with the notion of k-consistency. A CSP is k-consistent if, for any set of k − 1 variables and for any consistent assignment to those variables, a consistent value can always be assigned to any kth variable. 1-consistency says that, given the empty set, we can make any set of one variable consistent: this is what we called node consistency. 2-consistency is the same as arc consistency. For binary constraint networks, 3-consistency is the same as path consistency.

A CSP is strongly k-consistent if it is k-consistent and is also (k − 1)-consistent, (k − 2)-consistent, ... all the way down to 1-consistent. Now suppose we have a CSP with n nodes and make it strongly n-consistent (i.e., strongly k-consistent for k = n). We can then solve the problem as follows: First, we choose a consistent value for X1. We are then guaranteed to be able to choose a value for X2 because the graph is 2-consistent, for X3 because it is 3-consistent, and so on. For each variable Xi, we need only search through the d values in the domain to find a value consistent with X1, ..., Xi−1. We are guaranteed to find a solution in time O(n²d). Of course, there is no free lunch: any algorithm for establishing n-consistency must take time exponential in n in the worst case. Worse, n-consistency also requires space that is exponential in n. The memory issue is even more severe than the time.

In practice, determining the appropriate level of consistency checking is mostly an empirical science. Computing 2-consistency is common, and 3-consistency less common.
[Figure 6.4: (a) a Sudoku puzzle and (b) its solution.]

Alldiff constraints: one for each row, column, and box of 9 squares.

Alldiff(A1, A2, A3, A4, A5, A6, A7, A8, A9)
Alldiff(B1, B2, B3, B4, B5, B6, B7, B8, B9)
···
Alldiff(A1, B1, C1, D1, E1, F1, G1, H1, I1)
Alldiff(A2, B2, C2, D2, E2, F2, G2, H2, I2)
···
Alldiff(A1, A2, A3, B1, B2, B3, C1, C2, C3)
Alldiff(A4, A5, A6, B4, B5, B6, C4, C5, C6)
···
Let us see how far arc consistency can take us. Assume that the Alldiff constraints have
been expanded into binary constraints (such as A1 /= A2 ) so that we can apply the AC-3
algorithm directly. Consider variable E6 from Figure 6.4(a)—the empty square between the
2 and the 8 in the middle box. From the constraints in the box, we can remove not only 2
and 8 but also 1 and 7 from E6 ’s domain. From the constraints in its column, we can
eliminate 5, 6, 2, 8, 9, and 3. That leaves E6 with a domain of {4}; in other words, we
know the answer for E6 . Now consider variable I6 —the square in the bottom middle box
surrounded by 1, 3, and 3. Applying arc consistency in its column, we eliminate 5, 6, 2, 4
(since we now know E6 must be 4), 8, 9, and 3. We eliminate 1 by arc consistency with I5 ,
and we are left with only the value 7 in the domain of I6 . Now there are 8 known values in
column 6, so arc consistency can infer that A6 must be 1. Inference continues along these
lines, and eventually, AC-3 can solve the entire puzzle—all the variables have their
domains reduced to a single value, as shown in Figure 6.4(b).
Of course, Sudoku would soon lose its appeal if every puzzle could be solved by a
mechanical application of AC-3, and indeed AC-3 works only for the easiest Sudoku
puzzles. Slightly harder ones can be solved by PC-2, but at a greater computational cost:
there are 255,960 different path constraints to consider in a Sudoku puzzle. To solve the
hardest puzzles and to make efficient progress, we will have to be more clever.
Indeed, the appeal of Sudoku puzzles for the human solver is the need to be
resourceful in applying more complex inference strategies. Aficionados give them colorful
names, such as “naked triples.” That strategy works as follows: in any unit (row, column or
box), find three squares that each have a domain that contains the same three numbers or a
subset of those numbers. For example, the three domains might be {1, 8}, {3, 8}, and {1,
3, 8}. From that we don’t know which square contains 1, 3, or 8, but we do know that the
three numbers must be distributed among the three squares. Therefore we can remove 1, 3,
and 8 from the domains of every other square in the unit.
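The rule can be sketched for a single unit as follows; the square names and domains are illustrative, not taken from a particular puzzle:

```python
# Sketch of the "naked triples" rule for one unit (row, column, or box):
# if three squares' domains all fit inside the same three-value set,
# delete those values from every other square in the unit.
from itertools import combinations

def apply_naked_triples(unit_domains):
    """unit_domains: dict square -> set of candidate digits; pruned in place."""
    for trio in combinations(unit_domains, 3):
        union = set().union(*(unit_domains[s] for s in trio))
        if len(union) == 3:                       # found a naked triple
            for square, dom in unit_domains.items():
                if square not in trio:
                    dom -= union
    return unit_domains

unit = {"s1": {1, 8}, "s2": {3, 8}, "s3": {1, 3, 8},
        "s4": {1, 2, 3, 4}, "s5": {5, 8, 9}}
apply_naked_triples(unit)
print(unit["s4"], unit["s5"])  # 1, 3, 8 removed: {2, 4} and {5, 9}
```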
It is interesting to note how far we can go without saying much that is specific to Sudoku. We do of course have to say that there are 81 variables, that their domains are the digits 1 to 9, and that there are 27 Alldiff constraints. But beyond that, all the strategies—arc consistency, path consistency, etc.—apply generally to all CSPs, not just to Sudoku problems. Even naked triples is really a strategy for enforcing consistency of Alldiff constraints and has nothing to do with Sudoku per se. This is the power of the CSP formalism: for each new problem area, we only need to define the problem in terms of constraints; then the general constraint-solving mechanisms can take over.
Figure 6.6 Part of the search tree for the map-coloring problem in Figure 6.1.
Section 6.3. Backtracking Search for CSPs

1. Which variable should be assigned next (SELECT-UNASSIGNED-VARIABLE), and in what order should its values be tried (ORDER-DOMAIN-VALUES)?
2. What inferences should be performed at each step in the search (INFERENCE)?
3. When the search arrives at an assignment that violates a constraint, can the search avoid repeating this failure?

The subsections that follow answer each of these questions in turn.

6.3.1 Variable and value ordering

The backtracking algorithm contains the line

var ← SELECT-UNASSIGNED-VARIABLE(csp) .

The simplest strategy for SELECT-UNASSIGNED-VARIABLE is to choose the next unassigned variable in order, {X1, X2, ...}. This static variable ordering seldom results in the most efficient search. For example, after the assignments for WA = red and NT = green in Figure 6.6, there is only one possible value for SA, so it makes sense to assign SA = blue next rather than assigning Q. In fact, after SA is assigned, the choices for Q, NSW, and V are all forced. This intuitive idea—choosing the variable with the fewest "legal" values—is called the minimum-remaining-values (MRV) heuristic. It also has been called the "most constrained variable" or "fail-first" heuristic, the latter because it picks a variable that is most likely to cause a failure soon, thereby pruning the search tree. If some variable X has no legal values left, the MRV heuristic will select X and failure will be detected immediately—avoiding pointless searches through other variables. The MRV heuristic usually performs better than a random or static ordering, sometimes by a factor of 1,000 or more, although the results vary widely depending on the problem.

The MRV heuristic doesn't help at all in choosing the first region to color in Australia, because initially every region has three legal colors. In this case, the degree heuristic comes in handy. It attempts to reduce the branching factor on future choices by selecting the variable that is involved in the largest number of constraints on other unassigned variables. In Figure 6.1, SA is the variable with highest degree, 5; the other variables have degree 2 or 3, except for T, which has degree 0. In fact, once SA is chosen, applying the degree heuristic solves the problem without any false steps: you can choose any consistent color at each choice point and still arrive at a solution with no backtracking. The minimum-remaining-values heuristic is usually a more powerful guide, but the degree heuristic can be useful as a tie-breaker.

Once a variable has been selected, the algorithm must decide on the order in which to examine its values. The least-constraining-value heuristic prefers the value that rules out the fewest choices for the neighboring variables in the constraint graph. Of course, if we are trying to find all the solutions to a problem, not just the first one, then the ordering does not matter, because we have to consider every value anyway. The same holds if there are no solutions to the problem.

Why should variable selection be fail-first, but value selection be fail-last? It turns out that, for a wide variety of problems, a variable ordering that chooses a variable with the minimum number of remaining values helps minimize the number of nodes in the search tree by pruning larger parts of the tree earlier. For value ordering, the trick is that we only need one solution; therefore it makes sense to look for the most likely values first. If we wanted to enumerate all solutions rather than just find one, then value ordering would be irrelevant.
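The two variable-ordering heuristics can be combined in a sketch of SELECT-UNASSIGNED-VARIABLE (our own code; ties on MRV are broken by degree):

```python
# Sketch: pick the unassigned variable with the fewest remaining values (MRV),
# breaking ties by the degree heuristic (most constraints on unassigned vars).
NEIGHBORS = {"WA": {"NT", "SA"}, "NT": {"WA", "SA", "Q"},
             "Q": {"NT", "SA", "NSW"}, "NSW": {"Q", "SA", "V"},
             "V": {"SA", "NSW"}, "SA": {"WA", "NT", "Q", "NSW", "V"},
             "T": set()}

def select_unassigned_variable(domains, assignment):
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned,
               key=lambda v: (len(domains[v]),                          # MRV
                              -len(NEIGHBORS[v] - assignment.keys())))  # degree

domains = {v: {"red", "green", "blue"} for v in NEIGHBORS}
print(select_unassigned_variable(domains, {}))  # SA: MRV ties, highest degree
```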
[Figure 6.7: the progress of a map-coloring search with forward checking. WA = red is assigned first; then forward checking deletes red from the domains of the neighboring variables NT and SA. After Q = green is assigned, green is deleted from the domains of NT, SA, and NSW. After V = blue is assigned, blue is deleted from the domains of NSW and SA, leaving SA with no legal values.]
Figure 6.8 The MIN-CONFLICTS algorithm for solving CSPs by local search. The initial state may be chosen randomly or by a greedy assignment process that chooses a minimal-conflict value for each variable in turn. The CONFLICTS function counts the number of constraints violated by a particular value, given the rest of the current assignment.

Figure 6.9 A two-step solution using min-conflicts for an 8-queens problem. At each stage, a queen is chosen for reassignment in its column. The number of conflicts (in this case, the number of attacking queens) is shown in each square. The algorithm moves the queen to the min-conflicts square, breaking ties randomly.
heuristic. The algorithm is shown in Figure 6.8 and its application to an 8-queens problem
is diagrammed in Figure 6.9.
Min-conflicts is surprisingly effective for many CSPs. Amazingly, on the n-queens
problem, if you don’t count the initial placement of queens, the run time of min-conflicts is
roughly independent of problem size. It solves even the million-queens problem in an average of 50 steps (after the initial assignment). This remarkable observation was the stimulus
leading to a great deal of research in the 1990s on local search and the distinction between
easy and hard problems, which we take up in Chapter 7. Roughly speaking, n-queens is
easy for local search because solutions are densely distributed throughout the state space.
Min-conflicts also works well for hard problems. For example, it has been used to schedule
observations for the Hubble Space Telescope, reducing the time taken to schedule a week
of observations from three weeks (!) to around 10 minutes.
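A compact sketch of the min-conflicts loop for n-queens follows; this is our own rendering of the heuristic described above, not the book's exact Figure 6.8:

```python
# Min-conflicts sketch for n-queens: one queen per column; the value of
# column c is the queen's row. Repeatedly pick a conflicted queen and move
# it to the row minimizing conflicts, breaking ties randomly.
import random

def conflicts(state, col, row):
    """Number of queens attacking a queen placed at (col, row)."""
    return sum(1 for c, r in enumerate(state)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts(n, max_steps=10000, seed=0):
    rng = random.Random(seed)
    state = [rng.randrange(n) for _ in range(n)]      # random initial placement
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(state, c, state[c]) > 0]
        if not conflicted:
            return state                              # solution found
        col = rng.choice(conflicted)                  # pick a conflicted queen
        counts = [conflicts(state, col, row) for row in range(n)]
        best = min(counts)
        state[col] = rng.choice([r for r in range(n) if counts[r] == best])
    return None

print(min_conflicts(8))
```

On 8-queens this typically succeeds in well under the step budget, in line with the observation above that run time is roughly independent of problem size.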
Section 6.5. The Structure of Problems
All the local search techniques from Section 4.1 are candidates for application to CSPs, and some of those have proved especially effective. The landscape of a CSP under the min-conflicts heuristic usually has a series of plateaux. There may be millions of variable assignments that are only one conflict away from a solution. Plateau search—allowing sideways moves to another state with the same score—can help local search find its way off this plateau. This wandering on the plateau can be directed with tabu search: keeping a small list of recently visited states and forbidding the algorithm to return to those states. Simulated annealing can also be used to escape from plateaux.

Another technique, called constraint weighting, can help concentrate the search on the important constraints. Each constraint is given a numeric weight, Wi, initially all 1. At each step of the search, the algorithm chooses a variable/value pair to change that will result in the lowest total weight of all violated constraints. The weights are then adjusted by incrementing the weight of each constraint that is violated by the current assignment. This has two benefits: it adds topography to plateaux, making sure that it is possible to improve from the current state, and it also, over time, adds weight to the constraints that are proving difficult to solve.

Another advantage of local search is that it can be used in an online setting when the problem changes. This is particularly important in scheduling problems. A week's airline schedule may involve thousands of flights and tens of thousands of personnel assignments, but bad weather at one airport can render the schedule infeasible. We would like to repair the schedule with a minimum number of changes. This can be easily done with a local search algorithm starting from the current schedule. A backtracking search with the new set of constraints usually requires much more time and might find a solution with many changes from the current schedule.
Figure 6.12 (a) The original constraint graph from Figure 6.1. (b) The constraint
graph after the removal of SA.
deleting from the domains of the other variables any values that are inconsistent with the
value chosen for SA.
Now, any solution for the CSP after SA and its constraints are removed will be consistent with the value chosen for SA. (This works for binary CSPs; the situation is more
complicated with higher-order constraints.) Therefore, we can solve the remaining tree with
the algorithm given above and thus solve the whole problem. Of course, in the general case
(as opposed to map coloring), the value chosen for SA could be the wrong one, so we would
need to try each possible value. The general algorithm is as follows:
1. Choose a subset S of the CSP's variables such that the constraint graph becomes a tree after removal of S. S is called a cycle cutset.
2. For each possible assignment to the variables in S that satisfies all constraints on S,
   (a) remove from the domains of the remaining variables any values that are inconsistent with the assignment for S, and
   (b) if the remaining CSP has a solution, return it together with the assignment for S.

If the cycle cutset has size c, then the total run time is O(d^c · (n − c)d²): we have to try each of the d^c combinations of values for the variables in S, and for each combination we must solve a tree problem of size n − c. If the graph is "nearly a tree," then c will be small and the savings over straight backtracking will be huge. In the worst case, however, c can be as large as (n − 2). Finding the smallest cycle cutset is NP-hard, but several efficient approximation algorithms are known. The overall algorithmic approach is called cutset conditioning; it comes up again in Chapter 14, where it is used for reasoning about probabilities.

The second approach is based on constructing a tree decomposition of the constraint graph into a set of connected subproblems. Each subproblem is solved independently, and the resulting solutions are then combined. Like most divide-and-conquer algorithms, this works well if no subproblem is too large. Figure 6.13 shows a tree decomposition of the map-coloring problem into five subproblems. A tree decomposition must satisfy the following three requirements:

• Every variable in the original problem appears in at least one of the subproblems.
• If two variables are connected by a constraint in the original problem, they must appear together (along with the constraint) in at least one of the subproblems.
• If a variable appears in two subproblems in the tree, it must appear in every subproblem along the path connecting those subproblems.

The first two conditions ensure that all the variables and constraints are represented in the decomposition. The third condition seems rather technical, but simply reflects the constraint that any given variable must have the same value in every subproblem in which it appears; the links joining subproblems in the tree enforce this constraint. For example, SA appears in all four of the connected subproblems in Figure 6.13. You can verify from Figure 6.12 that this decomposition makes sense.
We solve each subproblem independently; if any one has no solution, we know the entire problem has no solution. If we can solve all the subproblems, then we attempt to construct a global solution as follows. First, we view each subproblem as a "mega-variable" whose domain is the set of all solutions for the subproblem. For example, the leftmost subproblem in Figure 6.13 is a map-coloring problem with three variables and hence has six solutions—one is {WA = red, SA = blue, NT = green}. Then, we solve the constraints connecting the subproblems, using the efficient algorithm for trees given earlier. The constraints between subproblems simply insist that the subproblem solutions agree on their shared variables. For example, given the solution {WA = red, SA = blue, NT = green} for the first subproblem, the only consistent solution for the next subproblem is {SA = blue, NT = green, Q = red}. A given constraint graph admits many tree decompositions; in choosing a decomposition, the aim is to make the subproblems as small as possible.

[Figure 6.13: a tree decomposition of the map-coloring constraint graph into connected subproblems that share SA.]

The tree width of a tree decomposition of a graph is one less than the size of the largest subproblem; the tree
width of the graph itself is defined to be the minimum tree width among all its tree
decompositions. If a graph has tree width w and we are given the corresponding tree
decomposition, then the problem can be solved in O(ndw+1) time. Hence, CSPs with
constraint graphs of bounded tree width are solvable in polynomial time. Unfortunately,
finding the decomposition with minimal tree width is NP-hard, but there are heuristic
methods that work well in practice.
So far, we have looked at the structure of the constraint graph. There can be
important structure in the values of variables as well. Consider the map-coloring
problem with n colors. For every consistent solution, there is actually a set of n!
solutions formed by permuting the color names. For example, on the Australia map we
know that WA, NT , and SA must all have different colors, but there are 3! = 6 ways to
assign the three colors to these three regions. This is called value symmetry. We would
like to reduce the search space by a factor of n! by breaking the symmetry. We do this
by introducing a symmetry-breaking constraint. For our example, we might impose
an arbitrary ordering constraint, NT < SA < WA, that requires the three values to be in
alphabetical order. This constraint ensures that only one of the n! solutions is possible:
{NT = blue, SA = green, WA = red }.
For map coloring, it was easy to find a constraint that eliminates the symmetry,
and in general it is possible to find constraints that eliminate all but one symmetric
solution in polynomial time, but it is NP-hard to eliminate all symmetry among
intermediate sets of values during search. In practice, breaking value symmetry has
proved to be important and effective on a wide range of problems.
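The 3! argument above can be confirmed by exhaustive counting (our own sketch; note that blue < green < red alphabetically, so the ordering constraint can be tested with string comparison):

```python
# Sketch: count Australia 3-colorings with and without the symmetry-breaking
# constraint NT < SA < WA (alphabetical order on color names).
from itertools import product

COLORS = ["blue", "green", "red"]
REGIONS = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]
EDGES = [("SA", "WA"), ("SA", "NT"), ("SA", "Q"), ("SA", "NSW"),
         ("SA", "V"), ("WA", "NT"), ("NT", "Q"), ("Q", "NSW"), ("NSW", "V")]

def solutions(symmetry_breaking=False):
    count = 0
    for values in product(COLORS, repeat=len(REGIONS)):
        a = dict(zip(REGIONS, values))
        if any(a[x] == a[y] for x, y in EDGES):
            continue                                  # a constraint is violated
        if symmetry_breaking and not (a["NT"] < a["SA"] < a["WA"]):
            continue                                  # ordering constraint
        count += 1
    return count

print(solutions(), solutions(symmetry_breaking=True))  # 18 3
```

Since WA, NT, and SA mutually touch, every solution assigns them three distinct colors, so each class of 3! = 6 symmetric solutions contains exactly one member satisfying the ordering constraint.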