Lecture 6
Contents
1 Introduction
2 Arc Types
4 Examples
  4.1 Doctor Example
  4.2 Machinery Example
  4.3 Coin Toss Example
    4.3.1 Formulation
  4.4 Oil Example
    4.4.1 Formulation
    4.4.2 Solution
1 Introduction
Here we illustrate the solution of Influence Diagrams without resorting to decision trees. The development is not theoretical, although there is plenty of theory to support it; it is intuitive, and we should not have to take any steps that appear illogical. The approach is to:
• Repeatedly remove nodes, one at a time, using common logic and arc reversals, i.e. Bayes' Rule.
• Stop when only a single node remains, showing the present expected utility of the optimal policy.
2 Arc Types
Previously we listed 3 types of nodes (decision, event and value nodes). For completeness, we also list the 4 types of arcs; a short sketch of this taxonomy in code follows the list.
1. Functional arcs (from a chance or decision node to the value node) indicate that the
utility function is a function of all the from nodes.
2. Conditional arcs (from a chance or decision node to a chance node) indicate that the
chance node is probabilistically conditioned on all the from nodes.
3. Informational arcs (from a chance node to a decision node) indicate that the decision
is made with the outcome of all the from chance nodes known beforehand.
4. No-forgetting arcs (from a decision node to a decision node) indicate that the decision
is made with the outcome of all the from decision nodes known beforehand. These are
often not explicitly shown, as all decisions are assumed to know the outcome of earlier
decisions.
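To make the taxonomy concrete, here is a minimal sketch (in Python; the representation and names are my own, not part of the formal development) showing that the arc type is fully determined by the types of its endpoint nodes:

from enum import Enum

class NodeType(Enum):
    CHANCE = "chance"      # event node
    DECISION = "decision"
    VALUE = "value"

def arc_type(src: NodeType, dst: NodeType) -> str:
    """Classify an arc by its endpoint node types (rules 1-4 above)."""
    if dst is NodeType.VALUE:
        return "functional"      # value is a function of the from node
    if dst is NodeType.CHANCE:
        return "conditional"     # chance node is conditioned on the from node
    # dst is a decision node: the from node's outcome is known when deciding
    return "informational" if src is NodeType.CHANCE else "no-forgetting"

print(arc_type(NodeType.CHANCE, NodeType.DECISION))  # informational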
The algorithm is due to R.D. Shachter (1986), and the steps below are based on the notes by Professor Kriwaczek at Imperial College (both sources are posted under references at the course website).
As long as there are one or more nodes that point into the value node, remove nodes as follows:
1. Barren Nodes: Eliminate all barren nodes (except the value node) that do not point
to another node. (None of the examples in these notes include such nodes.)
2. Policy Determination: If there is a decision node that points into the value node,
and if all other nodes that point into the value node also point into that decision node,
remove the decision node by selecting the decision that maximizes the expected utility.
3. Averaging: If there is a chance node that points into only the value node, remove it
by averaging.
4. Arc Reversal: If a chance node points into the value node and into chance node(s) but not into any decision node, reverse one or more arcs pointing from that chance node (without creating a cycle). In some complex situations, beyond this course, new arcs may need to be added.
After a node of any type is removed, draw arcs from its parents to the value node. Most importantly, any change to the diagram requires appropriate updating of the node tables, as we will see below.
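As a structural illustration, the following sketch (in Python; the dict-based graph and all names are my own, and the table bookkeeping is entirely omitted) applies the four rules to the Doctor ID of the next section and prints the removal order that the worked solution follows:

# Node-removal loop of Shachter's algorithm, structure only (no tables).
# The graph is the Doctor ID of Figure 2 below.
kind = {"Disease": "chance", "Symptom": "chance",
        "Treatment": "decision", "Value": "value"}
children = {"Disease": {"Symptom", "Value"}, "Symptom": {"Treatment"},
            "Treatment": {"Value"}, "Value": set()}

def parents(n):
    return {m for m, cs in children.items() if n in cs}

def remove(n):
    for p in parents(n):                 # parents now point into the value node
        children[p].discard(n)
        children[p].add("Value")
    del children[n], kind[n]

while parents("Value"):
    for n in list(children):
        if kind[n] != "value" and not children[n]:               # 1. barren
            del children[n], kind[n]; print("Barren:", n); break
        if kind[n] == "decision" and "Value" in children[n] \
           and all(n in children[m] for m in parents("Value") - {n}):
            remove(n); print("Policy Determination:", n); break  # 2.
        if kind[n] == "chance" and children[n] == {"Value"}:     # 3. averaging
            remove(n); print("Averaging:", n); break
        if kind[n] == "chance" and "Value" in children[n] \
           and all(kind[c] == "chance" for c in children[n] - {"Value"}):
            c = next(iter(children[n] - {"Value"}))              # 4. arc reversal
            children[n].discard(c); children[c].add(n)           # (acyclicity check omitted)
            print("Arc Reversal:", n, "->", c); break
    else:
        break                            # no rule applies; stop

Running this prints Arc Reversal (Disease -> Symptom), then Averaging (Disease), Policy Determination (Treatment) and Averaging (Symptom), which is exactly the order followed by hand below.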
4 Examples
4.1 Doctor Example
This problem concerns deciding on a treatment under different symptom scenarios. The ID, including the data tables, is shown in Figure 2.
Disease            Symptom                Cost
D  P(D)            D  S      P(S|D)       D  T   C(D,T)
1  0.5             1  a      0.9          1  d1  100
2  0.5             1  not a  0.1          2  d1  200
                   2  a      0.2          1  d2  400
                   2  not a  0.8          2  d2  50

Treatment options: {d1, d2}

Figure 2: Doctor ID
There are clearly no Barren Nodes to remove, and neither Policy Determination nor Averaging can be used to remove nodes. Arc Reversal can reverse the arc between the Disease and Symptom nodes, after which Averaging can be used to remove the Disease node.
Figure 3 shows the Arc Reversal, which requires the Symptom and Disease tables to be updated according to Bayes' Rule. For example,

P(1|a) = P(a|1)P(1) / [P(a|1)P(1) + P(a|2)P(2)] = (0.9)(0.5) / [(0.9)(0.5) + (0.2)(0.5)] = 0.45/0.55 = 9/11
This also yields P (a) = 0.55.
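As a quick check of the reversal, a few lines of Python (the dictionary names are mine) reproduce the new tables:

# Arc reversal for the Doctor ID: Bayes' Rule applied to the Figure 2 tables.
p_D = {1: 0.5, 2: 0.5}                                   # prior P(D)
p_S_given_D = {(1, "a"): 0.9, (1, "not a"): 0.1,
               (2, "a"): 0.2, (2, "not a"): 0.8}
p_S = {s: sum(p_S_given_D[(d, s)] * p_D[d] for d in p_D)
       for s in ("a", "not a")}
p_D_given_S = {(d, s): p_S_given_D[(d, s)] * p_D[d] / p_S[s]
               for d in p_D for s in p_S}
print(p_S)                      # {'a': 0.55, 'not a': 0.45}
print(p_D_given_S[(1, "a")])    # 0.8181... = 9/11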
Symptom            Cost
S      P(S)        S      T   C(S,T)
a      0.55        a      d1  118.18
not a  0.45        a      d2  336.36
                   not a  d1  188.89
                   not a  d2  88.89

Treatment options: {d1, d2}

Figures 3 and 4: Updated Symptom table after the Arc Reversal (Figure 3) and Cost table after removing the Disease node (Figure 4)
Figure 4 removes the Disease node, which requires C(D,T) to be replaced by C(S,T). This is done by averaging out, so that C(S,T) = C(1,T)P(1|S) + C(2,T)P(2|S), e.g. C(a,d1) = 100(9/11) + 200(2/11) = 118.18.
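The whole C(S,T) table can be checked the same way (a short sketch; the posterior values are those derived above):

# Averaging out the Disease node: C(S,T) = C(1,T)P(1|S) + C(2,T)P(2|S).
C = {(1, "d1"): 100, (2, "d1"): 200, (1, "d2"): 400, (2, "d2"): 50}
p_D_given_S = {("a", 1): 9/11, ("a", 2): 2/11,
               ("not a", 1): 1/9, ("not a", 2): 8/9}
C_S_T = {(s, t): sum(C[(d, t)] * p_D_given_S[(s, d)] for d in (1, 2))
         for s in ("a", "not a") for t in ("d1", "d2")}
print(C_S_T)   # a/d1: 118.18, a/d2: 336.36, not a/d1: 188.89, not a/d2: 88.89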
Policy Determination removes the Treatment Node as shown in Figure 5. Note that the last
column in the Cost table records the optimal decision for each symptom.
Symptom (S)        Cost
S      P(S)        S      C(S)    T
a      0.55        a      118.18  d1
not a  0.45        not a  88.89   d2

Figure 5: Doctor ID after Removing the Treatment Node
The algorithm ends by averaging out the S node, for a present expected cost of 0.55(118.18) + 0.45(88.89) = 105.
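With exact fractions the answer is exactly 105, since 0.55(1300/11) + 0.45(800/9) = 65 + 40. The last two steps can be checked as follows (a sketch; the names are mine):

# Policy determination over T, then averaging out S (cost is minimized).
C_S_T = {("a", "d1"): 1300/11, ("a", "d2"): 3700/11,
         ("not a", "d1"): 1700/9, ("not a", "d2"): 800/9}
p_S = {"a": 0.55, "not a": 0.45}
policy = {s: min(("d1", "d2"), key=lambda t: C_S_T[(s, t)]) for s in p_S}
print(policy)                                              # {'a': 'd1', 'not a': 'd2'}
print(sum(C_S_T[(s, policy[s])] * p_S[s] for s in p_S))    # 105.0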
4.2 Machinery Example
This example concerns a piece of machinery that may be flat (Z = 0) or faulty (Z = 1). We decide how many scans to perform (D1), observe the scan results (Y), and then decide whether to send the machinery as is or to overhaul it (D2). The ID, including the data tables and a legend, is shown in Figure 6.

Z  p(Z)     D1  Z  Y   Pr(Y|Z,D1)     Z  D1  D2  COST
0  0.8      0   *  0   1              0  0   a1  10
1  0.2      1   0  +   0.4            0  0   a2  8
            1   0  -   0.6            0  1   a1  9.5
            1   1  +   0.9            0  1   a2  7.5
            1   1  -   0.1            0  2   a1  9.1
            2   0  ++  0.16           0  2   a2  7.1
            2   0  +-  0.48           1  0   a1  0
            2   0  --  0.36           1  0   a2  7
            2   1  ++  0.81           1  1   a1  -0.5
            2   1  +-  0.18           1  1   a2  6.5
            2   1  --  0.01           1  2   a1  -0.9
                                      1  2   a2  6.1

Legend
D1: Decide to do no scans (0), 1 scan (1) or 2 scans (2); options {0,1,2}
D2: Decide to send as is (a1) or to overhaul and repair if necessary (a2); options {a1,a2}
Y: Result of scan(s): no observation (0), positive (+), or negative (-)
Z: State of the machinery: flat (0) or faulty (1)

Figure 6: Machinery ID (nodes Z, D1, Y, D2, VALUE)
Arc Reversal. Figure 7 starts the process by reversing the arc between Y and Z. We use Bayes' Rule as usual to derive the Y and Z tables.
Y   Z  p(Z|Y)      D1  Y   Pr(Y|D1)     Z  D1  D2  COST
0   0  0.8         0   0   1            0  0   a1  10
0   1  0.2         1   +   0.5          0  0   a2  8
+   0  0.64        1   -   0.5          0  1   a1  9.5
+   1  0.36        2   ++  0.29         0  1   a2  7.5
-   0  0.96        2   +-  0.42         0  2   a1  9.1
-   1  0.04        2   --  0.29         0  2   a2  7.1
++  0  0.44                             1  0   a1  0
++  1  0.56                             1  0   a2  7
+-  0  0.91                             1  1   a1  -0.5
+-  1  0.09                             1  1   a2  6.5
--  0  0.99                             1  2   a1  -0.9
--  1  0.01                             1  2   a2  6.1

Figure 7: Machinery ID after the Y-Z Arc Reversal (the COST table is unchanged)
Averaging. Since the chance node Z in Figure 7 only points into the value node, it can be removed by averaging, and its parents now point directly into the value node. The structure of the COST table therefore changes, and its entries are obtained by an averaging process using the tables in Figure 7; the result is shown in Figure 8. For example,

COST(2,+-,a1) = COST(2,0,a1)P(0|+-) + COST(2,1,a1)P(1|+-) = (9.1)(0.91) + (-0.9)(0.09) = 8.2
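A one-line check of this entry (using the posteriors as rounded in the tables):

# One entry of the Figure 8 table: average COST over Z using P(Z|Y) from Figure 7.
p_Z_given_pm = {0: 0.91, 1: 0.09}      # P(Z | Y = +-)
cost_Z = {0: 9.1, 1: -0.9}             # COST(Z, D1 = 2, a1) from Figure 6
print(round(sum(cost_Z[z] * p_Z_given_pm[z] for z in (0, 1)), 2))   # 8.2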
D1  Y   Pr(Y|D1)     D1  Y   D2  COST
0   0   1            0   0   a1  8
1   +   0.5          0   0   a2  7.8
1   -   0.5          1   +   a1  5.9
2   ++  0.29         1   +   a2  7.14
2   +-  0.42         1   -   a1  9.1
2   --  0.29         1   -   a2  7.46
                     2   ++  a1  3.5
                     2   ++  a2  6.54
                     2   +-  a1  8.2
                     2   +-  a2  7.01
                     2   --  a1  9
                     2   --  a2  7.09

Figure 8: Machinery ID after Removing Z by Averaging (nodes D1, Y, D2, VALUE)
Policy Determination. The decision node D2 points into the value node, and the other nodes that point into the value node (D1 and Y) also point into D2, Y by an informational arc and D1 by no-forgetting. D2 is therefore removed by selecting the optimal decision for each (D1, Y) combination; the choice is recorded in the D2 column of the COST table in Figure 9.

D1  Y   Pr(Y|D1)     D1  Y   COST  D2
0   0   1            0   0   8     a1
1   +   0.5          1   +   7.14  a2
1   -   0.5          1   -   9.1   a1
2   ++  0.29         2   ++  6.54  a2
2   +-  0.42         2   +-  8.2   a1
2   --  0.29         2   --  9     a1

Figure 9: Machinery ID after Removing D2 by Policy Determination
Averaging. It now becomes clear that we can average out Y by using the P(Y) values from Figure 9. It also becomes evident that we can no longer carry the D2 column in the COST table, since the optimal decision depends on Y. (We simply store the previous cost table elsewhere to keep the optimal policy.) A sample calculation of the cost table follows:

COST(2) = COST(2,++)P(++) + COST(2,+-)P(+-) + COST(2,--)P(--) = (6.54)(0.29) + (8.2)(0.42) + (9)(0.29) = 7.95
D1  COST
0   8
1   8.12
2   7.95

Figure 10: Machinery ID after Removing Y by Averaging (D1 options {0,1,2})

Thus the solution is to perform one scan, the best entry in Figure 10. Based on the policy recorded in Figure 9, if the scan is positive, overhaul and if need be repair, for $7,140; if the scan is negative, send as is, for $9,100. Prior to seeing the result of the scan the expected value is $8,120.
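The entire rollback can be verified end to end (a sketch; all names are mine). Note that although the tables are labeled COST, the rollback selects the larger entries, so the sketch maximizes as the notes do; with exact rather than rounded posteriors the two-scan entry comes out as 7.98 instead of 7.95, and one scan remains optimal either way.

# End-to-end check of the machinery example with exact arithmetic.
pZ = {0: 0.8, 1: 0.2}
pY = {(0, 0, "0"): 1.0, (0, 1, "0"): 1.0,             # P(y | d1, z), Figure 6
      (1, 0, "+"): 0.4, (1, 0, "-"): 0.6,
      (1, 1, "+"): 0.9, (1, 1, "-"): 0.1,
      (2, 0, "++"): 0.16, (2, 0, "+-"): 0.48, (2, 0, "--"): 0.36,
      (2, 1, "++"): 0.81, (2, 1, "+-"): 0.18, (2, 1, "--"): 0.01}
V = {(0, 0, "a1"): 10,   (0, 0, "a2"): 8,             # VALUE(z, d1, d2), Figure 6
     (0, 1, "a1"): 9.5,  (0, 1, "a2"): 7.5,
     (0, 2, "a1"): 9.1,  (0, 2, "a2"): 7.1,
     (1, 0, "a1"): 0,    (1, 0, "a2"): 7,
     (1, 1, "a1"): -0.5, (1, 1, "a2"): 6.5,
     (1, 2, "a1"): -0.9, (1, 2, "a2"): 6.1}
outcomes = {0: ["0"], 1: ["+", "-"], 2: ["++", "+-", "--"]}

def value(d1):
    total = 0.0
    for y in outcomes[d1]:
        p_y = sum(pY[(d1, z, y)] * pZ[z] for z in pZ)                # arc reversal
        best = max(sum(pY[(d1, z, y)] * pZ[z] / p_y * V[(z, d1, a)]  # policy det.
                       for z in pZ) for a in ("a1", "a2"))
        total += p_y * best                                          # averaging
    return total

for d1 in (0, 1, 2):
    print(d1, round(value(d1), 3))    # 0 -> 8.0, 1 -> 8.12, 2 -> 7.978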
4.3 Coin Toss Example
Your friend has offered you the opportunity to bet $5 on the toss of a coin: heads you receive $100, tails you receive nothing. You are ready to open your wallet when he warns you not to get mad if the coin turns out to be unfair. Before you walk away, you decide this may be interesting and plan to analyze it with a linear utility function. Knowing your friend, you figure there is about an 80% chance the coin has tails on both sides, but there is still a 20% chance the coin is fair, in which case it may be a good bet. You ask your friend to toss the coin once before deciding whether to place your bet. How much should you pay for this toss?
4.3.1 Formulation
So far we have aligned the nodes in Historical order on a straight line. In the next two examples we show them in more traditional form. Suppose this time we have not thought in terms of decision trees but only want to model using Influence Diagrams. Recall the two orderings:
– Extensive: the order in which the DM encounters the events and decisions; used in decision trees.
– Historical: the order in which the decisions and events actually occur; used in influence diagrams.
We need some short names or symbols to work with, so let
• Trial? - the decision whether to pay $x for a trial toss before deciding to play the game
• Trial - the event whether the trial toss falls heads or tails
• Coin - the event whether the coin is Fair (50% chance of heads) or Unfair (0% chance of heads)
• Play? - the decision whether to place the $5 bet
• Toss - the event whether the toss during the game falls heads or tails
In historical order, the sequence of events, decisions and rewards is: Coin, Trial?, Trial, Play?, Toss, Reward. As usual, the historical order is followed. The tables are shown in more compact form and use joint probabilities to help with Bayes' Rule; this yields the same result as applying Bayes' Rule directly, as before.
Figure 11: Coin Toss ID (nodes Coin, Trial? {y,n}, Trial, Play? {Y,N}, Toss, Reward)
Once the ID model has been formulated, we can provide the model to a solver such as Netica, or solve it by hand. Here we illustrate the solution by hand.
Following the Node Removal Algorithm, we note that we cannot yet apply Policy Determination, but we can apply Averaging, since the Toss node only points into the Reward node. The Coin node now points directly into the Reward node, and the dependence on the Play? node is already reflected in the Reward table. The result is shown in Figure 12.
Figure 12: Toss Node Removed by Averaging
We next remove the Coin node by Arc Reversal (Fig. 13) and Averaging (Fig. 14).
Figure 13: Trial-Coin Arc Reversal
Figure 14: Remove Coin Node by Averaging
Next we remove the Play? node by Policy Determination (Figure 15), and subsequently the
Trial node by Averaging (Figure 16).
Figure 15: Play? Node Removed by Policy Determination
Figure 16: Trial Node Removed by Averaging
We can now see that the optimal policy is to play regardless of the trial outcome, so the trial information is worth nothing: we are not willing to pay any amount for the trial, and we proceed to play for an expected reward of $5.
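The numbers behind this conclusion are quick to check (a sketch; the names are mine). Even after a tails on the trial, the bet remains (barely) favorable, so the decision never changes and the trial has zero value:

# Coin bet: expected reward with and without the trial toss ($5 stake).
P_FAIR = 0.2                                 # unfair coin never lands heads

def ev_play(p_fair):
    return p_fair * 0.5 * 100 - 5            # win $100 on heads, stake $5

print(ev_play(P_FAIR))                       # 5.0: playing blind is worth $5
p_heads = P_FAIR * 0.5                       # trial lands heads w.p. 0.1
ev_after_trial = (p_heads * max(ev_play(1.0), 0)                 # heads: coin is fair
                  + (1 - p_heads) * max(ev_play(0.1 / 0.9), 0))  # tails: P(fair) = 1/9
print(round(ev_after_trial, 6))              # 5.0: the free trial adds nothing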
4.4 Oil Example
An oil company is installing an oil pipeline from an oil field to a refinery. The pipeline requires the welding of 1000 seams, to be carried out by the company's own welders. Defective seams result in leaks, which must be reworked at a cost of $1,200 per seam. The company can instead hire an expert cleanup team of welders at a one-time cost of $130,000, who would check all of the welds done by the company welders and repair them as required. The company can also improve its information about the quality of its own welders on this job by x-ray inspection of a randomly selected completed weld, at a cost of $2,000. Assuming a prior distribution of a 30% chance of 5% defective welds, a 50% chance of 10% defective welds and a 20% chance of 20% defective welds, use a linear utility function to determine the optimal decision.
4.4.1 Formulation
• X-ray? - the decision whether to pay $2,000 to x-ray a random weld performed by the internal welders
• X-ray - the event whether the x-rayed weld is found defective or not
• Fault% - the event whether the % of defects by internal welders is 5%, 10% or 20%
• Internal? - the decision whether to accept the internal welds (reworking any leaks) or to hire the cleanup team
• Fault# - the event of how many of the 1000 seams are defective
• Cost - the cost incurred
The historical sequence of events, decisions and rewards is: Fault%, X-ray?, X-ray, Internal?, Fault#, Cost. The ID for the Oil Example is shown in Figure 17.
4.4.2 Solution
Figure 17: Oil Example ID (X-ray? options {y,n})
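As a sketch of the rollback this formulation calls for (the code and its names are my own; every input comes from the problem statement, and with a linear utility function expected rework cost is what matters):

# Oil Example: compare no x-ray vs x-raying one weld, deciding Internal? after.
priors = {0.05: 0.3, 0.10: 0.5, 0.20: 0.2}        # P(Fault%)
N, REWORK, CLEANUP, XRAY = 1000, 1200, 130_000, 2000

def best_cost(posterior):
    # Internal?: expected rework of internal welds vs hiring the cleanup team
    internal = sum(p * rate * N * REWORK for rate, p in posterior.items())
    return min(internal, CLEANUP)

no_xray = best_cost(priors)
p_def = sum(p * rate for rate, p in priors.items())            # P(sampled weld bad)
post_def = {r: p * r / p_def for r, p in priors.items()}       # Bayes' Rule
post_ok = {r: p * (1 - r) / (1 - p_def) for r, p in priors.items()}
with_xray = XRAY + p_def * best_cost(post_def) + (1 - p_def) * best_cost(post_ok)
print(round(no_xray), round(with_xray))   # 126000 vs 125150

Under these priors the x-ray is (narrowly) worthwhile: a defective sample tips the decision to the cleanup team, an intact one to keeping the internal welds.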