Software Testing, VI Sem 17IS63
Module 3
Structural Testing
Structural testing, also known as glass box testing or white box testing, is an approach in which the
tests are derived from knowledge of the software's structure or internal implementation. Other
names for structural testing include clear box testing, open box testing, logic-driven testing, and
path-driven testing.
Structural Testing Techniques:
Statement Coverage - This technique is aimed at exercising all programming statements
with minimal tests.
Branch Coverage - This technique is running a series of tests to ensure that all branches are
tested at least once.
Path Coverage - This technique corresponds to testing all possible paths, which means that
each statement and branch is covered (the three criteria are contrasted in the sketch after this list).
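To make the three criteria concrete, here is a minimal sketch (illustrative code, not from the original notes; the functions absolute and clamp are hypothetical examples):

#include <assert.h>

/* Absolute value: one decision, no else branch. */
static int absolute(int x) {
    if (x < 0)
        x = -x;      /* the only statement on the true branch */
    return x;
}

/* Clamp x into [lo, hi]: two consecutive decisions. */
static int clamp(int x, int lo, int hi) {
    if (x < lo) x = lo;
    if (x > hi) x = hi;
    return x;
}

int main(void) {
    /* Statement coverage of absolute(): the single test x = -5
       executes every statement, yet never takes the false branch. */
    assert(absolute(-5) == 5);
    /* Branch coverage needs a second test that skips the 'then' part. */
    assert(absolute(3) == 3);

    /* For clamp(), the tests {-5, 15} cover all four branch outcomes
       but only two of the three feasible paths; x = 5 (false, false)
       is needed for path coverage. (The true-true path is infeasible
       whenever lo <= hi.) */
    assert(clamp(-5, 0, 10) == 0);
    assert(clamp(15, 0, 10) == 10);
    assert(clamp(5, 0, 10) == 5);
    return 0;
}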
STATEMENT TESTING
Statements are nothing but the nodes of the control flow graph.
Statement adequacy criterion: Let T be a test suite for a program P. T satisfies the statement
adequacy criterion for P iff, for each statement S of P, there exists at least one test case in T that
causes the execution of S. This is equivalent to stating that every node in the control flow graph
model of the program P is visited by some execution path exercised by a test case in T.
Statement coverage: The statement coverage C_statement of T for P is the fraction of statements of program
P executed by at least one test case in T.
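For instance (illustrative numbers, not from the notes): if P contains 25 statements and the test cases in T together execute 20 of them, then C_statement = 20/25 = 0.80, that is, 80% statement coverage.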
Basic block coverage: Nodes in a control flow graph often represent basic blocks rather than individual
statements, so some standards refer to basic block coverage or node coverage.
• Designing complex test cases that exercise many different elements of a unit is seldom a good
way to optimize a test suite, although it may occasionally be justifiable when there is a large and
unavoidable fixed cost (e.g., setting up equipment) for each test case regardless of complexity.
• Control flow coverage may be measured incrementally while executing a test suite.
• The increment of coverage due to the execution of a specific test case does not measure the
absolute efficacy of the test case.
• Measures independent of the order of execution may be obtained by identifying independent
statements.
Path Coverage: It aims to test the different paths from entry to exit of the program, each of which is a
combination of the decisions taken in sequence.
• The paths can be too many to consider exhaustively; for example, a loop can iterate indefinitely. This
blowup can be managed using cyclomatic complexity, which helps in finding redundant test cases.
• [Cyclomatic complexity is a quantitative measure of the number of linearly independent paths through a
program and is used to indicate the complexity of a program.] So cyclomatic complexity helps by
aiming to test all linearly independent paths in a program at least once. These are some of the
coverage criteria under structural testing.
Figure 3.1: The C function cgi_decode, which translates a CGI-encoded string to a plain ASCII string (reversing the encoding
applied by the common gateway interface of most web servers).
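Since the listing itself is not reproduced in these notes, the following is a sketch that reconstructs the behavior the caption describes (it is not the book's exact listing; the helper hex_value and the error-recovery details are assumptions):

/* '+' decodes to a blank, "%xx" decodes to the character with
   hexadecimal code xx, anything else is copied through; a nonzero
   return value flags a malformed escape sequence. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;                          /* not a hex digit */
}

int cgi_decode(const char *encoded, char *decoded) {
    const char *eptr = encoded;
    char *dptr = decoded;
    int ok = 0;
    while (*eptr) {
        char c = *eptr++;
        if (c == '+') {                 /* '+' encodes a blank */
            *dptr++ = ' ';
        } else if (c == '%') {          /* "%xx" encodes character xx */
            int hi = *eptr ? hex_value(*eptr++) : -1;
            int lo = (hi != -1 && *eptr) ? hex_value(*eptr++) : -1;
            if (hi == -1 || lo == -1)
                ok = 1;                 /* malformed escape sequence */
            else
                *dptr++ = (char)(16 * hi + lo);
        } else {
            *dptr++ = c;                /* ordinary character */
        }
    }
    *dptr = '\0';
    return ok;
}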
BRANCH TESTING
• A test suite can achieve complete statement coverage without executing all the possible branches
in a program.
• Consider, for example, a faulty program cgi_decode′ obtained from program cgi_decode by
removing line 41.
• The control flow graph of program cgi_decode′ is shown in Figure 3.2.
• In the new program there are no statements following the false branch that led to the removed node.
• The branch adequacy criterion requires each branch of the program to be executed by at least one test
case.
Figure 3.2: The control flow graph of function cgi_decode′, obtained from the program of Figure
3.1 after removing node F.
• Let T be a test suite for a program P. T satisfies the branch adequacy criterion for P iff, for
each branch B of P, there exists at least one test case in T that causes the execution of B. This is
equivalent to stating that every edge of the control flow graph model of program P belongs to some
execution path exercised by a test case in T.
• The branch coverage C_branch of T for P is the fraction of branches of program P executed by at
least one test case in T.
Examples:
T3 = {" ", "+%0D+4J"}
achieves 100% statement coverage but only 88% branch coverage: C_branch = 7/8 ≈ 0.88.
T2 = {"%3D", "%A", "a+b", "test"}
Test suite T2 satisfies the branch adequacy criterion, and would reveal the fault. Intuitively, since
traversing all edges of a graph causes all nodes to be visited, test suites that satisfy the branch adequacy
criterion for a program P also satisfy the statement adequacy criterion for the same program.
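The following hypothetical C example (not the cgi_decode program) shows the same phenomenon in miniature: a seeded fault that statement-adequate testing can miss but branch-adequate testing must expose.

#include <stdio.h>

static int abs_faulty(int x) {
    int r = -x;              /* seeded fault: should be r = x */
    if (x < 0)
        r = -x;              /* the only statement guarded by the branch */
    return r;
}

int main(void) {
    /* { -5 } alone is statement-adequate: every statement runs, and
       the result happens to be right, so the fault stays hidden. */
    printf("abs_faulty(-5) = %d (expected 5)\n", abs_faulty(-5));
    /* Branch adequacy forces a test on the false branch, which
       exposes the fault: the call below returns -3, not 3. */
    printf("abs_faulty(3)  = %d (expected 3)\n", abs_faulty(3));
    return 0;
}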
CONDITION TESTING
Branch coverage is useful for exercising faults in the way a computation has been decomposed into
cases. Condition coverage considers this decomposition in more detail, forcing exploration not only
of both possible results of a boolean expression controlling a branch, but also of different
combinations of the individual conditions in a compound boolean expression.
Condition adequacy criteria address this by requiring the individual basic conditions of the
decisions to be separately exercised.
Basic condition adequacy criterion: requires each basic condition to be covered
A test suite T for a program P covers all basic conditions of P, i.e. it satisfies the basic condition
adequacy criterion, iff each basic condition in P has a true outcome in at least one test case in T and a
false outcome in at least one test in T.
Basic condition coverage C_basic_condition of T for P is the fraction of the total number of truth values
assumed by the basic conditions of P during the execution of all test cases in T.
Branch and condition adequacy criterion: A test suite satisfies the branch and condition adequacy criterion
if it satisfies both the branch adequacy criterion and the condition adequacy criterion. A more complete
extension that includes both the basic condition and the branch adequacy criteria is the compound condition
adequacy criterion, which requires a test for each possible combination of basic conditions.
For example, the compound condition at line 27 of cgi_decode (digit_high == -1 || digit_low == -1) would
require covering the three paths in its short-circuit evaluation tree: the first condition true (the second is
not evaluated); the first false and the second true; and both false.
Consider the number of cases required for compound condition coverage of the following two Boolean
expressions, each with five basic conditions. For the expression a && b && c && d && e, compound
condition coverage under short-circuit evaluation requires these six combinations (– marks a condition
that is not evaluated):
a b c d e
T T T T T
T T T T F
T T T F –
T T F – –
T F – – –
F – – – –
For the second expression, (((a || b) && c) || d) && e, compound condition coverage requires 13
combinations; in the worst case the number of combinations grows exponentially with the number of
basic conditions.
MCDC
An alternative approach that can be satisfied with the same number of test cases for boolean
expressions of a given length regardless of short-circuit evaluation is the modified condition
adequacy criterion, also known as modified condition / decision coverage or MCDC.
Key idea: test important combinations of conditions, avoiding exponential blowup.
A combination is "important" if each basic condition is shown to independently affect the outcome
of each decision.
MC/DC can be satisfied with N + 1 test cases for a decision with N basic conditions, making it an
attractive compromise between the number of required test cases and the thoroughness of the test.
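A minimal sketch of an MC/DC test set, assuming the illustrative decision a && (b || c) (N = 3 basic conditions, so N + 1 = 4 tests):

#include <stdio.h>
#include <stdbool.h>

/* Illustrative decision with three basic conditions. */
static bool decision(bool a, bool b, bool c) {
    return a && (b || c);
}

int main(void) {
    /* One possible MC/DC set. In each pair below exactly one
       condition changes and the outcome flips, showing that the
       condition independently affects the decision:
         a: (T,T,F) vs (F,T,F)
         b: (T,T,F) vs (T,F,F)
         c: (T,F,T) vs (T,F,F)  */
    struct { bool a, b, c; } tests[] = {
        {true,  true,  false},   /* outcome: true                */
        {false, true,  false},   /* outcome: false (flips on a)  */
        {true,  false, false},   /* outcome: false (flips on b)  */
        {true,  false, true},    /* outcome: true  (flips on c)  */
    };
    for (int i = 0; i < 4; i++)
        printf("a=%d b=%d c=%d -> %d\n", tests[i].a, tests[i].b,
               tests[i].c, decision(tests[i].a, tests[i].b, tests[i].c));
    return 0;
}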
Path Testing
• Path Testing is a structural testing method based on the source code or algorithm and NOT
based on the specifications. It can be applied at different levels of granularity.
• Path testing is an approach to testing where you ensure that every path through a program has
been executed at least once.
• The starting point for path testing is a program flow graph.
• A flow graph consists of nodes representing decisions and edges showing flow of control. The
flow graph is constructed by replacing program control statements by equivalent diagrams.
• Each branch in a conditional statement (if-then-else or case) is shown as a separate path. An
arrow looping back to the condition node denotes a loop.
• The objective of path testing is to ensure that each independent path through the program is
executed at least once.
• An independent program path is one that traverses at least one new edge in the flow graph. In
program terms, this means exercising one or more new conditions. Both the true and false
branches of all conditions must be executed.
Path
• A path through a flow graph is considered complete if the first node along the path is START
and the terminating node is END.
• A path p through a flow graph for a program P is considered feasible if there exists at least one
test case which, when input to P, causes p to be traversed. If no such test case exists, then p is
considered infeasible. Whether a given path p through a program P is feasible is, in general, an
undecidable problem.
• This statement implies that it is not possible to write an algorithm that takes as inputs an arbitrary
program and a path through that program, and correctly determines whether the path is feasible.
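A small illustrative C fragment (hypothetical, not from the notes) showing an infeasible path:

/* Of the four syntactic paths through this fragment, only three are
   feasible: no input can take the true branch of both decisions in
   the same execution. */
int classify(int x) {
    int tag = 0;
    if (x > 0)
        tag += 1;      /* positive */
    if (x < 0)
        tag += 2;      /* negative */
    /* the path taking both 'then' branches would need x > 0 and
       x < 0 simultaneously -- infeasible */
    return tag;        /* 0: zero, 1: positive, 2: negative */
}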
This is a complex definition, so we will apply it to the program graph in the figure above. Node 4 is a case 1
DD-path; we will call it “first.” Similarly, node 23 is a case 2 DD-path, and we will call it “last.” Nodes 5
through 8 are case 5 DD-paths. We know that node 8 is the last node in this DD-path because it is the last
node that preserves the 2-connectedness property of the chain. If we go beyond node 8 to include node 9, we
violate the indegree = outdegree = 1 criterion of a chain. If we stop at node 7, we violate the “maximal”
criterion. Nodes 10, 11, 15, 17, 18, and 21 are case 4 DD-paths. Nodes 9, 12, 13, 14, 16, 19, 20, and 22 are
case 3 DD-paths. Finally, node 23 is a case 2 DD-path. All this is summarized in the following table.
The test coverage metrics in Table 8.1 tell us what to test but not how to test it. In this section, we take a
closer look at techniques that exercise source code. We must keep an important distinction in mind: Miller’s
test coverage metrics are based on program graphs in which nodes are full statements, whereas our
formulation allows statement fragments (which can be entire statements) to be nodes. Most quality
organizations now expect the C1 metric as the minimum acceptable level of test coverage. Although less
adequate, the statement coverage metric (C0) is still widely accepted: it is mandated by ANSI Standard
187B and has been used successfully by IBM since the mid-1970s.
DD-Path Testing
When every DD-path is traversed (the C1 metric), we know that each predicate outcome has been executed;
this amounts to traversing every edge in the DD-path graph (or program graph). Therefore, the C1 metric is
exactly our Gchain metric. For if–then and if–then–else statements, this means that both the true and the false
branches are covered (C1p coverage). For CASE statements, each clause is covered. Beyond this, it is useful
to ask how we might test a DD-path. Longer DD-paths generally represent complex computations, which we
can rightly consider as individual functions. For such DD-paths, it may be appropriate to apply a number of
functional tests, especially those for boundary and special values.
Compound conditions can be rewritten as nested if–then–else logic, which will result in more DD-paths to
cover. Multiple condition coverage assures that this complexity is not swept under the DD-path coverage
rug. This metric has been refined to Modified Condition Decision Coverage (MCDC), described earlier.
Loop Coverage
Miller’s Cik metric extends the loop coverage metric to include full paths from source to sink nodes that
contain loops. Loop testing has been studied extensively, and with good reason—loops are a highly fault-
prone portion of source code. Concatenated loops are simply a sequence of disjoint loops, while nested loops
are such that one is contained inside another. Knotted (Beizer calls them "horrible") loops cannot occur when
the structured programming precepts are followed, but they can occur in languages like Java with try/catch.
When it is possible to branch into (or out from) the middle of a loop, and these branches are internal to other
loops, the result is Beizer’s knotted loop. We can also take a modified boundary value approach, where the
loop index is given its minimum, nominal, and maximum values. We can push this further to full boundary
value testing and even robustness testing. If the body of a simple loop is a DD-path that performs a complex
calculation, this should also be tested, as discussed previously. Once a loop has been tested, the tester
condenses it into a single node. If loops are nested, this process is repeated starting with the innermost loop
and working outward. This results in the same multiplicity of test cases we found with boundary value
analysis, which makes sense, because each loop index variable acts like an input variable. If loops are
knotted, it will be necessary to carefully analyze them in terms of the data flow methods.
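As a sketch of the modified boundary value approach to a simple loop (illustrative code; the function sum and the chosen iteration counts are assumptions):

#include <stdio.h>

/* Sum of the first n elements of a[]. */
static int sum(const int a[], int n) {
    int total = 0;
    for (int i = 0; i < n; i++)     /* simple (non-nested) loop */
        total += a[i];
    return total;
}

int main(void) {
    int a[5] = {1, 2, 3, 4, 5};
    /* Exercise the loop index at its boundaries: zero iterations,
       one iteration, a nominal count, one less than the maximum,
       and the maximum. */
    int counts[] = {0, 1, 2, 4, 5};
    for (int k = 0; k < 5; k++)
        printf("sum(a, %d) = %d\n", counts[k], sum(a, counts[k]));
    return 0;
}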
Coverage analyzers are a class of test tools that offer automated support for this approach to testing
management. With a coverage analyzer, the tester runs a set of test cases on a program that has been
“instrumented” by the coverage analyzer. The analyzer then uses information produced by the
instrumentation code to generate a coverage report. In the common case of DD-path coverage, for example,
the instrumentation identifies and labels all DD-paths in an original program. When the instrumented
program is executed with test cases, the analyzer tabulates the DD-paths traversed by each test case.
The objective behind basis path testing is that it defines the number of linearly independent paths, and thus
the number of test cases needed can be defined explicitly (maximizing the coverage achieved by each test
case). Steps for basis path testing:
Mathematically, the basis is a set of independent paths through the graph diagram. The code complexity of
the program can be computed using the formula -
V(G) = E - N + 2
Where,
E - Number of edges
N - Number of Nodes
V (G) = P + 1
Where P = Number of predicate nodes (node that contains condition)
Example -
i = 0;
n = 4;  // number of array elements compared (not the number of graph nodes)
while (i < n-1) do
    j = i + 1;
    while (j < n) do
        if A[i] < A[j] then
            swap(A[i], A[j]);
    end do;
    i = i + 1;
end do;
Computing mathematically,
V(G) = 9 - 7 + 2 = 4
V(G) = 3 + 1 = 4 (the predicate nodes are nodes 1, 2, and 3)
Basis Set - a set of linearly independent execution paths through the program (rendered as runnable C after this list):
1, 7
1, 2, 6, 1, 7
1, 2, 3, 4, 5, 2, 6, 1, 7
1, 2, 3, 5, 2, 6, 1, 7
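One plausible C rendering of the pseudocode above, as a runnable sketch. The node numbers in the comments are an assumed mapping chosen to be consistent with the basis set just listed, and the inner-loop increment (omitted in the pseudocode) is added so the loop terminates:

/* One pass of a comparison-and-swap sort (sorts descending). */
void sort_pass(int A[], int n) {
    int i = 0;
    while (i < n - 1) {          /* node 1: outer decision    */
        int j = i + 1;           /* node 2: inner decision    */
        while (j < n) {          /*   (with its initializer)  */
            if (A[i] < A[j]) {   /* node 3: comparison        */
                int t = A[i];    /* node 4: swap              */
                A[i] = A[j];
                A[j] = t;
            }
            j = j + 1;           /* node 5: end of inner body */
        }
        i = i + 1;               /* node 6: end of outer body */
    }
}                                /* node 7: exit              */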
For the directed graph discussed next (Figure 3.5, with e = 10 edges, n = 7 nodes, and p = 1 connected component):
V(G) = e − n + 2p = 10 − 7 + 2(1) = 5
The number of linearly independent circuits of the strongly connected graph on the right side of the figure
(with an added edge from the sink back to the source, so e = 11) is
V(G) = e − n + p = 11 − 7 + 1 = 5
Figure 3.5 is taken from McCabe (1982). It is a directed graph that we might take to be the program graph
(or the DD-path graph) of some program. Nodes B and C form a loop with two exits, and the edge from B to E
is a branch into the if–then statement in nodes D, E, and F. The program does have a single entry (A) and a
single exit (G). McCabe based his view of testing on a major result from graph theory, which states that the
cyclomatic number of a strongly connected graph is the number of linearly independent circuits in the graph.
(A circuit is similar to a chain: no internal loops or decisions occur, but the initial node is the terminal node.
A circuit is a set of 3-connected nodes.)
The cyclomatic complexity of the strongly connected graph in Figure 3.5 is 5; thus, there are five linearly
independent circuits. If we now delete the added edge from node G to node A, these five circuits become
five linearly independent paths from node A to node G. In small graphs, we can visually identify
independent paths. Here, we identify paths as sequences of nodes:
p1: A, B, C, G
p2: A, B, C, B, C, G
p3: A, B, E, F, G
p4: A, D, E, F, G
p5: A, D, F, G
Table 3.2 shows the edges traversed by each path, and also the number of times an edge is traversed. We can
force this to begin to look like a vector space by defining notions of addition and scalar multiplication: path
addition is simply one path followed by another path, and multiplication corresponds to repetitions of a path.
With this formulation, McCabe arrives at a vector space of program paths. His illustration of the basis part
of this framework is that the path A, B, C, B, E, F, G is the basis sum p2 + p3 – p1, and the path A, B, C, B,
C, B, C, G is the linear combination 2p2 – p1. It is easier to see this addition with an incidence matrix in
which rows correspond to paths, and columns correspond to edges, as in Table 3.2. The entries in this table
are obtained by following a path and noting which edges are traversed. Path p1, for example, traverses edges
1, 4, and 9, while path p2 traverses the following edge sequence: 1, 4, 3, 4, 9. Because edge 4 is traversed
twice by path p2, that is the entry for the edge 4 column. Path p1 is independent of all of these, because any
attempt to express p1 in terms of the others introduces unwanted edges. None can be deleted, and these five
paths span the set of all paths from node A to node G. At this point, you might check the linear combinations
of the two example paths.
Path / Edges Traversed        1  2  3  4  5  6  7  8  9  10
p1:  A, B, C, G               1  0  0  1  0  0  0  0  1  0
p2:  A, B, C, B, C, G         1  0  1  2  0  0  0  0  1  0
p3:  A, B, E, F, G            1  0  0  0  1  0  0  1  0  1
p4:  A, D, E, F, G            0  1  0  0  0  1  0  1  0  1
p5:  A, D, F, G               0  1  0  0  0  0  1  0  0  1
ex1: A, B, C, B, E, F, G      1  0  1  1  1  0  0  1  0  1
ex2: A, B, C, B, C, B, C, G   1  0  2  3  0  0  0  0  1  0
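Working the first combination out edge by edge from the rows of Table 3.2:
p2           = (1, 0, 1, 2, 0, 0, 0, 0, 1, 0)
p3           = (1, 0, 0, 0, 1, 0, 0, 1, 0, 1)
p1           = (1, 0, 0, 1, 0, 0, 0, 0, 1, 0)
p2 + p3 − p1 = (1, 0, 1, 1, 1, 0, 0, 1, 0, 1), which is exactly the row for ex1 (A, B, C, B, E, F, G).
Likewise, 2p2 − p1 = (1, 0, 2, 3, 0, 0, 0, 0, 1, 0), the row for ex2.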
McCabe next develops an algorithmic procedure (called the baseline method) to determine a set of basis
paths. The method begins with the selection of a baseline path, which should correspond to some “normal
case” program execution. This can be somewhat arbitrary; McCabe advises choosing a path with as many
decision nodes as possible. Next, the baseline path is retraced, and in turn each decision is “flipped”; that is,
when a node of outdegree ≥ 2 is reached, a different edge must be taken. Here we follow McCabe’s
example, in which he first postulates the path through nodes A, B, C, B, E, F, G as the baseline. (This was
expressed in terms of paths p1 – p5 earlier.) The first decision node (outdegree ≥ 2) in this path is node A;
thus, for the next basis path, we traverse edge 2 instead of edge 1. We get the path A, D, E, F, G, where we
retrace nodes E, F, G in path 1 to be as minimally different as possible. For the next path, we can follow the
second path, and take the other decision outcome of node D, which gives us the path A, D, F, G. Now, only
decision nodes B and C have not been flipped; doing so yields the last two basis paths, A, B, E, F, G and A,
B, C, G. Notice that this set of basis paths is distinct from the one in Table 3.3: this is not problematic
because a unique basis is not required.
Essential Complexity
The basic idea is to look for the graph of one of the structured programming constructs, collapse it into a
single node, and repeat until no more structured programming constructs can be found. This process is
followed in Figure 8.12, which starts with the DD-path graph of the pseudocode triangle program. The if–
then–else construct involving nodes B, C, D, and E is condensed into node a, and then the three if–then
constructs are condensed into nodes b, c, and d. The remaining if–then–else (which corresponds to the IF
IsATriangle statement) is condensed into node e, resulting in a condensed graph with cyclomatic complexity
V(G) = 1. In general, when a program is well structured (i.e., is composed solely of the structured
programming constructs), it can always be reduced to a graph with one path.
The bottom line for testers is this: programs with high cyclomatic complexity require more testing. Of the
organizations that use the cyclomatic complexity metric, most set some guideline for maximum acceptable
complexity; V(G) = 10 is a common choice. What happens if a unit has a higher complexity? Two
possibilities: either simplify the unit or plan to do more testing. If the unit is well structured, its essential
complexity is 1; so it can be simplified easily. If the unit has an essential complexity greater than 1, often the
best choice is to eliminate the violations.
Test cases that exercise the basis set will execute every statement in a program at least once.
To illustrate the approach of data flow testing, assume that each statement in the program is assigned a
unique statement number. For a statement numbered S, define two sets:
DEF(S) = {X | statement S contains a definition of variable X}
USE(S) = {X | statement S contains a use of variable X}
If statement S is a loop or an if condition, then its DEF set is empty and its USE set is based on the
condition of statement S.
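A hypothetical fragment annotated with these sets (statement numbers S1 to S5 are illustrative):

#include <stdio.h>

int f(int x, int y) {      /* S1: DEF = {x, y}, USE = {}     (defined by the call) */
    int z = x + y;         /* S2: DEF = {z},    USE = {x, y}                       */
    if (z > 10)            /* S3: DEF = {},     USE = {z}    (condition)           */
        z = z - 10;        /* S4: DEF = {z},    USE = {z}                          */
    return z;              /* S5: DEF = {},     USE = {z}                          */
}

int main(void) {
    printf("%d\n", f(7, 8));   /* prints 5 */
    return 0;
}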
Data flow testing uses the control flow graph to find situations that can interrupt the flow of the
program. Reference or define anomalies in the flow of the data are detected at the time of associations
between values and variables. These anomalies include: a variable that is defined but never used, a
variable that is used before it is defined, and a variable that is defined twice with no intervening use.
The following definitions refer to a program P that has a program graph G(P) and a set of program variables
V. The program graph G(P) is constructed as in Chapter 4, with statement fragments as nodes and edges that
represent node sequences. G(P) has a single-entry node and a single-exit node. We also disallow edges from
a node to itself. Paths, subpaths, and cycles. The set of all paths in P is PATHS(P).
Definition: Node n ∈ G(P) is a defining node of the variable v ∈ V, written as DEF(v, n), if and only if the
value of variable v is defined at the statement fragment corresponding to node n. Input statements,
assignment statements, loop control statements, and procedure calls are all examples of statements that are
defining nodes. When the code corresponding to such statements executes, the contents of the memory
location(s) associated with the variables are changed.
Definition: Node n ∈ G(P) is a usage node of the variable v ∈ V, written as USE(v, n), if and only if the
value of the variable v is used at the statement fragment corresponding to node n. Output statements,
assignment statements, conditional statements, loop control statements, and procedure calls are all examples
of statements that are usage nodes. When the code corresponding to such statements executes, the contents
of the memory location(s) associated with the variables remain unchanged.
Definition: A usage node USE(v, n) is a predicate use (denoted as P-use) if and only if the statement n is a
predicate statement; otherwise, USE(v, n) is a computation use (denoted C-use). The nodes corresponding to
predicate uses always have an outdegree ≥ 2, and nodes corresponding to computation uses always have an
outdegree ≤ 1.
Definition: A definition/use path with respect to a variable v (denoted du-path) is a path in PATHS(P) such
that, for some v ∈ V, there are define and usage nodes DEF(v, m) and USE(v, n) such that m and n are the
initial and final nodes of the path.
Definition: A definition-clear path with respect to a variable v (denoted dc-path) is a definition/use path in
PATHS(P) with initial and final nodes DEF(v, m) and USE(v, n) such that no other node in the path is a
defining node of v.
[you tube link about data flow testing: https://www.youtube.com/watch?v=LRlToFzm3lg]
Du-paths
Stocks: First, let us look at a simple path: the du-path for the variable stocks. DEF(stocks, 15) and
USE(stocks, 17), so the path <15, 17> is a du-path with respect to stocks. No other defining nodes are used
for stocks; therefore, this path is also definition clear.
Locks: Two defining and two usage nodes make the locks variable more interesting:
DEF(locks, 13), DEF(locks, 19), USE(locks, 14), and USE(locks, 16).
These yield four du-paths; they are
shown in Figure 3.9.
p1 = <13, 14>
p2 = <13, 14, 15, 16>
p3 = <19, 20, 14>
p4 = <19, 20, 14, 15, 16>
Note: du-paths p1 and p2 refer to the priming value of locks, which is read at node 13. The locks variable
has a predicate use in the while statement (node 14), and if the condition is true (as in path p2), a
computation use at statement 16. The other two du-paths start near the end of the while loop and occur when
the loop repeats. These paths provide the loop coverage: bypass the loop, begin the loop, repeat the loop, and
exit the loop. All these du-paths are definition clear.
Table 3.5 lists the define and usage nodes for the variables in the commission problem. We use this
information in conjunction with the program graph in Figure 3.8 to identify various definition/ use and
definition-clear paths. It is a judgment call whether non executable statements such as constant and variable
declaration statements should be considered as defining nodes. Such nodes are not very interesting when we
follow what happens along their du-paths; but if something is wrong, it can be helpful to include them. We
will refer to the various paths as sequences of node numbers. Tables 3.6 and 3.7 present some of the du-
paths in the commission problem; they are named by their beginning and ending nodes (from Figure 3.8).
The third column in Table 3.6 indicates whether the du-paths are definition clear. Some of the du-paths are
trivial, for example those for lockPrice, stockPrice, and barrelPrice. Others are more complex: the while
loop (node sequence <14, 15, 16, 17, 18, 19, 20>) inputs and accumulates values for totalLocks,
totalStocks, and totalBarrels. The table shows the details only for the totalStocks variable. The initial value
definition for totalStocks occurs at node 11, and it is first used at node 17. Thus, the path (11, 17), which
consists of the node sequence <11, 12, 13, 14, 15, 16, 17>, is definition clear. The path (11, 22), which
consists of the node sequence <11, 12, 13, (14, 15, 16, 17, 18, 19, 20)*, 21, 22>, is not definition clear
because values of totalStocks are defined at node 11 and (possibly several times at) node 17. (The asterisk
after the while loop is the Kleene star notation, used in both formal logic and regular expressions to denote
zero or more repetitions.)
totalLocks: The du-paths for totalLocks will lead us to typical test cases for computations. With two
defining nodes (DEF(totalLocks, 10) and DEF(totalLocks, 16)) and three usage nodes (USE(totalLocks, 16),
USE(totalLocks, 21), USE(totalLocks, 24)), we might expect six du-paths. Let us take a closer look. Path p5
= <10, 11, 12, 13, 14, 15, 16> is a du-path in which the initial value of totalLocks (0) has a computation use.
This path is definition clear. The next path is problematic:
p6 = <10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 14, 21>
If a problem occurs with the value of totalLocks at node 21 (the Output statement), we should look at the
intervening DEF(totalLocks, 16) node. The next path contains p6; we can show this by using a path name in
place of its corresponding node sequence:
p7 = <10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 14, 21, 22, 23, 24>
p7 = <p6, 22, 23, 24>
Du-path p7 is not definition clear because it includes node 16. Subpaths that begin with node 16 (an
assignment statement) are interesting. The first, <16, 16>, seems degenerate. If we “expanded” it into
machine code, we would be able to separate the define and usage portions. We will disallow these as du-
paths. Technically, the usage on the right-hand side of the assignment refers to a value defined at node 10
(see path p5). The remaining two du-paths are both subpaths of p7:
p8 = <16, 17, 18, 19, 20, 14, 21>
p9 = <16, 17, 18, 19, 20, 14, 21, 22, 23, 24>
Sales: There is one defining node for sales; therefore, all the du-paths with respect to sales must be
definition clear. The first three du-paths are easy:
p10 = <27, 28>
p11 = <27, 28, 29>
p12 = <27, 28, 29, 30, 31, 32, 33>
Notice that p12 is a definition-clear path with three usage nodes; it also contains paths p10 and p11. If we
were testing with p12, we know we would also have covered the other two paths. The IF, ELSE IF logic in
statements 29 through 40 highlights an ambiguity in the original research. Two choices for du-paths begin
with path p11: one choice is the path <27, 28, 29, 30, 31, 32, 33>, and the other is the path <27, 28, 29, 34>.
The remaining du-paths for sales are
p13 = <27, 28, 29, 34>
p14 = <27, 28, 29, 34, 35, 36, 37>
p15 = <27, 28, 29, 34, 38>
Commission
In statements 29 through 41, the calculation of commission is controlled by ranges of the variable sales.
Statements 31 to 33 build up the value of commission by using the memory location to hold intermediate
values. This is a common programming practice, and it is desirable because it shows how the final value is
computed. (We could replace these lines with the single statement "commission := 220 + 0.20 * (sales − 1800),"
where 220 is the value of 0.10 * 1000 + 0.15 * 800, but this would be hard for a maintainer to understand.)
We decided to disallow du-paths from assignment statements like 31 and 32, so we will just consider du-
paths that begin with the three "real" defining nodes: DEF(commission, 33), DEF(commission, 37), and
DEF(commission, 39). Only one usage node is used: USE(commission, 41).
Definition: The set T satisfies the All-du-paths criterion for the program P if and only if for every variable
v ∈ V, T contains definition-clear paths from every defining node of v to every use of v and to the successor
node of each USE(v, n), and these paths are either single loop traversals or cycle free. These
test coverage metrics have several set-theory-based relationships, which are referred to as “subsumption” in
Rapps and Weyuker (1985). These relationships are shown in Figure 3.11. We now have a more refined
view of structural testing possibilities between the extremes of the (typically unattainable) All-Paths metric
and the generally accepted minimum, All-Edges.
Slice-Based Testing
Given a program P and a set V of variables in P, a slice on the variable set V at statement n, written S(V, n),
is the set of all statement fragments in P that contribute to the values of variables in V at node n. One
simplifying notion—the set V of variables consists of a single variable, v. Extending this to sets of more
than one variable is both obvious and cumbersome. For sets V with more than one variable, we just take the
union of all the slices on the individual variables of V. There are two basic questions about program slices,
whether they are backward or forward slices, and whether they are static or dynamic. Backward slices refer
to statement fragments that contribute to the value of v at statement n. Forward slices refer to all the program
statements that are affected by the value of v at statement n. This is one place where the define/use notions
are helpful.
Definition: Given a program P and a program graph G(P) in which statements and statement fragments
are numbered, and a set V of variables in P, the static, backward slice on the variable set V at statement
fragment n, written S(V, n), is the set of node numbers of all statement fragments in P that contribute to the
values of variables in V at statement fragment n.
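A hypothetical fragment (the numbers 1 to 7 are illustrative fragment numbers) showing a static backward slice:

#include <stdio.h>

int main(void) {
    int a, b, sum, prod;
    a = 2;                  /* 1 */
    b = 3;                  /* 2 */
    sum = 0;                /* 3: killed by the definition at 5 */
    prod = 1;               /* 4 */
    sum = a + b;            /* 5 */
    prod = a * b;           /* 6 */
    printf("%d\n", sum);    /* 7: usage (output) of sum */
    return 0;
}
/* S(sum, 7) = {1, 2, 5}: fragments 3, 4, and 6 do not contribute to
   the value of sum at fragment 7 (3 is killed by 5; 4 and 6 affect
   only prod), and 7 itself is a pure usage node, so it is excluded
   under the convention stated below. */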
The idea of program slicing is to separate a program into components that have some useful (functional)
meaning. Another refinement is whether or not a program slice is executable. Adding all the data declaration
statements and other syntactically necessary statements clearly increases the size of a slice, but the full
version can be compiled and separately executed and tested. Further, such compilable slices can be "spliced"
together (Gallagher and Lyle, 1991) as a bottom-up way to develop a program. As a test of clear diction,
Gallagher and Lyle suggest the term "slice splicing."
The notion of contribution is partially clarified by the predicate (P-use) and computation (C-use) usage
distinction of Rapps and Weyuker (1985), but we need to refine these forms of variable usage. Specifically,
the USE relationship pertains to five forms of usage:
P-use: used in a predicate (decision)
C-use: used in computation
O-use: used for output
L-use: used for location (pointers, subscripts)
I-use: iteration (internal counters, loop indices)
Most of the literature on program slices just uses P-uses and C-uses. While we are at it, we identify two
forms of definition nodes:
I-def: defined by input
A-def: defined by assignment
If statement fragment n is a defining node for v, then n is included in the slice. If statement fragment n is a
usage node for v, then n is not included in the slice. If a statement is both a defining and a usage node, then it
is included in the slice. In a static slice, P-uses and C-uses of other variables (not the v in the slice set V ) are
included to the extent that their execution affects the value of the variable v.
Example: The examples refer to the source code for the commission problem .There are 42 “interesting”
static backward slices in our example. They are named in Table 3.8. We will take a selective look at some
interesting slices. The first six slices are the simplest—they are the nodes where variables are initialized.
Fig 3.12: Full lattice on commission. Fig 3.13: Partial lattice of slices on commission.
Slice Splicing
The commission program is deliberately small, yet it suffices to illustrate the idea of “slice splicing.” In
Figures 3.14 through 3.17, the commission program is split into four slices. Slice 1 contains the input while
loop controlled by the locks variable. This is a good starting point because both Slice 2 and Slice 3 use the
loop to get input values for stocks and barrels, respectively. Slices 1, 2, and 3 each culminate in a value of
sales, which is the starting point for Slice 4, which computes the commission based on the value of sales.
This is overkill for this small example; however, the idea extends perfectly to larger programs. It also
illustrates the basis for program comprehension needed in software maintenance. Slices allow the
maintenance programmer to focus on the issues at hand and avoid the extraneous information that would be
in du-paths.
Fig 3.16: Slice 3