


Coverage Metrics for Requirements-Based Testing∗

Michael W. Whalen
Advanced Technology Center, Rockwell Collins Inc.
mwwhalen@rockwellcollins.com

Ajitha Rajan
Dept. of Comp. Sci. and Eng., University of Minnesota
arajan@cs.umn.edu

Mats P.E. Heimdahl
Dept. of Comp. Sci. and Eng., University of Minnesota
heimdahl@cs.umn.edu

Steven P. Miller
Advanced Technology Center, Rockwell Collins Inc.
spmiller@rockwellcollins.com
ABSTRACT

In black-box testing, one is interested in creating a suite of tests from requirements that adequately exercise the behavior of a software system without regard to the internal structure of the implementation. In current practice, the adequacy of black-box test suites is inferred by examining coverage on an executable artifact, either source code or a software model.

In this paper, we define structural coverage metrics directly on high-level formal software requirements. These metrics provide objective, implementation-independent measures of how well a black-box test suite exercises a set of requirements. We focus on structural coverage criteria on requirements formalized as LTL properties and discuss how they can be adapted to measure finite test cases. These criteria can also be used to automatically generate a requirements-based test suite. Unlike model- or code-derived test cases, these tests are immediately traceable to high-level requirements. To assess the practicality of our approach, we apply it on a realistic example from the avionics domain.

Categories and Subject Descriptors
D.2.4 [Software Engineering]: Software/Program Verification

General Terms
Verification

∗This work has been partially supported by NASA Langley's Aviation Safety and Security Program (AvSSP) under contract NCC-01001.

1. INTRODUCTION

In black-box testing, one is interested in creating a suite of tests from requirements that adequately exercise the behavior of a software system without regard to the internal structure of the implementation. This is the preferred style of testing in safety-critical domains and is advocated by standards such as DO-178B [27]. Currently, there is no objective standard for directly determining the adequacy of a black-box test suite given a set of requirements. Instead, the adequacy of such suites is inferred by examining different coverage metrics on an executable artifact, either source code [4, 5] or software models [1, 22].

There are several problems with using executable artifacts to measure the adequacy of black-box tests. First, it is an indirect measure: if an implementation is missing functionality, a weak set of black-box tests may yield structural coverage of the implementation and yet not expose defects of omission. Conversely, if black-box tests yield poor coverage of an implementation, an analyst must determine whether it is because (a) there are missing or implicit requirements, (b) there is code in the implementation that is not derived from the requirements, or (c) the set of tests derived from the requirements was inadequate. Finally, an executable artifact is necessary to measure the adequacy of the test suite. This may mean that the adequacy of a test suite cannot be determined until late in the development process.

Generally, requirements are defined informally as, for example, "shall" statements or use-cases. However, recent efforts (e.g., [19]) have formalized and verified software requirements using notations such as temporal logics [9] and synchronous observers [12]. Given formal requirements, it is possible to define meaningful coverage metrics directly on the structure of the requirements.

In this paper, we present coverage metrics on formal high-level software requirements. These metrics are desirable because they provide objective, implementation-independent measures of how well a black-box test suite exercises a set of requirements. Further, given a set of test cases that achieve a certain level of structural coverage of the high-level requirements, it is possible to measure model or code coverage to objectively assess whether the high-level requirements have been sufficiently defined for the system. This approach yields several objective measurements that are not possible
with traditional testing techniques, and integrates and cross-checks several of the validation and verification activities.

In addition, using the coverage metrics and an executable formal model, it is possible to autogenerate requirements-based test cases from properties. This idea has also been explored by Tan et al. [28], and in several efforts at the model and/or source code level [1, 24, 11]. The benefit of generating tests from properties, as opposed to models or code, is that the generated test cases can be immediately traced back to a high-level requirement of interest. This aspect may be helpful to satisfy the testing guidelines of rigorous development processes such as DO-178B [27].

The idea of property metrics has also been explored theoretically by Tan et al. [28], using metrics based on vacuity [3, 16, 21]. Our work provides new metrics which we believe to be more practical for measuring coverage given black-box tests, and discusses some of the implications of property-based testing. To evaluate the requirements coverage metrics and the practicality of autogenerating requirements-based tests, we have implemented a property-based test case generator and applied our techniques to a realistic system from the civil avionics domain.

The remainder of the paper is organized as follows. Section 2 presents a case example of the flight guidance system we use to illustrate our ideas. Our contributions to requirements-based testing and the test coverage criteria used are introduced in Section 3. Section 4 discusses how we generate requirements-based tests for the FGS example and the results obtained from running the tests. Related work is discussed in Section 5. Finally, Section 6 discusses the implications of the results and points to future work in requirements-based test case generation.

2. THE FLIGHT GUIDANCE SYSTEM

To illustrate our approach we will use an example from commercial avionics—a Flight Guidance System (FGS). A Flight Guidance System is a component of the overall Flight Control System (FCS) in a commercial aircraft. It compares the measured state of an aircraft (position, speed, and altitude) to the desired state and generates pitch and roll guidance commands to minimize the difference between the measured and desired state. The FGS consists of the mode logic, which determines which lateral and vertical modes of operation are active and armed at any given time, and the flight control laws that accept information about the aircraft's current and desired state and compute the pitch and roll guidance commands. In this paper we focus on the mode logic of the FGS. The requirements and implementation model used in this paper are described in [18] and are similar to production systems created by Rockwell Collins Inc.

3. REQUIREMENTS-BASED TESTING

There is a close relationship between the high-level requirements and the properties captured for verification purposes. As an example, consider the requirement from the sample FGS shown in Table 1 defining how the Flight Director (FD) is turned on by the Autopilot (AP). The property states that it is globally true (G) that if the Onside FD is not on and the AP is not engaged, then in the next instant in time (X), if the AP is engaged, the Onside FD will also be on. In [19], the full set of informal requirements for the FGS was translated into 293 properties in Linear Time Temporal Logic (LTL) [9] and verified over an implementation model of the mode logic using the NuSMV model checker. In most cases, as in Table 1, the LTL property is very similar in structure to the natural language requirement and the translation was straightforward.

    (a) "If the onside FD cues are off, the onside FD cues
         shall be displayed when the AP is engaged."

    (b) G((¬Onside_FD_On ∧ ¬Is_AP_Engaged) →
           X(Is_AP_Engaged → Onside_FD_On))

Table 1: (a) Sample high-level requirement on the FGS; (b) LTL property for the requirement

An analyst developing test cases from the informal requirements might derive the scenario in Table 2 to demonstrate that the requirement is met. Does this adequately cover this high-level requirement? Does passing such a test case indicate that the model has correctly captured the behavior required through this requirement? If not, what would additional test cases look like? The specification of the requirement as a property allows us to define several objective criteria with which to determine whether we have adequately tested the requirement. We hypothesize that coverage of such criteria can serve as a reliable measure of the thoroughness of the requirements-based testing activities.

    1. Turn the Onside FD off
    2. Disengage the AP
    3. Engage the AP
    4. Verify that the Onside FD comes on

Table 2: Manually developed requirements-based test scenario

To define suitable coverage metrics, it is possible to adapt several existing code and modeling coverage metrics, for example, [26, 23, 20, 11]. A distinction must be made, however, between coverage metrics over source code and coverage metrics over requirements. Metrics over code assume that Boolean expressions can take on both 'true' and 'false' values. When generating tests from requirements, we are usually interested in test cases exercising the different ways of satisfying a requirement (i.e., showing that it is true). Test cases that presume the requirement is 'false' are not particularly interesting; this is discussed in further detail in Section 3.1.2.

In this paper, we focus on structural coverage metrics over LTL. Nevertheless, there are several other notations that can be used to describe high-level requirements. For example, SCADE [29] and Reactis [25] use synchronous observers [12]. Synchronous observers are small specifications of high-level requirements written as state machines or in the same notation as the software specification, and they run "in parallel" with the model. The notion of requirements coverage can then be viewed as structural coverage over the observer. We will consider requirements formulated as synchronous observers, and the notion of requirements coverage applied to them, in our future work.
3.1 Structural Coverage over LTL Syntax

Structural coverage criteria defined for source code and for executable modeling languages can be adapted to fit propositional temporal logics such as CTL and LTL. For instance, decision coverage of the requirements would require a single test case demonstrating that the requirement is satisfied, such as the manually developed one in Table 2. Typically, this single positive test case is too weak a criterion, since it is often possible to derive a useless test case. If we again consider our sample requirement and formalization from Table 1, we can satisfy the decision coverage metric by creating a test case that leaves the autopilot disengaged throughout the test and disregards the behavior of the flight director. Although this test case technically satisfies the property, it does not shed much light on the correctness of our model.

A better alternative would be to adopt one of the more rigorous structural coverage criteria, such as vacuity coverage [3] or the Modified Condition/Decision Coverage (MC/DC) criterion [5, 13], for use in the requirements-based testing domain. In this section, we define a requirements coverage metric called Unique First Cause Coverage (UFC) that is adapted from the MC/DC criterion.

MC/DC is a structural coverage metric that is designed to demonstrate the independent effect of basic Boolean conditions (i.e., subexpressions with no logical operators) on the Boolean decision (expression) in which they occur. A test suite is said to satisfy MC/DC if executing the test cases in the test suite will guarantee that:

• every point of entry and exit in the model has been invoked at least once,

• every basic condition in a decision in the model has taken on all possible outcomes at least once, and

• each basic condition has been shown to independently affect the decision's outcome.

Note that each instance of a condition within a formula is treated separately: the formula ((A ∧ B) ∨ A) has three basic conditions, and we must determine the independence of each instance of A separately. In this paper we use masking MC/DC [13] to determine the independence of conditions. In masking MC/DC, a basic condition is masked if varying its value cannot affect the outcome of a decision due to the structure of the decision and the value of other conditions. To satisfy masking MC/DC for a basic condition, we must have test states in which the condition is not masked and takes on both 'true' and 'false' values.

3.1.1 MC/DC Coverage of Decisions

In masking MC/DC, the masking criteria are defined over Boolean operators (and, in the case of imperative programs, loops and selection statements). For example, given the expression A ∧ B, to show the independence of B, we must hold the value of A to true; otherwise varying B will not affect the outcome of the expression. When we consider decisions with multiple Boolean operators, we must ensure that the test results for one operator are not masked out by the behavior of other operators. For example, given A ∨ (B ∧ C), the tests for B ∧ C will not affect the outcome of the decision if A is true.

It is straightforward to describe the set of required MC/DC assignments for a decision as a set of Boolean expressions. Each expression is designed to show whether a particular condition positively or negatively affects the outcome of a decision. That is, if the expression is true, then the corresponding condition is guaranteed to affect the outcome of the decision. Given a decision A, we define A+ to be the set of expressions necessary to show that all of the conditions in A positively affect the outcome of A, and A− to be the set of expressions necessary to show that all of the conditions in A negatively affect the outcome of A.

We can define A+ and A− schematically over the structure of complex decisions as follows:

    x+ = {x}     (where x is a basic condition)
    x− = {¬x}    (where x is a basic condition)

The positive and negative test cases for conditions are simply the singleton sets containing the condition and its negation, respectively.

    (A ∧ B)+ = {a ∧ B | a ∈ A+} ∪ {A ∧ b | b ∈ B+}
    (A ∧ B)− = {a ∧ B | a ∈ A−} ∪ {A ∧ b | b ∈ B−}

To get positive MC/DC coverage of A ∧ B, we need to make sure that every element in A+ uniquely contributes to making A ∧ B true while holding B true, and symmetrically for the elements of B+. The argument for negative MC/DC coverage is the same, except we show that A ∧ B is false by choosing elements of A− and B−.

    (A ∨ B)+ = {a ∧ ¬B | a ∈ A+} ∪ {¬A ∧ b | b ∈ B+}
    (A ∨ B)− = {a ∧ ¬B | a ∈ A−} ∪ {¬A ∧ b | b ∈ B−}

To get positive (negative) MC/DC coverage over A ∨ B, we need to make sure that every element in A+ (A−) uniquely contributes to making A ∨ B true (false) while holding B false, with the symmetric argument for elements of B+ (B−).

    (¬A)+ = A−
    (¬A)− = A+

The positive and negative MC/DC coverage sets for ¬A swap the positive and negative obligations for A.

Each of the expressions in the positive and negative sets can be seen as defining a constraint over a program or model state. The process of satisfying MC/DC involves determining whether each of these constraints is satisfied by some state that is reached by a test within a test suite.
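For illustration, this construction can be transcribed directly into executable form. The following Python sketch is an illustrative rendering, not the authors' implementation; the tuple-based expression encoding and function names are assumptions made here for presentation, and ¬, ∧, ∨ are rendered as !, &, | so the output matches NuSMV-style syntax used later:

    # Sketch of the A+/A- sets for Boolean decisions. An expression is a
    # condition name (str) or a tuple: ('not', A), ('and', A, B), ('or', A, B).

    def show(e):
        """Render an expression AST as a formula string."""
        if isinstance(e, str):
            return e
        if e[0] == 'not':
            return '!' + show(e[1])
        op = ' & ' if e[0] == 'and' else ' | '
        return '(' + show(e[1]) + op + show(e[2]) + ')'

    def pos(e):
        """A+: one constraint per condition occurrence, showing that it
        positively affects the outcome of e."""
        if isinstance(e, str):
            return [e]                                     # x+ = {x}
        if e[0] == 'not':
            return neg(e[1])                               # (!A)+ = A-
        A, B = e[1], e[2]
        if e[0] == 'and':                                  # (A & B)+
            return ([f'({a} & {show(B)})' for a in pos(A)] +
                    [f'({show(A)} & {b})' for b in pos(B)])
        return ([f'({a} & !{show(B)})' for a in pos(A)] +  # (A | B)+
                [f'(!{show(A)} & {b})' for b in pos(B)])

    def neg(e):
        """A-: the corresponding negative-affect constraints."""
        if isinstance(e, str):
            return ['!' + e]                               # x- = {!x}
        if e[0] == 'not':
            return pos(e[1])                               # (!A)- = A+
        A, B = e[1], e[2]
        if e[0] == 'and':                                  # (A & B)-
            return ([f'({a} & {show(B)})' for a in neg(A)] +
                    [f'({show(A)} & {b})' for b in neg(B)])
        return ([f'({a} & !{show(B)})' for a in neg(A)] +  # (A | B)-
                [f'(!{show(A)} & {b})' for b in neg(B)])

    # Example: the decision A | (B & C) from the masking discussion above.
    print(pos(('or', 'A', ('and', 'B', 'C'))))
    # ['(A & !(B & C))', '(!A & (B & C))', '(!A & (B & C))']

Note that B and C produce the same constraint twice: each condition occurrence receives its own entry, mirroring the per-instance treatment of conditions described above.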
3.1.2 Unique-First-Cause (UFC) Coverage

Since requirements captured as LTL properties define paths rather than states, we broaden our view of structural coverage to accommodate satisfying paths rather than satisfying states. The idea is to measure whether we have sufficient tests to show that all atomic conditions within the property affect the outcome of the property. We can define these test paths by extending the constraints for state-based MC/DC to include temporal operators. These operators describe the path constraints required to reach an acceptable state. The idea is to characterize a trace π = s0 → s1 → ... in which the formula holds for states s0 ... s(k−1), then passes through state sk, in which the truth or falsehood of the formula is determined by the atomic condition of interest. For satisfying traces, we require that the formula continue to hold thereafter.
A test suite is said to satisfy UFC coverage over a set of LTL formulas if executing the test cases in the test suite will guarantee that:

• every basic condition in a formula has taken on all possible outcomes at least once, and

• each basic condition has been shown to independently affect the formula's outcome.

We define independence in terms of the shortest satisfying path for the formula. Thus, if we have a formula A and a path π, an atom α in A is the unique first cause if, in the first state along π in which A is satisfied, it is satisfied because of atom α. To make this notion concrete, suppose we have the formula F(a ∨ b) and a path P = s0 → s1 → ... in which a was initially true in step s2 and b was true in step s5. For path P, a (but not b) would satisfy the unique first cause obligation. It is possible to generalize this definition to unique cause, which states that an atom α is necessary for the satisfaction of A, but space concerns prevent us from covering both formulations in this paper.

The definition above takes the evenhanded view that all formulas will be satisfiable, that is, there are some traces in which the property is true and some in which it is false. Nevertheless, we eventually want a system in which all of the properties (requirements) are valid. Therefore, we are primarily concerned with the positive set of test cases for a formula. It is necessary, however, to define the negative UFC constraint sets, since when an LTL formula, say f, is negated, the positive UFC constraints of ¬f are evaluated as the negative UFC constraints of f.

For the formalization of UFC coverage of requirements expressed as LTL properties, we will use the notational conventions that were defined above for Boolean expressions and extend them to include temporal operators in LTL. We extend A+ and A−, defined over states, to define satisfying paths over LTL temporal operators as follows:

    G(A)+ = {A U (a ∧ G(A)) | a ∈ A+}
    F(A)− = {¬A U (a ∧ G(¬A)) | a ∈ A−}

G(A) is true if A is true along all states within a path. The A U (a ∧ G(A)) formula ensures that each element a in A+ contributes to making A true at some state along a path in which A is globally true. F(A)− is the dual of G(A)+, so the obligations match after negating A and a.

    F(A)+ = {¬A U a | a ∈ A+}
    G(A)− = {A U a | a ∈ A−}

The independent effect of a ∈ A+ for the F(A) formula is demonstrated by showing that it is the first cause for A to be satisfied. Similar to the previous definition, G(A)− is the dual of F(A)+.

    X(A)+ = {X(a) | a ∈ A+}
    X(A)− = {X(a) | a ∈ A−}

The independent effects of a ∈ A+ (resp. a ∈ A−) are demonstrated by showing that they affect the formula in the next state.

    (A U B)+ = {(A ∧ ¬B) U ((a ∧ ¬B) ∧ (A U B)) | a ∈ A+} ∪
               {(A ∧ ¬B) U b | b ∈ B+}

For the formula A U B to hold, A must hold in all states until we reach a state where B holds. Therefore, positive UFC coverage for this formula means we have to ensure that every element in A+ contributes to making A true along the path, and every element in B+ contributes to completing the formula.

The formula to the left of the union provides positive UFC over A in (A U B). Recall that an 'until' formula is immediately satisfied if B holds. Therefore, in order to show that some specific atom in A (isolated by the formula a) affects the outcome, we need to show that this atom is necessary before B holds. This is accomplished by describing the prefix of the path in which a affects the outcome as (A ∧ ¬B) U (a ∧ ¬B). In order to ensure that our prefix is sound, we still want B to eventually hold, so we add (A U B) to complete the formula.

The formula on the right of the union is similar and asserts that some b ∈ B is the unique first cause for B to be satisfied.

    (A U B)− = {(A ∧ ¬B) U (a ∧ ¬B) | a ∈ A−} ∪
               {(A ∧ ¬B) U (b ∧ ¬(A U B)) | b ∈ B−}

For the formula A U B to be falsified, there is some state in which A is false before B is true. The formula to the left of the union demonstrates that a ∈ A− uniquely contributes to the falsehood of the formula by describing a path in which A holds (and B does not — otherwise the formula A U B would be true) until a state in which both a and B are false.

The formula to the right demonstrates that b ∈ B− uniquely contributes to the falsehood of the formula by describing a path in which A U B eventually fails, but A holds long enough to contain a state b falsifying B.
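These path rules transcribe just as directly. The sketch below extends the illustrative Python from Section 3.1.1 (again our rendering, not the authors' tool) and assumes, as holds for the FGS property set, that Boolean structure occurs only beneath the temporal operators; it emits the positive UFC obligations as LTL strings:

    # Positive UFC obligations for the temporal operators, built on the
    # pos/neg sketch of Section 3.1.1. Temporal AST nodes: ('G', A),
    # ('F', A), ('X', A), ('U', A, B).

    TEMPORAL = ('G', 'F', 'X', 'U')

    def show_ltl(e):
        """Render an LTL AST, deferring Boolean nodes to show()."""
        if isinstance(e, str) or e[0] not in TEMPORAL:
            return show(e)
        if e[0] == 'U':
            return f'({show_ltl(e[1])} U {show_ltl(e[2])})'
        return f'{e[0]}({show_ltl(e[1])})'

    def ufc_pos(e):
        """The positive UFC obligation set for formula e, as LTL strings."""
        if isinstance(e, str) or e[0] not in TEMPORAL:
            return pos(e)                           # state layer (Sec. 3.1.1)
        if e[0] == 'G':                             # G(A)+ = {A U (a & G(A))}
            A = show_ltl(e[1])
            return [f'({A} U ({a} & G({A})))' for a in ufc_pos(e[1])]
        if e[0] == 'F':                             # F(A)+ = {!A U a}
            A = show_ltl(e[1])
            return [f'(!{A} U {a})' for a in ufc_pos(e[1])]
        if e[0] == 'X':                             # X(A)+ = {X(a)}
            return [f'X({a})' for a in ufc_pos(e[1])]
        A, B = show_ltl(e[1]), show_ltl(e[2])       # (A U B)+, both branches
        return ([f'(({A} & !{B}) U (({a} & !{B}) & ({A} U {B})))'
                 for a in ufc_pos(e[1])] +
                [f'(({A} & !{B}) U {b})' for b in ufc_pos(e[2])])

    # Running example: the formula ((a | b) U c) used in Section 3.1.3.
    for ob in ufc_pos(('U', ('or', 'a', 'b'), 'c')):
        print(ob)

Because the Boolean layer isolates each atom with its masking constraint (e.g., (a & !b) rather than a alone), the printed obligations are syntactically slightly stronger than the hand-simplified versions shown later in Table 5; the conditions covered are the same.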
The formulations above define a rigorous notion of requirements coverage over execution traces. Since we have been working with linear time temporal logic, the definitions apply over infinite traces. Test cases, however, are by necessity finite; therefore, the notion of requirements coverage must be adapted to apply to finite traces.

3.1.3 Adapting Formulas to Finite Tests

LTL is normally formulated over infinite paths, while test cases correspond to finite paths. Nevertheless, the notion of coverage defined in the previous section can be straightforwardly adapted to consider finite paths as well. There is a growing body of research on using LTL formulas as monitors during testing [17, 10, 2], and we can adapt these ideas to check whether a test suite has sufficiently covered a property.

Manna and Pnueli [17] define LTL over incomplete models, that is, models in which some states do not have successor states. In this work, the operators are given a best-effort semantics: a formula holds if all evidence along the finite path supports the truth of the formula. The most significant consequence of this formulation is that the next operator (X) is split into two operators, X! and X, which are the strong and weak next state operators, respectively. The strong operator is always false on the last state in a finite path, while the weak operator is always true.

It is straightforward to define a formal semantics for LTL over finite paths. We assume that a state is a labeling L : V → {T, F} for a finite set of variables V, and that a finite path π of length k is a sequence of states s1 → s2 → ... → sk.
We write πⁱ for the suffix of π starting with state si, and the length of the path as |π|. Given these definitions, the formal semantics of LTL over finite paths is defined in Table 3.

    1.  π ⊨ true
    2.  π ⊨ p        iff |π| > 0 and p ∈ L(s1)
    3.  π ⊨ ¬A       iff π ⊭ A
    4.  π ⊨ (A ∧ B)  iff π ⊨ A and π ⊨ B
    5.  π ⊨ (A ∨ B)  iff π ⊨ A or π ⊨ B
    6.  π ⊨ X(A)     iff |π| ≤ 1 or π² ⊨ A
    7.  π ⊨ X!(A)    iff |π| > 1 and π² ⊨ A
    8.  π ⊨ G(A)     iff ∀ 1 ≤ i ≤ |π|, πⁱ ⊨ A
    9.  π ⊨ F(A)     iff for some 1 ≤ i ≤ |π|, πⁱ ⊨ A
    10. π ⊨ A U B    iff there is some 1 ≤ i ≤ |π| where πⁱ ⊨ B
                         and ∀ j = 1..(i−1), πʲ ⊨ A

Table 3: Semantics of LTL over Finite Paths

As expected, these definitions correspond with the standard semantics except that they do not require that G properties hold infinitely (only over the length of the finite path), and do not require X properties to hold in the last state of a finite path.
The semantics in Table 3 are sensible and easy to understand, but may be too strong for measuring test coverage. We may want to consider tests that show the independence of one of the atoms even if they are "too short" to discharge all of the temporal logic obligations for the original property. For example, consider the formula

    ((a ∨ b) U c)

and the test cases in Table 4.

    Test 1:
    Atom   Step 1   Step 2   Step 3
    a      t        t        f
    b      t        f        t
    c      f        f        f

    Test 2:
    Atom   Step 1   Step 2
    a      t        t
    b      t        f
    c      f        t

Table 4: Test Suite for Property ((a ∨ b) U c)

Are these two test cases sufficient to show the independent effects of a, b, and c? From one perspective, test 1 is (potentially) a prefix of a path that satisfies ((a ∨ b) U c) and independently shows that a and b affect the outcome of the formula; the test case illustrates that the formula holds with only a or only b being true. Test 2 shows the independent effect of c. From another perspective (the perspective of the finite semantics described above), test 1 does not satisfy the formula (since the finite semantics in Table 3 requires that for the until formula to hold, c must become true in the path), so it cannot be used to show the independent effect of any of the atoms.
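The disagreement between the two perspectives is easy to reproduce mechanically. The sketch below (our illustration; a path is a list of variable-to-Boolean dictionaries) transcribes the neutral semantics of Table 3 and confirms both verdicts for the Table 4 suite:

    # Direct transcription of the Table 3 (neutral) semantics.
    # ('X', A) is the weak next operator, ('X!', A) the strong one.

    def holds(path, e):
        if e is True:
            return True                                      # rule 1
        if isinstance(e, str):
            return len(path) > 0 and path[0][e]              # rule 2
        op = e[0]
        if op == 'not':
            return not holds(path, e[1])                     # rule 3
        if op == 'and':
            return holds(path, e[1]) and holds(path, e[2])   # rule 4
        if op == 'or':
            return holds(path, e[1]) or holds(path, e[2])    # rule 5
        if op == 'X':
            return len(path) <= 1 or holds(path[1:], e[1])   # rule 6
        if op == 'X!':
            return len(path) > 1 and holds(path[1:], e[1])   # rule 7
        if op == 'G':                                        # rule 8
            return all(holds(path[i:], e[1]) for i in range(len(path)))
        if op == 'F':                                        # rule 9
            return any(holds(path[i:], e[1]) for i in range(len(path)))
        if op == 'U':                                        # rule 10
            return any(holds(path[i:], e[2]) and
                       all(holds(path[j:], e[1]) for j in range(i))
                       for i in range(len(path)))
        raise ValueError(op)

    f = ('U', ('or', 'a', 'b'), 'c')
    test1 = [dict(a=True, b=True, c=False), dict(a=True, b=False, c=False),
             dict(a=False, b=True, c=False)]
    test2 = [dict(a=True, b=True, c=False), dict(a=True, b=False, c=True)]
    print(holds(test1, f), holds(test2, f))   # False True

Test 1 fails under the neutral rules even though it is a plausible prefix of a satisfying path; this is exactly the motivation for the weakened obligations developed below.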
The issue with these tests (and with finite paths in general) is that there may be doubt as to whether the property as a whole will hold. This issue is explored in [10], which defines three different semantics for temporal operators: weak, neutral, and strong. The neutral semantics are the semantics of [17] described in Table 3. The weak semantics do not require eventualities (F and the right side of U) to hold along a finite path, and so describe prefixes of paths that may satisfy the formula as a whole. The strong semantics always fail on G operators, and therefore disallow finite paths if there is any doubt as to whether the stated formula is satisfied.¹

Since we believe that the test cases in Table 4 adequately illustrate the independence of a and b, we slightly weaken our LTL obligations. Given a formula f, we are interested in a prefix of an accepting path for f that is long enough to demonstrate the independence of our condition of interest. Thus, we want the operators leading to this demonstration state to be neutral, but the operators afterwards can be weak.

The strong and weak semantics are a coupled dual pair because the negation operator switches between them. In [10], the semantics are provided as variant re-formulations of the neutral semantics. However, they can also be described as syntactic transformations of neutral formulas that can then be checked using the neutral semantics. We define weak[F] to be the weakening of a formula F and strong[F] to be the strengthening of formula F. The transformations weak and strong are defined in Table 6. We refer the reader to [10] for a full description of the three semantics and their effect on provability within the defined logic.

Given these transformations, we can re-formulate the necessary UFC paths in LTL. The idea is that we want a prefix of a satisfying path that conclusively demonstrates that a particular condition affects the outcome of the formula. To create such a prefix, we want a neutral formula up to the state that demonstrates the atomic condition, and a weak formula thereafter. The modified formulas defining UFC over finite prefixes are shown in Table 7.

The only formulas that are changed in Table 7 from the original formulation in Section 3.1.2 are G(A)+, F(A)−, one branch of (A U B)+, and one branch of (A U B)−. These are the formulas that have additional obligations to match a prefix of an accepting path after showing how the focus condition affects the path.

3.1.4 Discussion

It is possible to take two views when measuring the coverage of requirements from a given test suite. The first perspective states that each test case must have sufficient evidence to demonstrate that the formula of interest is true and that a condition of interest affects the outcome of the formula. This perspective can be achieved using the neutral finite LTL rules and our original property formulation. The second perspective states that each test case is a prefix of an accepting path for the formula of interest and that the condition of interest affects the outcome of the formula. This perspective can be achieved using the weakened UFC obligations shown in Table 7.

Making this discussion concrete, given our example formula

    f = ((a ∨ b) U c)

the UFC obligations for the original and modified rules are shown in Table 5. In the original formulation, the first two obligations are not satisfied by the test suite in Table 4, because c never becomes true in the first test case. In the weakened formulation, however, the requirement is covered, because the first test case is a potential prefix of an accepting path.

¹ The strong semantics are too strong – any property containing a G-operator will be disproved.
    1.  weak[true]  = true                    12. strong[true]  = true
    2.  weak[p]     = p                       13. strong[p]     = p
    3.  weak[¬A]    = ¬strong[A]              14. strong[¬A]    = ¬weak[A]
    4.  weak[A ∧ B] = weak[A] ∧ weak[B]       15. strong[A ∧ B] = strong[A] ∧ strong[B]
    5.  weak[A ∨ B] = weak[A] ∨ weak[B]       16. strong[A ∨ B] = strong[A] ∨ strong[B]
    6.  weak[X!(A)] = X(weak[A])              17. strong[X!(A)] = X!(strong[A])
    7.  weak[X(A)]  = X(weak[A])              18. strong[X(A)]  = X!(strong[A])
    8.  weak[G(A)]  = G(weak[A])              19. strong[G(A)]  = false
    9.  weak[F(A)]  = true                    20. strong[F(A)]  = F(strong[A])
    10. weak[A U B] = weak[A] W² weak[B]      21. strong[A U B] = strong[A] U strong[B]
    11. weak[A W B] = weak[A] W weak[B]       22. strong[A W B] = strong[A] U strong[B]

Table 6: Definitions of weak and strong LTL transformations

² W is the Weak Until operator, defined in LTL as p W q ≡ (p U q) ∨ G(p).
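Because Table 6 is purely syntactic, it can be applied mechanically. The sketch below (our illustrative rendering on the same tuple AST, with ('W', A, B) for weak until and Python True/False for the constants) implements the two mutually recursive rewrites:

    # The weak[.] and strong[.] transformations of Table 6.

    def weak(e):
        if isinstance(e, (bool, str)):
            return e                                 # rules 1-2
        op = e[0]
        if op == 'not':
            return ('not', strong(e[1]))             # rule 3
        if op in ('and', 'or'):
            return (op, weak(e[1]), weak(e[2]))      # rules 4-5
        if op in ('X', 'X!'):
            return ('X', weak(e[1]))                 # rules 6-7: next goes weak
        if op == 'G':
            return ('G', weak(e[1]))                 # rule 8
        if op == 'F':
            return True                              # rule 9
        return ('W', weak(e[1]), weak(e[2]))         # rules 10-11: U and W

    def strong(e):
        if isinstance(e, (bool, str)):
            return e                                 # rules 12-13
        op = e[0]
        if op == 'not':
            return ('not', weak(e[1]))               # rule 14
        if op in ('and', 'or'):
            return (op, strong(e[1]), strong(e[2]))  # rules 15-16
        if op in ('X', 'X!'):
            return ('X!', strong(e[1]))              # rules 17-18: next goes strong
        if op == 'G':
            return False                             # rule 19
        if op == 'F':
            return ('F', strong(e[1]))               # rule 20
        return ('U', strong(e[1]), strong(e[2]))     # rules 21-22: U and W

    # weak[(a | b) U c] = (a | b) W c, as used in the Table 7 obligations:
    print(weak(('U', ('or', 'a', 'b'), 'c')))   # ('W', ('or', 'a', 'b'), 'c')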

    G(A)+    = {A U (a ∧ weak[G(A)]) | a ∈ A+}
    G(A)−    = {A U a | a ∈ A−}
    F(A)+    = {¬A U a | a ∈ A+}
    F(A)−    = {¬A U (a ∧ weak[G(¬A)]) | a ∈ A−}
    (A U B)+ = {(A ∧ ¬B) U ((a ∧ ¬B) ∧ weak[A U B]) | a ∈ A+} ∪
               {(A ∧ ¬B) U b | b ∈ B+}
    (A U B)− = {(A ∧ ¬B) U (a ∧ ¬B) | a ∈ A−} ∪
               {(A ∧ ¬B) U (b ∧ ¬weak[A U B]) | b ∈ B−}
    X(A)+    = {X(a) | a ∈ A+}
    X(A)−    = {X(a) | a ∈ A−}

Table 7: Weakened UFC LTL Formulas describing Accepting Prefixes

Original formulation:
    { ((a ∨ b) ∧ ¬c) U (a ∧ ¬c ∧ ((a ∨ b) U c)),
      ((a ∨ b) ∧ ¬c) U (b ∧ ¬c ∧ ((a ∨ b) U c)),
      ((a ∨ b) ∧ ¬c) U c }

Weakened formulation:
    { ((a ∨ b) ∧ ¬c) U (a ∧ ¬c ∧ ((a ∨ b) W c)),
      ((a ∨ b) ∧ ¬c) U (b ∧ ¬c ∧ ((a ∨ b) W c)),
      ((a ∨ b) ∧ ¬c) U c }

Table 5: UFC obligations for ((a ∨ b) U c)

3.2 Automatically Generating Requirements-Based Tests

Several research efforts have developed techniques for automatic generation of tests from formal models using model checkers as test case generation tools [22, 23, 20, 11]. Model checkers build a finite state transition system and exhaustively explore the reachable state space searching for violations of the properties under investigation [9]. Should a property violation be detected, the model checker will produce a counterexample illustrating how this violation can take place. In short, a counterexample is a sequence of inputs that will take the finite state model from its initial state to a state where the violation occurs.

One way to use a model checker to find test cases is by formulating a test criterion as a verification condition for the model checker. In the previous section, we described UFC over paths and defined sets of LTL formulas that were sufficient to show that a particular atomic condition affects the outcome of the property. Given this set and a formal model of the software system, we can now challenge the model checker to find a way of generating a path satisfying one of these formulas by asserting that there is no such path (i.e., negating the formula). We call such a formula a trap formula or trap property [11]. The model checker will now search for a counterexample demonstrating that this trap property is, in fact, satisfiable; such a counterexample constitutes a test case that will show the UFC obligation of interest over the model. By repeating this process for all formulas within the set derived from a property, we can derive UFC coverage of the property over the model. By performing this process on all requirements properties of interest, we can derive a test suite that covers the UFC set of the requirements. We illustrate this process in the case study in the following section.
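Concretely, in our illustrative sketch the step from obligation to trap property is a single negation; the LTLSPEC syntax is standard NuSMV, while the helper name is our own:

    # Each UFC obligation is negated to form a NuSMV trap property. A
    # counterexample to the trap property is a test case exhibiting the
    # obligation; a trap property that proves valid indicates vacuity.

    def trap_properties(requirement_ast):
        return [f'LTLSPEC !{ob}' for ob in ufc_pos(requirement_ast)]

    for spec in trap_properties(('U', ('or', 'a', 'b'), 'c')):
        print(spec)

Each printed LTLSPEC line is checked against the model; any counterexample trace the model checker produces is then translated into one requirements-based test case.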
4. CASE STUDY

To assess the feasibility of auto-generating tests from requirements on a moderately-sized example, we generated a set of requirements-based tests from the FGS case example described in Section 2. The test cases were generated in three ways. First, we simply negated the formal requirements (LTL properties) and provided them as trap properties to the model checker; the resulting test suite provides one test case for each requirement. Second, we generated trap properties for what we call requirements antecedent coverage. Requirements antecedent coverage ensures that in requirements of the form G(A → B) the antecedent becomes true at least once along a satisfying path, so that the requirement is not vacuously satisfied (antecedent false throughout). Third, we generated trap properties for the positive UFC (Unique First Cause) obligations discussed in Section 3.1 and used the model checker to generate UFC-adequate tests over the requirements. In this initial experiment we were interested in determining (1) the feasibility of generating such tests with a model checker, (2) the number of test cases needed to provide requirements UFC coverage for a substantial and realistic example, and (3) what coverage of the model these test sets would provide.
To provide realistic results, we conducted the case study using the requirements and model of the close-to-production flight guidance system introduced earlier in the paper. This example consists of 293 informal requirements formalized as LTL properties, as well as a formal model captured in our research notation RSML-e [30]. All the properties in the FGS system are safety properties; there are no liveness properties.

4.1 Setup

The case study followed three major steps:

Trap Property Generation: We started with the 293 requirements of the FGS expressed formally as LTL properties. We generated three sets of trap properties from these formal LTL properties.

First, we generated tests to provide requirements coverage: one test case per requirement, illustrating one way in which the requirement is met. We obtained these test cases by simply negating each requirement captured as an LTL property and challenging the model checker to find a test case.

Second, we generated tests to provide requirements antecedent coverage. Consider the requirement

    G(A → B)

Informally, it is always the case that when A holds, B will hold. A test case providing requirements antecedent coverage over such a requirement will ensure that the antecedent A becomes true at least once along the satisfying path. That is, the test case would satisfy the obligation

    G(A → B) ∧ F(A)

We implemented a transformation pass over the LTL specifications so that for requirements of the form G(A → B) we would generate trap properties requiring A to hold somewhere along the path.
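As a sketch (the helper name is ours), the antecedent-coverage trap property is simply the negated obligation, so that any counterexample found by the model checker both satisfies the requirement and exercises its antecedent:

    # Antecedent-coverage trap property for requirements of the form
    # G(A -> B): a counterexample to the negation is a trace satisfying
    # G(A -> B) & F(A), i.e., one that does not vacuously satisfy the
    # requirement.

    def antecedent_trap(A: str, B: str) -> str:
        return f'LTLSPEC !(G({A} -> {B}) & F({A}))'

    print(antecedent_trap('a', 'X b'))   # LTLSPEC !(G(a -> X b) & F(a))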
Third, we generated tests to provide requirements UFC coverage over the syntax (or structure) of the required LTL properties. The rules for performing UFC over temporal operators were explained in Section 3.1. Using these rules, we implemented a transformation pass over the LTL specifications to generate trap properties for both the neutral and weakened UFC notions discussed in Section 3.1 (we used the same implementation to generate tests for requirements coverage and requirements antecedent coverage, mentioned above). However, both the neutral and weakened notions of UFC result in the same test suite for this case example, since the LTL property set for the FGS system has no 'future' and 'until' temporal operators. We only generated the positive UFC set over the temporal properties, since each of the properties is known to hold of the model.

Although the properties were already known to hold of the model, generating tests is still a useful exercise. For a developer, it provides a rich source of requirements-derived tests that can be used to test the behavior of the object code. For the purposes of this paper, it provides a straightforward way to test the fault finding capability and completeness of our metrics.

Test Suite Generation: To generate the test cases we leveraged a test case generation environment that was built in a previous project [15]. This environment uses the bounded model checker in NuSMV for test case generation. We automatically translate the FGS model to the input language of NuSMV, generate all needed trap properties, and transform the NuSMV counterexamples to input scripts for the Nimbus RSML-e simulator so that the tests can be run on the model under investigation. Previously, we had enhanced the Nimbus framework with a tool to measure different kinds of coverage over RSML-e models [14]. Therefore, we could run the test cases on the RSML-e models and measure the resulting coverage.

Coverage Measurement: We measured coverage over the FGS model in RSML-e. We ran all three test suites (requirements coverage, requirements antecedent coverage, and requirements UFC coverage) over the model and recorded the model coverage obtained by each. We measured State coverage, Transition coverage, Decision coverage, and MC/DC coverage.

• State Coverage: (Often referred to as variable domain coverage.) Requires that the test set has test cases that enable each control variable (Boolean or enumerated variable) defined in the model to take on all possible values in its domain at least once.

• Transition Coverage: Analogous to the notion of branch coverage in code; requires that the test set has test cases that exercise every transition definition in the model at least once.

• Decision Coverage: Each decision occurring in the model evaluates to true at some point in some test case and evaluates to false at some point in some other test case.

• Modified Condition and Decision Coverage (MC/DC): Every condition within the decision has taken on all possible outcomes at least once, and every condition has been shown to independently affect the decision's outcome.

4.2 Results and Analysis

We used our tools to automatically generate and run three test suites for the FGS: one suite for requirements coverage, one for requirements antecedent coverage, and another one for requirements UFC coverage. The results from our experiment are summarized in Tables 8 and 9.

                      Requirements   Antecedent   UFC
                      coverage       coverage     coverage
    Trap Properties   293            293          887
    Test Cases        293            293          715
    Time Expended     3 min          14 min       35 min

Table 8: Summary of the test case generation results.

Table 8 shows the number of test cases in each test suite and the time it took to generate them. It is evident from the table that UFC coverage required roughly three times as many trap properties, and yielded nearly two and a half times as many test cases, as the requirements coverage and requirements antecedent coverage suites; it can therefore be expected to provide better coverage of the model than the other two test suites. Also, the time expended in generating the UFC coverage test suite was significantly higher than the time necessary to generate the other two test suites.

We observe that our algorithm generating UFC over the syntax of the requirements generated 887 trap properties.
Nevertheless, only 715 of them generated counterexamples.³ For the remaining 172 properties, the UFC trap property is valid, which means that the condition it is designed to test does not uniquely affect the outcome of the property. In each of these cases the original formula was vacuous [3]; that is, the atomic condition was not required to prove the original formula. We discuss the issue of vacuity checking further in Section 5.

Although it was possible to explain many of the vacuous conditions through implementation choices that satisfied stronger claims than the original properties required, the number of vacuous conditions was startling and pointed out several previously unknown weaknesses in our original property set. Rather than correcting the incorrectly vacuous formulas in our property set before proceeding with our experiment, we decided to use the property set "as-is" as a representative set of requirements that might be provided before a model is constructed. If test cases were manually constructed from this set of requirements, we postulate that many of these weaknesses would be found when trying to construct test cases for the vacuous conditions.

For all three coverage metrics mentioned, we did not minimize the size of the test suites generated. In previous work we found that test suite reduction while maintaining a desired coverage can adversely affect fault finding [14]. However, the work in [14] was based on test suites generated using model coverage criteria. We plan to explore the effect of test suite reduction techniques on test suites generated using requirements coverage criteria in our future work.

    Model        Requirements   Antecedent   UFC
    Coverage     coverage       coverage     coverage
    Metric
    State        37.19%         98.78%       99.12%
    Transition   31.97%         89.53%       99.42%
    Decision     46.42%         85.75%       83.02%
    MC/DC        0.32%          23.87%       53.53%

Table 9: Summary of the model coverage obtained by running the requirements-based tests.

In Table 9 we show the coverage of the formal model achieved when running the test cases providing requirements coverage, requirements antecedent coverage, and requirements UFC coverage, respectively. We measured the four model coverage criteria described in Section 4.1.

The results in Table 9 show that the test suite generated to provide requirements coverage (one test case per requirement) gives very low state, transition, and decision coverage, and almost no MC/DC coverage. This is in part due to the structure of the properties. Most of the requirements in the FGS system are of the form

    G(a → X b)

The test cases found for such properties are generally those in which the model goes from the initial state to a state where a is false, thus trivially satisfying the requirement. Such test cases exercise a very small portion of the model, and the resultant poor model coverage is not at all surprising.

The requirements antecedent coverage is a stronger metric than the requirements coverage measure. It gives high state, transition, and decision coverage over the model. However, the MC/DC coverage generated over the model is low. As mentioned earlier, many of the requirements in the FGS system are of the form G(a → X b). Requirements antecedent coverage ensures that the test cases found for such properties will exercise the antecedent a (i.e., make a true). Therefore, the requirement is not trivially satisfied, and we get longer test cases and, thus, better model coverage than with the requirements coverage test suite.

On the other hand, the test suite generated for UFC over the syntax of the properties provides high state, transition, and decision coverage. Nevertheless, the decision coverage provided by the UFC test suite is in this experiment lower than that provided by the requirements antecedent coverage test suite. When we looked more closely at both test suites, we found that for variables specified in the property (we call these variables of interest), both test suites had the same values. In other words, for the variables of interest, the UFC test suite is a superset of the requirements antecedent coverage test suite. However, for variables not mentioned in the properties (we call these free variables), the model checker has the freedom to choose any suitable value. We found that the values for these free variables differed between the two test suites. We believe that this is the reason the antecedent coverage test suite generated higher decision coverage over the model than the UFC test suite: the model checking algorithm in NuSMV simply picked values for the free variables that happened to give the requirements antecedent test suite better coverage. If we could control the selection of the free variables, the UFC test suite would yield the same or higher decision coverage than the requirements antecedent coverage test suite. Clearly, the test case generation method plays an important role in the types of test cases generated. We plan to explore the effect of different test case generation methods in our future work.

The UFC test suite generated low MC/DC coverage over the model, although not as low as the other two test suites. After some thought, it became clear that this is due in part to the structure of the requirements defined for the FGS. Consider the requirement

    "When the FGS is in independent mode, it shall be active"

This was formalized as a property as follows:

    G(m_Independent_Mode_Condition.result →
      X(Is_This_Side_Active = 1))

Informally, it is always the case (G) that if the condition for being in independent mode is true, in the next state the FGS will be active. Note here that the condition determining whether the FGS is to be in independent mode is abstracted to a macro (Boolean function) returning a result (the .result on the left hand side of the implication). Many requirements for the FGS were of that general structure:

    G(Macro_name.result → X b)

³ We initially used the NuSMV bounded model checker with depth 5; that is, we only looked for test cases with a length of 5 steps or shorter. If the bounded model checker did not find such a test case, we provided the trap property to the symbolic model checker in NuSMV and found that in all cases it verified the property as being true; that is, there was no test case for this particular UFC obligation.
Therefore, when we perform UFC over this property structure, we do not perform UFC over the—potentially very complex—condition making up the definition of the macro, since this condition has been abstracted away in the property definition.

The macro Independent_Mode_Condition is defined in RSML-e as:

    MACRO Independent_Mode_Condition():
      TABLE
        Is_LAPPR_Active         : T *;
        Is_VAPPR_Active         : T *;
        Is_Offside_LAPPR_Active : T *;
        Is_Offside_VAPPR_Active : T *;
        Is_VGA_Active           : * T;
        Is_Offside_VGA_Active   : * T;
      END TABLE
    END MACRO
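Read as a standard RSML-e AND/OR table (each column is a conjunction of the rows marked T, a * entry is "don't care," and the columns are disjoined), the macro is equivalent to the following condition; this rendering is our interpretation of the table notation:

    (Is_LAPPR_Active ∧ Is_VAPPR_Active ∧
     Is_Offside_LAPPR_Active ∧ Is_Offside_VAPPR_Active) ∨
    (Is_VGA_Active ∧ Is_Offside_VGA_Active)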
tection on a formula over a model is to see whether or not
Since the structure of Independent_Mode_Condition is not captured in the required property, the test cases generated to cover the property will not be required to exercise the structure of the macro, and we will most likely only cover one of the MC/DC cases needed to adequately cover the macro.

Note that this problem is not related to the method we have presented in this paper; rather, the problem lies with the original definition of the properties. Properties should not be stated using internal variables, functions, or macros of the model under investigation; to avoid a level of circular reasoning (using concepts defined in the model to state properties of the model), the properties should be defined completely in terms of the input variables to the model. If a property must be stated using an internal variable (or function), then additional requirements (properties) are required to define the behavior of the internal variable in terms of inputs to the system. In this example, a collection of additional requirements defining the proper values of all macro definitions should be captured. These additional requirements would necessitate the generation of more test cases to achieve requirements UFC coverage, and we would presumably get significantly better coverage of the model.

To get a better idea of how many additional test cases UFC coverage would necessitate when we add requirements defining macros, we considered the sample requirement property mentioned earlier:

    G(m_Independent_Mode_Condition.result →
      X(Is_This_Side_Active = 1))

We constructed a property defining the Independent_Mode_Condition macro. When we performed UFC over this additional macro-defining requirement, we got 13 additional test cases. This result shows that adding requirements that define all the macros would make a substantial difference to the test suite size and, presumably, the model coverage.

5. RELATED WORK

The work in this paper is closely related to work assessing the completeness and correctness of formulae in temporal logics. The most similar work involves vacuity checking of temporal logic formulas [3, 16, 21]. Intuitively, a model M vacuously satisfies property f if a subformula φ of f is not necessary to prove whether or not f is true. Formally, a formula is vacuous if we can replace φ by any arbitrary formula ψ in f without affecting the validity of f:

    M ⊨ f  ≡  M ⊨ f[φ ← ψ]

Beer et al. [3] show that it is possible to detect whether a formula contains vacuity by checking whether each of its atomic subformulas can be replaced by 'true' or 'false' without affecting the validity of the original formula (the choice depends on the structure of the formula and whether it is satisfied or not). To place it in our terms, this check determines whether each atomic condition independently affects the formula in question. This approach can be used to generate witness counterexamples for each atomic condition, similar to the trap properties for UFC that are described in Section 4. Nevertheless, the goal of this work is quite different from ours. The purpose of performing vacuity detection on a formula over a model is to see whether or not a valid formula can be replaced by a stronger valid formula. This stronger formula may indicate problems within either the formula or the model. Our work is concerned with adequately testing requirements, potentially in the absence of a model. The complete vacuity check defined in [3] is one possible metric for assessing the adequacy of a test set. It is a simpler and more rigorous metric than our UFC metric when used for test generation. In future work, we plan to reformulate the metric in [3] to support "partial" weakening (as described in Section 3.1.3) and investigate its effectiveness.

Tan et al. [28] use the vacuity check presented in [3] to define a property coverage metric over LTL formulas. They present a model-checking assisted approach that generates a test suite, finite in size and in length, based on the property coverage criteria. Their main motivation is to enable the testing of linear temporal properties on the implementation by generating a black-box or white-box test suite that satisfies the defined property coverage metric. The goal of our work is to define a structural coverage metric based on requirements that allows us to measure the adequacy of black-box test suites. Also, Tan et al. [28] do not have any notion of test weakening, and their notion of acceptable black-box tests, while rigorous, is not very practical: to test a finite prefix of a lasso-shaped test, they repeat the loop part of the lasso n times, where n is the number of system states. Many of the systems used in practice have very large n, making their methodology of generating acceptable black-box tests highly impractical. Also, they do not address the efficacy of their proposed metric.

In our case study we examined how well coverage of requirements mapped to coverage of an implementation model. This is similar to recent research assessing the completeness of LTL properties over models. In [8], Chockler et al. propose coverage metrics for formal verification based on metrics used in hardware simulation. Mutations, capturing the different metrics of coverage, are applied to a given design, and the resultant mutant designs are examined with respect to a given specification. Two coverage checks were performed on the mutant design: falsity coverage (does the mutant still satisfy the specification?) and vacuity coverage (if the mutant design still satisfies the specification, does it satisfy it vacuously?). Symbolic algorithms to compute the different types of coverage are proposed. In [7, 6], Chockler et al. propose additional metrics to determine whether all
parts of a model are covered by requirements. Our goal is This report only provides the start of the rigorous explo-
to create a coverage metric that when provided a robust set ration of metrics for requirements-based testing; we have
of formal requirements and a test suite satisfying the met- merely defined the notion and explored the feasibility of
ric will yield a high level of coverage of an implementation the approach. There are several topics that require further
using standard coverage metrics. Chockler’s work provides study:
a more direct (and potentially accurate) assessment of the Requirements formalization: Since formalizing high-
adequacy of the requirements. The process of building the level requirements is a rather new concept not generally
set of mutant specifications and re-checking the properties is practiced, there is little experience with how to best cap-
very expensive, however, and may not be feasible in practice. ture the informal requirements as formal properties. Find-
ing a formalism and notation that is acceptable to practicing
6. DISCUSSION developers, requirements engineers, and domain experts is
In this paper, we explore the idea of requirements coverage necessary. In our work we have used CTL and LTL, but we
and define three potential metrics that could be used to are convinced that there are notations better suited to the
assess requirements coverage. To our knowledge, the notion task at hand.
of requirements coverage as a test adequacy criterion has not Requirements coverage criteria: To our knowledge,
been previously addressed in any systematic way. There are there has been little other work on defining coverage cri-
several potential benefits of defining requirements adequacy teria for high-level software requirements. Therefore, we do
criteria, including: not know what coverage criteria will be useful in practice.
• a direct measure of how well a black-box test suite We must find coverage criteria that (1) help us assess test
addresses a set of requirements, suites as to their effectiveness in finding problems in imple-
mentations derived from the requirements and (2) do not
• an implementation independent assessment of the ad- require test suites of unreasonable size.
equacy of a suite of black-box tests,
Requirements versus model coverage: We must ex-
• a means for measuring the adequacy of requirements plore the relationship between requirements-based struc-
on a given implementation, tural coverage and model or code-based structural cover-
age. Given a “good” set of requirements properties and a
• a formal framework that, given a model, allows for
test suite that provides a high level of structural coverage
autogeneration of tests that are immediately traceable
of the requirements, is it possible to achieve a high level of
to requirements
structural coverage of the formal model and of the generated
Our hypothesis is that given a complete set of requirements and a rigorous testing metric, we should achieve a high level of coverage of an implementation of the requirements. If this hypothesis is true, then a requirements-based test suite can be used to help determine the completeness of a set of requirements with respect to an implementation, and a test suite that yields high requirements coverage but low model coverage illustrates one of three problems (see the sketch after the list):
1. The model is incorrect. The model allows behaviors not specified by the requirements. Hence a test suite that provides a high level of requirements coverage will not cover these incorrect behaviors, thus resulting in poor model coverage.
2. There are missing requirements. Here, the model under investigation may be correct and more restrictive than the behavior defined in the original requirements; the original requirements are simply incomplete and allow behaviors that should not be there. Hence we need additional requirements to restrict these behaviors. These additional requirements necessitate the creation of more test cases to achieve requirements coverage and will, presumably, lead to better coverage of the model.
3. The criteria chosen for requirements coverage are too weak.
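To make this diagnosis concrete, the following sketch, a deliberate simplification of the requirements coverage metrics defined earlier (written in Python, with hypothetical signal names ap and fd), checks whether each finite test trace in a suite satisfies a requirement of the form G(a → F b) and whether it exercises the requirement non-vacuously; a test that passes only vacuously provides little evidence about the implementation. We require pending eventualities to be discharged before the trace ends; see [10] for a careful treatment of temporal logic on truncated paths.

def holds_F(trace, p):
    # F p over a finite trace: p holds in some remaining state.
    return any(p(s) for s in trace)

def satisfies_G_implies_F(trace, ant, con):
    # G(ant -> F con): every state where ant holds is followed,
    # at or after that state, by a state where con holds.
    return all(holds_F(trace[i:], con)
               for i, s in enumerate(trace) if ant(s))

def non_vacuous(trace, ant):
    # The test exercises the requirement only if the antecedent
    # actually occurs somewhere along the trace.
    return holds_F(trace, ant)

# Requirement (hypothetical): "whenever the autopilot is engaged,
# the flight director shall eventually be displayed": G(ap -> F fd).
ant = lambda s: s["ap"]
con = lambda s: s["fd"]

suite = [
    [{"ap": False, "fd": False}, {"ap": False, "fd": False}],  # passes vacuously
    [{"ap": True,  "fd": False}, {"ap": False, "fd": True}],   # genuine witness
]

for i, trace in enumerate(suite):
    print(f"test {i}: passes={satisfies_G_implies_F(trace, ant, con)}, "
          f"exercises={non_vacuous(trace, ant)}")

A full metric such as UFC would, in the same spirit, impose one such obligation per syntactic element of each property rather than a single non-vacuity check per requirement.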
Given a robust and complete set of requirements, we still do not necessarily anticipate 100% MC/DC coverage of a model given 100% UFC coverage of the requirements. There are usually several designs that may satisfy a "good" set of requirements, and these designs will introduce details that may not be covered by requirements-based tests.
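For example (an illustrative scenario rather than one of our case studies), a design might satisfy the requirement G(ap_engaged → fd_on) by using an internal arbitration variable to select which of several redundant display channels drives fd_on; the MC/DC obligations on that internal decision arise purely from the design and cannot be traced to, or anticipated by, any requirement.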
This paper only provides the start of the rigorous exploration of metrics for requirements-based testing; we have merely defined the notion and explored the feasibility of the approach. There are several topics that require further study:

Requirements formalization: Since formalizing high-level requirements is a rather new concept not generally practiced, there is little experience with how to best capture the informal requirements as formal properties. Finding a formalism and notation that is acceptable to practicing developers, requirements engineers, and domain experts is necessary. In our work we have used CTL and LTL, but we are convinced that there are notations better suited to the task at hand.
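To illustrate the formalization step (with hypothetical signal names), an informal requirement such as "the flight director shall be displayed whenever the autopilot is engaged" might be captured as the LTL property G(ap_engaged → fd_on). Much of the practical difficulty lies in resolving questions the informal text leaves open, for instance whether the obligation must hold invariantly or only eventually, and in which system modes it applies.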
Requirements coverage criteria: To our knowledge, there has been little other work on defining coverage criteria for high-level software requirements. Therefore, we do not know what coverage criteria will be useful in practice. We must find coverage criteria that (1) help us assess test suites as to their effectiveness in finding problems in implementations derived from the requirements and (2) do not require test suites of unreasonable size.

Requirements versus model coverage: We must explore the relationship between requirements-based structural coverage and model- or code-based structural coverage. Given a "good" set of requirements properties and a test suite that provides a high level of structural coverage of the requirements, is it possible to achieve a high level of structural coverage of the formal model and of the generated code? That is, does structural coverage at the requirements level translate into structural coverage at the code level?

Test case generation method: We plan to evaluate the effect that different test case generation methods have on the fault-finding capability and model coverage achieved by test suites that provide a high level of structural coverage of the requirements. We plan to investigate a variety of model checkers and their strategies in this regard.
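One common approach, used for instance in [11, 23, 24], phrases each coverage obligation as a "trap property" asserting that the obligation can never be met: for an obligation o, the property G ¬o. A counterexample to the trap property is a trace reaching o, and that trace can be used directly as a test case traceable to the obligation (and hence to the requirement) that produced it. Which model checkers and search strategies yield the cheapest and most effective traces in this role remains to be evaluated.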

In sum, we believe that the notion of coverage metrics for requirements-based testing holds promise, and we look forward to exploring each of these topics in future investigations.
7. REFERENCES
[1] P. E. Ammann and P. E. Black. A specification-based coverage metric to evaluate test sets. In Proceedings of the Fourth IEEE International Symposium on High-Assurance Systems Engineering. IEEE Computer Society, November 1999.
[2] R. Armoni, D. Bustan, O. Kupferman, and M. Y. Vardi. Aborts vs. resets in linear temporal logic. In TACAS, pages 65–80, November 2003.
[3] I. Beer, S. Ben-David, C. Eisner, and Y. Rodeh. Efficient detection of vacuity in ACTL formulas. Formal Methods in System Design, pages 141–162, 2001.
[4] B. Beizer. Software Testing Techniques, 2nd Edition. Van Nostrand Reinhold, New York, 1990.
[5] J. J. Chilenski and S. P. Miller. Applicability of modified condition/decision coverage to software testing. Software Engineering Journal, pages 193–200, September 1994.
[6] H. Chockler, O. Kupferman, R. P. Kurshan, and M. Y. Vardi. A practical approach to coverage in model checking. In Proceedings of the International Conference on Computer Aided Verification (CAV'01), Lecture Notes in Computer Science 2102, pages 66–78. Springer-Verlag, July 2001.
[7] H. Chockler, O. Kupferman, and M. Y. Vardi. Coverage metrics for temporal logic model checking. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems, Lecture Notes in Computer Science 2031, pages 528–542. Springer-Verlag, April 2001.
[8] H. Chockler, O. Kupferman, and M. Y. Vardi. Coverage metrics for formal verification. In 12th Advanced Research Working Conference on Correct Hardware Design and Verification Methods, volume 2860 of Lecture Notes in Computer Science, pages 111–125. Springer-Verlag, October 2003.
[9] E. M. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, 1999.
[10] C. Eisner, D. Fisman, J. Havlicek, Y. Lustig, A. McIsaac, and D. Van Campenhout. Reasoning with temporal logic on truncated paths. In Proceedings of Computer Aided Verification (CAV), pages 27–39, 2003.
[11] A. Gargantini and C. Heitmeyer. Using model checking to generate tests from requirements specifications. Software Engineering Notes, 24(6):146–162, November 1999.
[12] N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous dataflow programming language Lustre. Proceedings of the IEEE, 79(9):1305–1320, September 1991.
[13] K. Hayhurst, D. Veerhusen, and L. Rierson. A practical tutorial on modified condition/decision coverage. Technical Report TM-2001-210876, NASA, 2001.
[14] M. P. Heimdahl and G. Devaraj. Test-suite reduction for model based tests: Effects on test quality and implications for testing. In Proceedings of the 19th IEEE International Conference on Automated Software Engineering (ASE), Linz, Austria, September 2004.
[15] M. P. Heimdahl, S. Rayadurgam, and W. Visser. Specification centered testing. In Second International Workshop on Analysis, Testing and Verification, May 2001.
[16] O. Kupferman and M. Y. Vardi. Vacuity detection in temporal model checking. Journal on Software Tools for Technology Transfer, 4(2), February 2003.
[17] Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems: Safety. Springer-Verlag, New York, 1995.
[18] S. Miller, A. Tribble, T. Carlson, and E. J. Danielson. Flight guidance system requirements specification. Technical Report CR-2003-212426, NASA, June 2003.
[19] S. P. Miller, M. P. Heimdahl, and A. Tribble. Proving the shalls. In Proceedings of FM 2003: the 12th International FME Symposium, September 2003.
[20] A. J. Offutt, Y. Xiong, and S. Liu. Criteria for generating specification-based tests. In Proceedings of the Fifth IEEE International Conference on Engineering of Complex Computer Systems (ICECCS '99), October 1999.
[21] M. Purandare and F. Somenzi. Vacuum cleaning CTL formulae. In Proceedings of the 14th International Conference on Computer Aided Verification (CAV 2002), pages 485–499. Springer-Verlag, 2002.
[22] S. Rayadurgam. Automatic Test-case Generation from Formal Models of Software. PhD thesis, University of Minnesota, November 2003.
[23] S. Rayadurgam and M. P. Heimdahl. Coverage based test-case generation using model checkers. In Proceedings of the 8th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS 2001), pages 83–91. IEEE Computer Society, April 2001.
[24] S. Rayadurgam and M. P. Heimdahl. Generating MC/DC adequate test sequences through model checking. In Proceedings of the 28th Annual IEEE/NASA Software Engineering Workshop – SEW-03, Greenbelt, Maryland, December 2003.
[25] Reactive Systems Inc. Reactis product description. http://www.reactive-systems.com/index.msp.
[26] D. J. Richardson, S. L. Aha, and T. O'Malley. Specification-based test oracles for reactive systems. In Proceedings of the 14th International Conference on Software Engineering, pages 105–118. Springer, May 1992.
[27] RTCA. DO-178B: Software Considerations In Airborne Systems and Equipment Certification. RTCA, 1992.
[28] L. Tan, O. Sokolsky, and I. Lee. Specification-based testing with linear temporal logic. In IEEE Int. Conf. on Information Reuse and Integration (IEEE IRI-2004), November 2004.
[29] Esterel Technologies. SCADE Suite product description. http://www.esterel-technologies.com/v2/scadeSuiteForSafetyCriticalSoftwareDevelopment/index.html, 2004.
[30] M. W. Whalen. A formal semantics for RSML−e. Master's thesis, University of Minnesota, May 2000.