

Metrics and Laws of Software Evolution - The Nineties View

M M Lehman, J F Ramil, P D Wernick
Department of Computing
Imperial College of Science, Technology and Medicine
London SW7 2BZ
tel: +44 (0)171 594 8214
fax: +44 (0)171 594 8215
e-mail: {mml,jcf1,pdw1}@doc.ic.ac.uk
URL: http://www-dse.doc.ic.ac.uk/~mml/feast1/

D E Perry
Bell Laboratories, Murray Hill, NJ 07974
+1 908 582 2529
dep@research.bell-labs.com

W M Turski
Institute of Informatics
Warsaw University
Warsaw 02-097
+48 22 658 3522
wmt@mimuw.edu.pl

Abstract

The process of E-type software development and evolution has proven most difficult to improve, possibly due to the fact that the process is a multi-input, multi-output system involving feedback at many levels. This observation, first recorded in the early 70s during an extended study of OS/360 evolution, was recently captured in a FEAST hypothesis; a hypothesis being studied in an on-going two-year project, FEAST/1. Preliminary conclusions based on a study of a financial transaction system, FW, are outlined and compared with those reached during the earlier OS/360 study. The new analysis supports, or better, does not contradict, the laws of software evolution, suggesting that the 1970s approach to metric analysis of software evolution is still relevant today. It is hoped that FEAST/1 will provide a foundation for mastering the feedback aspects of the software evolution process, opening up new paths for process modelling and improvement.

Keywords: Software:- process, evolution, process metrics, dynamics and improvement; Lehman's laws

1 Introduction

A 1968 study of the IBM software programming process¹ [leh69,85] led, inter alia, to metric based studies of OS/360² [bel72,leh74,85] and other systems [leh80b,kit82]. Analysis of data relating ultimately to some 26 OS/360 releases and sub-releases, identified and ordered by their release sequence number rsn [cox66], yielded insights into various aspects of its evolutionary trends. An example of the study output is provided by figure 1.

¹ Unless otherwise stated, all references in this paper to process refer to both ab initio development and to subsequent system maintenance, that is, to the process of all aspects of system evolution [leh85].
² References in this paper to the IBM OS/360 system refer to both that system and its successor OS/370.

Fig. 1 OS/360 growth trend by rsn (size in modules, 0 to 8000, plotted against RSN, 0 to 25)

Up to release rsn21, the OS/360 growth trend illustrated by figure 1 may be interpreted as a small ripple superimposed on otherwise smooth growth. This pattern is reminiscent of traces generated by self-regulating and self-stabilising systems with both positive and negative feedback [bel72, leh78]. The behaviour thereafter may be interpreted as a sign of instability induced by excessive positive feedback: rapid functional evolution which led to a fission process, the transition from OS/360 to VS1 and VS2. Alternatively it may be interpreted as chaos-like behaviour. Either interpretation suggests that further prediction based on earlier behaviour is uncertain.

It was these observations that first suggested that software evolution processes are, and must be treated as, feedback driven and constrained systems [leh74,78]. Subsequently, preliminary examination of data on several
non-IBM systems repeated many of the earlier observations [leh77,bel78].

The feedback theme was also applied by Abdel-Hamid and Madnick [abd91] in their work on the use of system dynamics in modelling software project management issues. This approach and related modelling techniques originated in the seminal work of Forrester and his colleagues at M.I.T. [for61,70]. More recently it has been applied to the study of aspects of software process improvement, e.g. [abd91,mcg93,wae94,mad96].

The 1970s study, exemplified by figure 1, revealed a regularity that was unexpected for variables that are the indirect consequences of human decisions but are not directly or intentionally controlled. It could be, and therefore was, captured in a series of statements abstracting the observed behaviour. Such abstraction involves a domain larger than the realm of software technology as normally understood. In fact, it includes software technology as a sub domain. From the point of view of software engineering, such statements must, therefore, be accepted as an external regulating and constraining force. To overcome them requires expertise in organisational dynamics, management, sociology etc., not just software technology. Thus they were subsequently referred to as laws of software evolution [leh74,78,80,85]. The laws, as proposed then [leh74] and subsequently amended [leh78,80], are summarised in table 1 with column 1 indicating the year when each was first published.

The laws could, however, only be applied with any degree of confidence to the domains to which the data related. Generalisation depended on obtaining confirming evidence from other systems and organisations. If achieved, such generalisation would provide a theoretical and practical base and framework for the evolution of E-type programs, that is, software solving a problem or addressing an application in the real world [leh80b].

2 Process improvement

In recent years many businesses, and the software industry in particular, have developed a strong interest in and commitment to disciplined process improvement. More and more business processes are, however, dependent on software generated information. They are driven and controlled by computers and software, probably including E-type legacy systems that may have been operational for many years. Now specification and design of such systems requires assumptions about the intended application and its operational domain. These, in turn, will be reflected in the software. Subsequently, installation and operation of the system together with exogenous change will invalidate some of the embedded assumptions [leh89]. The system must, therefore, be continually updated to maintain their validity and adapt to changed circumstances. Business and software process improvement are strongly linked and interdependent [leh97].

Table 1 Laws of software evolution

No.   Year  Brief Name: Law
I     1974  Continuing Change: E-type systems must be continually adapted else they become progressively less satisfactory.
II    1974  Increasing Complexity: As an E-type system evolves its complexity increases unless work is done to maintain or reduce it.
III   1974  Self Regulation: E-type system evolution process is self regulating with distribution of product and process measures close to normal.
IV    1980  Conservation of Organisational Stability (invariant work rate): The average effective global activity rate in an evolving E-type system is invariant over product lifetime.
V     1980  Conservation of Familiarity: As an E-type system evolves all associated with it, developers, sales personnel, users, for example, must maintain mastery of its content and behaviour [leh80a] to achieve satisfactory evolution. Excessive growth diminishes that mastery. Hence the average incremental growth remains invariant as the system evolves.
VI    1980  Continuing Growth: The functional content of E-type systems must be continually increased to maintain user satisfaction over their lifetime.
VII   1996  Declining Quality: The quality of E-type systems will appear to be declining unless they are rigorously maintained and adapted to operational environment changes.
VIII  1996  Feedback System (first stated 1974, formalised as law 1996): E-type evolution processes constitute multi-level, multi-loop, multi-agent feedback systems and must be treated as such to achieve significant improvement over any reasonable base.


The present paper focuses on software process improvement. The fact is that the software industry has been seeking improvement of the software development and maintenance process for many years [wil51,leh96]. Academic and industrial effort has yielded incremental improvement, through the introduction of new languages, formalisation, improved methods, mechanised support (CASE), new programming paradigms and so on. Nevertheless, the industrial track record raises the question why, despite so many advances, the global software development process from conception to use is still so often marred; why satisfactory functionality, performance and quality is only achieved over a lengthy evolutionary process; why software maintenance never ceases until a system is scrapped; why software is still generally regarded as the weakest link in the development of computer-based systems [leh94,96].

Explanations for individual failures can always be found. This paper summarises a more general approach arising from recognising that development and evolution processes for E-type systems are intrinsically feedback systems [leh94]. The remainder of the paper reports preliminary results from an investigation of this hypothesis.

3 FEAST (Feedback, Evolution And Software Technology) and FEAST/1

Some years ago, the realisation that feedback³ in the software process could explain the difficulties encountered in achieving its global improvement led to the formulation of a FEAST hypothesis [fea94,leh94]:

    As complex feedback systems, E-type software processes evolve strong system dynamics and with it the global stability characteristics of other such systems. Consequent stabilisation effects are likely to constrain efforts at process improvement.

³ The term feedback may be interpreted in several different ways. This theme has been discussed in [leh96b].

More recently the hypothesis was restated in the following terms [leh96c]:

    As for other complex feedback systems, the dynamics of the real world software development and evolution processes will possess a degree of autonomy and global stability.

Both versions of the hypothesis include a number of assertions [leh96a] but these are not further discussed here.

The hypothesis and its implications were examined over a period of time by a FEAST core group, a subgroup of the present authors. Its deliberations [leh96b] led to three workshops held at Imperial College during 1994/5 [fea94,95]. The objectives were to expose the ideas to a wider group of people interested in the software process, to seek the objective criticism of experts and, in general, to explore the hypothesis and its implications.

The EPSRC⁴ proposal that resulted from these discussions was entitled FEAST/1 [leh96c], the "/1" in the title indicating that this two year, 3 person project was to be seen as a first step in a longer and more widespread investigation. The proposal was approved in March 1996 and the resultant project commenced formal investigations in October 1996 in collaboration with ICL plc, Logica plc, Matra-BAe Dynamics plc and two groups within the UK Ministry of Defence. The stated objectives [leh96c] were to:
• provide objective evidence that feedback phenomena and the consequent system dynamics have substantial impact in the software process
• demonstrate that the phenomena can be exploited in both managing and improving industrial processes
• produce justification for a wider and more substantial study based on the feedback perspective.

⁴ (UK) Engineering and Physical Sciences Research Council.

The two year investigation will seek to identify and characterise the feedback mechanisms active in the process, their impact on process characteristics and methods for applying the understanding gained to improve the respective processes. It expects to demonstrate (or otherwise) the feedback behaviour of some widely different systems. Three approaches are being employed:
• a black box approach is studying quantitative data from a number of industrial software processes to identify patterns in the evolution of the respective systems. The data will be analysed in a search for footprints of dynamical behaviour and feedback control
• a white box approach aims at the construction and enactment of system dynamics models of individual processes. These will reflect feedback mechanisms, their properties and their impact on the global process
• a third approach, not included in the original proposal and not formally part of FEAST/1, is exploring the use of multi-agent systems [mca95] to model the selected processes and to evaluate proposed improvements.

Full investigation of the hypothesis and, if upheld, of means for its exploitation, is not straightforward. As previously discussed [leh96b], difficulties arise from several factors. The processes being investigated are likely to include tens, if not hundreds, of forward paths and feedback loops. A simulation approach such as the system dynamics technique referred to in section 1 is, therefore, a more appropriate tool for the investigation than the analytical tools of control theory. Moreover, the processing and control mechanisms associated with these loops involve people, individually and in groups as managers or implementors. All observe, interpret, communicate, decide and act or refrain from acting on the basis of their overall perception, their instructions and, consciously or otherwise, their inclinations, experience and biases. Much of the feedback control is unplanned or even unconscious. Some, at least, of the feedback mechanisms are, therefore, stochastic and non-deterministic. Furthermore the system being modelled includes elements that contain implicit models of themselves. This is, of course, mathematically intractable, posing a fundamental, ultimate obstacle to the investigation [göd31, leh85]. Note also that convincing support for the hypothesis requires that the analysis and its associated predictive models must necessarily be quantitative. The number of data points available for each of the classes of data is, however, likely to be relatively small. Statistical analysis and the determination of significance is, therefore, not straightforward. Work in the application of control theory to economic modelling [bec94] and, more recently, in the application of system dynamics to aspects of the software process [abd91,wae94,mad96] suggests, however, that progress is possible. Note also that software engineers and others in the organisations in which and with which they work do not, in general, have the understanding, knowledge, skills or experience required for this analysis as part of their background. The long term study requires, therefore, a collaborative, extended, multidisciplinary approach to achieve exploitable results.

The remainder of this paper focuses on the initial results of the first FEAST/1 black box study. Much still remains to be done and the most significant contribution of the present project may well be to arouse wider international interest and so trigger the necessary collaborative investigations.

4 A first case study: the Logica FW system

4.1 FW data

After extensive discussion with the collaborators and others, selection criteria and candidate systems have been identified. The preference was for medium to large systems, however defined. It was also considered desirable to concentrate on systems that were being used in a number of locations so that the effect of users' feedback which, it is believed, is likely to have significant impact, could be identified and assessed. Other criteria included the availability of historical data on system evolution to permit initial black box analysis to detect the presence or absence of feedback-like behaviour. Prior experience suggests that data on some ten releases is necessary for the identification of behaviour patterns. For the white box studies seeking to model internal process structure and to identify active feedback controls and their impact, ongoing projects were considered essential. Projects having, in addition, a sufficiently long history to provide meaningful black box data would be particularly useful since this would provide opportunities for linking characteristics inferred from the black and white box studies. At the time of writing collaborator products and processes satisfying these criteria have been identified, information on process structure and content is being gathered and metric data should be available shortly.

The first system evolution data to be made available to the FEAST/1 project was on the Logica plc Fastwire (FW) financial transaction system. This 8-year-old system is now installed on some one hundred sites. The data set received covers the most recent 5 years of its evolution. Since then there have been several main releases and many more sub-releases. The data set as received from Logica related to some 100 releases (as defined by them) with each entry including three data items: release ID, size in modules and number of modules changed. Release dates were also available for most of the data points. Many of these releases were, however, of the same size as their predecessor. Clarification revealed that these were fix releases that were very frequently only transmitted to the (limited) number of customers adversely affected by a fault in an earlier release. A subset of the data was therefore selected. As more familiarity with the system history has been attained the criteria, and therefore the subset selected, have had to be changed to yield the set shown in table 2. The analysis and plots presented below may, therefore, differ somewhat from those included in earlier publications [tur96], [leh96a,97]. The trends and patterns they display have, however, not changed significantly. Details of the selection and refinement criteria applied to the data have been documented in an internal report.

RSN  Size in  Release    RSN  Size in  Release
     Modules  ID              Modules  ID
1    977      1.0        12   2087     5.0A
2    1344     2.0A       13   2091     5.0B
3    1390     2.0B       14   2095     5.0C
4    1492     2.0C       15   2101     5.0D
5    1581     2.0D       16   2151     5.0E
6    1595     2.0E       17   2167     5.0F
7    1800     3.0A       18   2312     6.0A
8    1832     3.0B       19   2315     6.0B
9    1897     4.0A       20   2696     7.0A
10   1897     4.0B       21   2699     7.0B
11   1902     4.0C

Table 2 The FW data set

To protect their identity, the release IDs assigned by Logica have been replaced in table 2 by a sequence of substitute identifiers. In addition, and as was done in the OS/360 study [leh80b], consecutive integers have been assigned to the releases comprising the evolution sequence to be analysed. These provide a pseudo-time measure designated the rsn, a sequence number in the sense of Cox and Lewis [cox66]. Basing the analysis on this measure is appropriate because only at the instant of release of an E-type software system are its properties, as determined by the then established software
text, uniquely defined. By definition, an E-type system operates in a domain always liable to change at a rate that is accelerated by development, installation and operation of the system. Thus the software too, that is the code and/or its documentation, must be repeatedly updated and adapted⁵ to remain a faithful model of the application in its operational domain. At the time of release the text is, by edict, fully defined. At all other times it is likely to be in a state of flux [leh85].

⁵ As per the first law of software evolution as reproduced in table 1.

The releases included in the analysis whose results are presented below may be categorised into three classes:
• Major mainstream releases. These are intended to be adopted by the majority of user organisations. They are often required to achieve standardisation or for legal reasons.
• Minor mainstream releases. These provide minor improvements or enhancements. Such releases are included in the analysis presented below only where, in addition to other criteria, at least one module has been added or deleted with respect to its evolutionary predecessor.
• Error correction releases. These neither add nor enhance functionality. These also have been included in the analysis if they involve system growth by, at least, one module.

One consequence of ordering releases by rsn in the presence of the various types of releases is that a situation may arise in which the ordering adopted differs from the date ordering. For example, work on a new mainstream version 3.0A may have been proceeding concurrently with minor enhancement or bug-fixing of an older release. Release 3.0A may be shipped to mainstream clients before 2.0E is ready for delivery. As illustrated in figure 2, release 2.0D precedes release 3.0A in real time and might, therefore, appear to be its unique parent. Release 3.0A will, however, have also inherited functionality and code first developed for and integrated into 2.0E. Therefore, in the evolutionary sense release 2.0E is (at least) also a predecessor of 3.0A and is given the lower rsn.

Fig. 2 Example of release ordering by date (not to scale; axes: release size against release date; points for releases 2.0D, 2.0E and 3.0A)

An additional release type has been identified in the FW process. This type, termed ad-hoc, is initially aimed at the satisfaction of the needs of a specific client. Such releases are excluded from the results presented below. However, unless providing some temporary facility later to be removed, the enhancements included in such releases are sooner or later integrated into the mainstream to maintain smooth, uniform evolution over all installed systems and simplify overall FW configuration management. Moreover, these releases absorb project resources and, therefore, impact other concurrent activities. Thus they need, ultimately, to be included in the analysis.

4.2 System growth

With the cost of storage declining at all levels, system size is, in itself, not of major concern. It may, therefore, be seen as an independent and composite monitor of system evolution which, within limits, is neither planned nor managed. It is determined by other factors. Some of these will be managed, others "just happen". Size determinants include system design, programmer style and experience, development timetables and constraints, intensity of the desire to achieve compactness or clarity⁶. The great majority of reported software metrics work has tended to use locs (lines of code) as the measure of system size. As in the case of the original OS/360 study [leh80b,85] the FW analysis reported here has used module count for that purpose. In the absence of a better measure, module count also serves as an initial estimator of system functionality and power. The 1970s study did, in fact, compare the results of loc and module based studies. It was shown that these were essentially similar but with the locs measure providing a less consistent picture of the evolutionary behaviour of the system than did the module count. Locs are, therefore, considered inferior as a measure. The superiority of module count was explained by the observation that, however modules are defined, they have, within a given domain, some degree of functional integrity whereas locs have none [leh85]. The number of modules in a specific system is also not, in general, dependent on individual programmer practice. Module numbers may, therefore, be expected to provide a more consistent measure of system size and hence a better, though admittedly coarse, indicator of system functionality.

⁶ Or even, where productivity is measured in locs and the concern is with productivity improvement.

The function point (FP) [alb79] measure may be considered an alternative measure for system functionality or power. Its use does, however, raise some questions. For example, how arbitrary is the interpretation of the definitions or the factor ratings achieved and, as a consequence, how consistent are the results obtained by different raters? Moreover, the establishment of the measure requires judgment based on subjective measures and the overall determination is labour intensive and difficult to automate [kem93]. There is also little, if any,
experience in application of the measure to larger systems. Finally, it must be observed that, unlike module count, FP data is not widely available from data archives of software systems across the industry. It is certainly not available from the systems being offered for study by the FEAST/1 collaborators. Module count is, therefore, being used as a size measure and, by implication, as a system power estimator, in the FEAST investigation. To date, and as illustrated by the results presented below, this decision appears to be justified.

The growth in modules of FW over releases rsn1 to rsn21, that is, releases 1.0 to 7.0B, is shown in figure 3. The overall growth pattern should be compared with that of OS/360 over its first 20 or so releases as illustrated in figure 1.

The abscissa of figure 3 represents the individual release sequence numbers as explained above. The figure clearly shows the upward trend of system growth. The trend is, therefore, consistent with the first and sixth laws of software evolution but does not distinguish between them. Moreover, it also shows a ripple effect strongly reminiscent of that of OS/360 as in figure 1. It was, of course, this ripple phenomenon that first suggested that the software process was stabilised by feedback control, as captured in the third and eighth laws. Thus this initial result of the present study is certainly compatible with the conclusions and, in particular, the laws of software evolution, first reached in the study of OS/360 more than twenty years ago.

Fig. 3 FW growth trend by rsn (size in modules, 0 to 3000, plotted against RSN, 0 to 20)

4.3 Incremental growth

Figures 4 and 5 show the incremental growth per release of OS/360 and FW respectively over the releases rsn1 to rsn21 for each system. The horizontal line indicates the average growth per release over this range. For FW the plot includes all the data in table 2. For OS/360 the final five releases for which data is available are omitted from the plot since they reflect the transition (in growth trend terms) from OS/370 to VS1 and VS2.

Fig. 4 Incremental growth per release of OS/360 over releases rsn2 to rsn21 (modules, -50 to 1550, plotted against RSN, 0 to 20, with the average increment marked)

Fig. 5 Incremental growth per release of FW over rsn2 to rsn21 (modules, 0 to 400, plotted against RSN, 0 to 20, with the average increment marked)

The two plots display remarkably similar cyclic characteristics though, as one should expect, they differ in detail. They also resemble statistical process control charts [dav84]. It was, in part, the observation of this cyclic pattern and its symmetry around the average with respect to OS/360 that originally led to formulation of the third and fifth laws. The long term trend of the moving average of the incremental growth of E-type systems as they evolve will, in general, be difficult to determine because of the small number of data points generally available. It might be the case that this average declines because of increasing complexity. Alternatively, it might grow as a consequence of improving process technology. It may also be that these (and other?) contrary pressures compensate for each other over time so that the original assertion of invariance remains valid. It is, in fact, not certain that all systems or domains behave in the same way. The fifth law as stated in table 1 will have to be re-examined. The analysis outlined below will provide additional insight for FW and will identify the growth trend model which, for that system at least, is to be preferred.
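The FW incremental growth series and its average can be recomputed from the sizes listed in table 2. The following is a minimal Python sketch under that assumption; the names `FW_SIZES` and `incremental_growth` are illustrative and do not come from the paper.

```python
# FW sizes in modules for rsn1..rsn21, copied from table 2.
FW_SIZES = [977, 1344, 1390, 1492, 1581, 1595, 1800, 1832, 1897, 1897, 1902,
            2087, 2091, 2095, 2101, 2151, 2167, 2312, 2315, 2696, 2699]


def incremental_growth(sizes):
    """Return the growth in modules of each release over its predecessor.

    The first element corresponds to rsn2, the last to the final release.
    """
    return [current - previous for previous, current in zip(sizes, sizes[1:])]


increments = incremental_growth(FW_SIZES)

# Average growth per release: the quantity a horizontal "average increment"
# line would represent for this data.
average_increment = sum(increments) / len(increments)

for rsn, growth in enumerate(increments, start=2):
    # Mark releases whose growth lies above the average.
    marker = "above" if growth > average_increment else "at or below"
    print(f"rsn{rsn}: +{growth} modules ({marker} average)")
print(f"average increment: {average_increment:.1f} modules per release")
```

Because the increments telescope, their average equals the total growth from rsn1 to rsn21 divided by the number of intervals, so the average alone says nothing about the ripple; the per-release values carry that information.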
As pointed out previously [leh74,78], the cyclic effect reflected by the peaks and troughs in the incremental growth plots may be indicating the presence of feedback driven and controlled growth. Thus, influences tending to increase system functionality, that is growth towards the peaks, may have their source in positive feedback. The declines may reflect size stabilisation and other negative feedback effects. An example of such feedback is the evolutionary pressure that arises when clients and users express a need for enhancements to existing capability or system extension. But as implementation of such changes proceeds, the size and complexity of the system increases leading to declining comprehensibility, increasing error rates, increasing resistance to change or the impact of budgetary constraints. These lead to a decrease of resources available for, for example, growth as the resource demand for fault fixing and complexity reduction increases [leh85]. If sufficiently mature [hum95], the process will be directed in its evolution and growth patterns by data reflecting such needs. That is, the data or its derivatives will be used to adjust process objectives (immediate and/or long term) and process parameters. It will be used to drive, constrain, and in general, manage the process. Positive feedback drives growth while negative influences force a period of consolidation (correction and restructuring). An example of the consequences of excessive positive feedback may be provided by the final 7 releases, rsn20 to rsn26, of OS/360 (figure 1). A hypothesis that explains the system's apparently unstable behaviour over these releases is that it was a consequence of excessive growth, in response to market demand, in going from rsn19 to rsn20.

This brief analysis suggests that the FW data supports, in part at least, the third and fifth laws of software evolution as originally inferred from the OS/360 study. Analysis of the long term growth trend of FW in the next subsection suggests, however, that the wording of laws III and V as in table 1 must be modified.

4.4 The Inverse Square model (IS)

This section presents two models of FW growth. The first of these, illustrated in figure 6, is obtained from the data set of table 2 using a least squares linear (LSL) fit. The models focus on the general trend and largely ignore the ripple. Detailed analysis of the latter is beyond the scope of this paper.

Fig. 6 Least squares linear fit (LSL) to FW over rsn1 to rsn21 (size in modules, 0 to 2800, plotted against RSN, 0 to 20)

After investigating other possibilities Turski developed an alternative, inverse square, model (IS) represented by the nonlinear discrete-time dynamical recursion (1) [tur96]. In this model $s_i$ is the actual size at rsn $i$, $\hat{s}_i$ is its fitted or predicted size, $n$ is the total number of releases in the data set and $E$ is a model parameter.

$\hat{s}_1 = s_1$    (1a)
$\hat{s}_i = \hat{s}_{i-1} + E / \hat{s}_{i-1}^{2}, \quad i = 2, \ldots, n$    (1b)

The parameter $E$ is the average of individual $E_i$, calculated from either (2) or (3).

$E_i = (s_i - s_{i-1}) \, s_{i-1}^{2}, \quad i = 2, \ldots, n$    (2)
$E_i = (s_i - s_1) \Big/ \sum_{k=1}^{i-1} \left( 1 / s_k^{2} \right), \quad i = 2, \ldots, n$    (3)

Algorithm (2) uses only the two most recent data points in computing $E_i$. With (3) all data to rsn $i$ are considered. In either case the average of the resultant set of $E_i$ gives an estimated value for $E$. A third approach (LSIS) computes $E$ from the entirety of data using a least squares criterion and is illustrated in figure 7.

Fig. 7 Least squares inverse square fit (IS) to FW over rsn1 to rsn21 (size in modules, 0 to 2800, plotted against RSN, 0 to 20)

The conceptual implications guiding the selection of one of the three alternative algorithms for computing $E$ are subtle and are not discussed further here. They yield slightly different values for $E$ but, in the context of this study, they do not produce significantly different behavioural patterns. Nor do they change the conclusions to be drawn. Finally, the observant reader will notice apparent outliers rsn20 and rsn21. No comment can be made at this time about the significance of these or their possible implication.
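Recursion (1) and the two averaging estimators (2) and (3) can be tried out on the table 2 sizes. The following Python sketch is an illustration only and is not the authors' implementation: all function names are hypothetical, the LSIS variant is not shown, and the linear model is assumed to be an ordinary least squares regression of size on rsn.

```python
# FW sizes in modules for rsn1..rsn21, copied from table 2.
FW_SIZES = [977, 1344, 1390, 1492, 1581, 1595, 1800, 1832, 1897, 1897, 1902,
            2087, 2091, 2095, 2101, 2151, 2167, 2312, 2315, 2696, 2699]


def estimate_e_pairwise(sizes):
    """Average of E_i from (2): each E_i uses two consecutive releases."""
    e = [(sizes[i] - sizes[i - 1]) * sizes[i - 1] ** 2
         for i in range(1, len(sizes))]
    return sum(e) / len(e)


def estimate_e_cumulative(sizes):
    """Average of E_i from (3): each E_i uses all releases up to rsn i."""
    e, inverse_square_sum = [], 0.0
    for i in range(1, len(sizes)):
        # Running sum of 1/s_k^2 over all releases preceding release i.
        inverse_square_sum += 1.0 / sizes[i - 1] ** 2
        e.append((sizes[i] - sizes[0]) / inverse_square_sum)
    return sum(e) / len(e)


def inverse_square_fit(sizes, e):
    """Fitted sizes from (1a) and (1b), started at the first actual size."""
    fitted = [float(sizes[0])]
    for _ in range(1, len(sizes)):
        fitted.append(fitted[-1] + e / fitted[-1] ** 2)
    return fitted


def linear_fit(sizes):
    """Ordinary least squares line of size against rsn (assumed LSL)."""
    n = len(sizes)
    rsn = list(range(1, n + 1))
    mean_x, mean_y = sum(rsn) / n, sum(sizes) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(rsn, sizes))
             / sum((x - mean_x) ** 2 for x in rsn))
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x for x in rsn]


def residual_sum_of_squares(actual, fitted):
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


e_pairwise = estimate_e_pairwise(FW_SIZES)
e_cumulative = estimate_e_cumulative(FW_SIZES)
for label, fitted in [
    ("IS with E from (2)", inverse_square_fit(FW_SIZES, e_pairwise)),
    ("IS with E from (3)", inverse_square_fit(FW_SIZES, e_cumulative)),
    ("LSL", linear_fit(FW_SIZES)),
]:
    print(label, round(residual_sum_of_squares(FW_SIZES, fitted)))
```

The residual sum of squares is used here only as one convenient closeness-of-fit measure; the paper does not say which statistical measures were applied, so the printed values should not be read as reproducing its comparison.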
For the trend models estimated from the full set of 21 data points,
statistical measures of the closeness of fit of the LSL and IS models do
not differ significantly. Comparative assessment is, therefore, difficult
on the basis of currently available data. This may be due to the fact that
the damping in the IS trend is not strong. Moreover, neither model
addresses the ripple. The deviations from smooth growth that the latter
represents could, of course, simply be noise, the compounded impact of
many, continuing, localised, often short term management and
implementation decisions, in which case it would not affect the
assessment. The FEAST hypothesis suggested that, in part at least, the
ripple is an indicator of the presence of feedback-controlled mechanisms
that regulate the long term growth trend. The ongoing white box modelling
activity in FEAST/1 represents a first step in the attempts to resolve
this issue, to permit refinement of the models and a more precise
assessment of the degree to which they, their derivatives or different
models reflect the reality of the processes studied; and the degree to
which they may be generalised.

Including the ripple will assist in comparative assessment of the model.
It has, however, been pointed out already [tur96] that the phenomenology
of the situation suggests several reasons for preferring the IS model:
• The IS inverse square property can be interpreted as reflecting the
  complexity growth of a software system over a sequence of releases. Such
  growth is due, in part, to increases in the complexity of the
  application, for example, as features not included in the original
  system definition, and often orthogonal to it, are added. Moreover, the
  process of evolution adds change upon change upon change with, in
  general, little attention paid to the resultant complexity growth
  [leh85]. It is this phenomenon that is captured by the second law
  (table 1).
• As a one parameter model IS is also compatible with the fourth law of
  software evolution, with the parameter E reflecting the constant effort
  that the law identifies [leh78].
• IS also satisfies the Principle of Parsimony [cox66].
• No system can grow forever. The linear growth model is thus incompatible
  with reason and common experience.

4.5 Further consideration of the Inverse Square model

The list of reasons for favouring IS over the LSL includes the observation
that the single parameter E of the former may be interpreted as a constant
effort parameter as predicted by the fourth law. Estimation of E from the
available data strengthens that argument. Such estimation produces a value
that, as shown below, remains relatively constant as FW evolves. That is,
the single parameter of the model may be interpreted as the constant
effort or work output identified in the fourth law as being required to
take the system from one release to the next. The principal questions
raised by this interpretation, questions not satisfactorily answered,
relate to the interpretation of E and the units in which it is measured.
Does E relate to the input effort required to achieve release by release
system evolution or to the output achieved from the process measured by
some measure of increase in system quality and power? To answer the first
question requires further investigation and additional data. As to units,
$s_i$ is a dimensionless count. Hence E is dimensionless. But despite
these unsolved questions it is concluded that, on the basis of currently
available data, the above remarks together provide some justification for
preferring the IS model. It appears to reflect reality more closely.

The full implications of one further indicator of the superiority of IS
over LSL must now be considered. When modelling large data sets, the first
part is often used to estimate model parameters and the second to then
evaluate its "predictive" capability [ger93]. With the small size of the
data set available from FW, this might not appear to be a fruitful path to
follow. Turski [tur96] did, however, investigate this question, asking:
"How many points beginning with rsn1 have to be considered in order to get
an appropriately low error of fit, an acceptable predictive capability?"
In terms of the FEAST hypothesis this question is equivalent to asking:
How fast is the FW dynamics established? An answer for FW is suggested by
the plot of figure 8.

Figure 8 plots a set of mean absolute error of fit values
($\mathrm{mae}_j$ $\{j = 2, \dots, 21\}$, where $j$ indicates the number
of points from rsn1 used to compute E, see Appendix). The values of
$\mathrm{mae}_2$ and $\mathrm{mae}_3$ are relatively large. As $j$ is
increased $\mathrm{mae}_j$ converges rapidly and reaches a relatively
steady value by $j = 6$ (parameter E computed from the first six releases
only). Thereafter $\mathrm{mae}_j$ $\{j = 6, \dots, 21\}$ has a mean of
74.6 with a standard deviation of 2.8. The $\mathrm{mae}_6$ value is only
4.7% of the system size at rsn6, 3.2% of its size at rsn19 and 2.8% of its
size at rsn21. This behaviour is counter-intuitive in several ways.
Possible interpretations and implications are summarised below. Overall,
it does, however, appear to indicate the strength of the system dynamics.
This phenomenon supports the observation made by one of the authors many
years ago with regard to OS/360 evolution that "Rather than the managers
managing the (evolving software) system, the system manages the managers."
It must, of course, be understood that the reference here is to long term
evolution, not to the specifics of individual decisions, often localised
in time, system space and implementation space.
• Figure 8 based on the IS model suggests that the FW growth trend is
  established over some six of the releases included in the study. In
  accordance with the FEAST hypothesis, it is assumed that the dynamics
  arises from the characteristics of the software, the organisations
  developing, marketing and using the software, the communications between
  them and the
  controls that are exercised. In any event figure 8 supports the
  hypothesis that the E-type systems evolution process develops strong
  dynamics.
• The mae for IS of 74.6 modules with standard deviation of 2.8 over the
  stable range is very close to the calculated average incremental growth
  of about 86.1 modules over all data points (fig. 5). This raises the
  question whether there is some relationship between the variance of the
  ripple (which is a significant source of error for the trend fit) and
  the mean incremental growth. Establishing a correlation would lead to a
  concept of safe growth rate limits. Establishing either would provide
  strong conceptual support for the incremental or evolutionary release
  strategy [gil88]. The entire question remains to be investigated.

[Plot, Logica FW: mean absolute error over all releases (modules) as
function of number of data points used to estimate E; IS curve]
Fig. 8 Mean absolute error of fit to FW over all releases as function of
number of points used to estimate IS model

[Plot, Logica FW: mean absolute error over all releases (modules) as
function of number of data points used to estimate models; LSL and IS
curves]
Fig. 9 Mean absolute error of fit to FW over all releases as function of
number of points used to estimate LSL model (squares) superimposed on
that of IS model (circles)

• Note that the mae of LSL over the stable range is, at 86 modules, even
  closer to the average incremental growth of 86.1 modules than is that of
  IS. The implications of this, for example for the evaluation of the
  relative value of the two models, require more investigation.
• The IS plot in Figs. 8 and 9 stabilises much more rapidly than does the
  LSL plot. Moreover, if IS and LSL are estimated by using only rsn1 and
  rsn2, the former outperforms the latter by an order of magnitude. Thus
  while there are still unanswered questions, figures 8 and 9 appear to
  support the earlier conclusion that IS is to be preferred over LSL. That
  they provide further support for the FEAST hypothesis and the laws of
  software evolution does not require further emphasis.

The results presented above are based on the examination of the FW system,
investigation of OS/360 not having yet been reopened. Continued
investigation of these and other systems is clearly required.

4.6 Impact of the study on the laws of software evolution

More work is clearly required for firm conclusions to be reached with
regard to the many issues raised above. It is nevertheless considered
appropriate to indicate in table 3 the extent to which the investigators
feel encouraged to see the present results as being compatible with, or
even supporting, the laws of software evolution. The weight of evidence
suggests that, despite the 20 year gap and the significant difference
between IBM and Logica systems and their development and operational
environments, there are strong similarities in the phenomenology of their
evolutionary growth. It is believed that the results of the studies to
date will, with some modification, extend to E-type systems in general.
The FEAST/1 project will, it is hoped, receive sufficient data from the
evolution processes of a variety of systems to establish confidence in a
set of conclusions that are valid in some stated domain or, of course, to
demonstrate that they cannot be generalised.

5 Final remarks

The results achieved so far by applying this method in the FEAST/1 project
are encouraging. Additional data on a wide spectrum of software systems to
be received from various industrial collaborators should, if consistent,
permit generalisation of both the conclusions reached and the measurement
and analysis techniques being employed.

The present paper describes the black box approach that has revealed
aspects of FW evolution and of its evolution dynamics, has provided
material for interpretation and for the formulation of explanatory
hypotheses. A white box modelling approach is simultaneously seeking to
model the structure of other industrial software processes and to simulate
their behaviour including their feedback control
loops. These investigations are being further backed up through the
development of a multi-agent model. It is hoped that this work will
confirm (perhaps modified versions of) the laws of software evolution
[leh96d] that now include the FEAST hypothesis and put them on firmer
foundations. If successful over a range of systems, the investigation will
provide a base for a plausible theory of software process and software
evolution. The alternative, that the results of the investigation
demonstrate that the laws and the hypothesis are not of general relevance
though satisfied for particular instances of E-type systems and their
evolution processes, cannot, at this stage, be dismissed.

The FEAST/1 study has already made visible progress in illustrating how
measurement concepts can be applied to the study of software evolution. It
has successfully extended the 1970s techniques by applying more rigour
[law82] to mastery of the observed phenomena. The specific results derived
are of considerable interest, both in themselves and from a wider
perspective. The long term significance of this paper is, however, more
likely to be in the approach and techniques it presents. Being able to
detect, measure and control feedback phenomena and their impact is
believed to be key to major advances in software process management and
execution.

In view of the fact that this paper will be presented at the Metrics '97
symposium it is appropriate to comment on its focus on the FEAST
hypothesis, the related FEAST/1 project and the absence of references to
other relevant metrics work [fen96,ieee94,kit82,96,vot95]. FEAST/1 is
believed to exemplify an original metrics based approach to the study of
the software process and software evolution. This approach has been
consistently followed from the first primitive study of OS/360 in the late
sixties and seventies [leh69,85] to the current investigation. The study
was triggered by a general observation: the universal and persistent
problems accompanying software development and maintenance, i.e. software
evolution. Following recognition of the problem as appropriate for
research investigation [leh69,85] and receipt of appropriate data, first
from OS/360 and more recently from Logica FW [leh96d], patterns and
regularities in their evolution were revealed and modelled. Interpretation
of the models led, in turn, to the generation

No.   Brief Name                   Support    Indicator

I     Continuing Change            √          Fig. 3 clearly indicates continuing growth. Logica's
                                              confirmation that this is partly due to adaptation and
                                              change supports the law. Quantification will be of
                                              interest.
II    Increasing Complexity        √          The inverse square law of growth (eq. 1) and its
                                              predictive power (fig. 7) supports complexity as a
                                              constraining factor.
III   Self Regulation              ?          The ripple (fig. 3) of the, otherwise, smooth growth
                                              (eq. 1) suggests regulation around a smooth trend.
                                              Identification of the underlying mechanisms is
                                              required to support the law as it stands.
IV    Conservation of              √          The ability to obtain a close fit and very good
      Organisational Stability                predictive power with a single and constant parameter
      (invariant work rate)                   E (eq. 1) provides support. Measures of the work rate
                                              are required.
V     Conservation of Familiarity  ? (note 7) Fig. 5 still suggests that the average incremental
                                              growth has a definite trend. Its invariance as in the
                                              original formulation is now, however, questioned.
                                              Determination of the trend and the consequences of a
                                              release whose incremental growth exceeds the average
                                              significantly must await the further behaviour of the
                                              system in its evolution.
VI    Continuing Growth            √          Fig. 3 clearly indicates continuing growth. Logica's
                                              confirmation that this is partly due to functional
                                              growth supports the law. Quantification will be of
                                              interest.
VII   Declining Quality            ?          No data that provides evidence for or against is
                                              available.
VIII  Feedback System              √          Regulation as in figs. 3, 5, 7, 8 and inverse square
                                              law (eq. 1) are supportive. Feedback control
                                              mechanisms must be identified to obtain further
                                              support.

Table 3 The laws of software evolution in the light of the preliminary FW analysis (note 7)

Note 7: It is hoped to obtain more data that will provide evidence, one way or the other.
of hypotheses (e.g. the laws and FEAST) to interpret them. These
successive steps led to an iterative investigation that is now yielding
further data (historical and/or obtained by experimentation and
measurement) to support, refute or modify and then to extend and
generalise the emerging theoretical base and framework. Such results must,
of course, be continually validated or adjusted by observation of and
experimentation in actual industrial processes. Thus the more general
relevance of the paper to the metrics community is in its approach which
may be compared with those more widely adopted.

Apart from any theoretical advance that this study will provide, it
should, if successful, lead to the development of methods and tools for
process management, release planning and process improvement. This will
shape the direction of software metrics, software process modelling and
process improvement in the years to come. If feedback phenomena in E-type
evolution processes shape and constrain the software process
significantly, mastery and command of those phenomena will open up
important new prospects. Moreover, the software process is a special case
of business processes, in general [leh97]. The approach applied and the
conclusions reached should find much wider application. It is believed
that FEAST/1 is a study which, if successful, will eventually lead to a
theory and to a technology which together can trigger major advances in
the software and other business processes and their improvement.

6 Acknowledgements

We are grateful to Logica plc for providing access to the FW data and in
particular to Joe Halberstadt for his collaboration. Sincere appreciation
is also due to Profs. Berc Rustem and Vic Stenning, co-Principal
Investigators on the FEAST/1 project, for their many contributions to the
investigation and to Dr. Emma McCoy of the ICSTM Mathematics Department
for her help with statistical aspects of this investigation. We also
acknowledge the constructive contributions of participants in the three
open FEAST workshops in 1994/5. Last but not least our thanks to the
anonymous referees for their careful reading and constructive comments.
Since October 1996 the work reported here has been supported under EPSRC
grants numbers GR/K86008 and GR/L07437.

7 References

Papers identified by a * in the reference listing are reprinted in [leh85].

[abd91] Abdel-Hamid T and Madnick SE, Software Project Dynamics - An Integrated Approach, Prentice-Hall, Englewood Cliffs, 1991, 264 pps.
[alb79] Albrecht AJ, Measuring Application Development Productivity, Proc. Guide/Share: IBM Application Development Symposium, Monterey, CA, 1979, pp. 83 - 92
[bec94] Becker RS, Hall B, and Rustem E, Robust Optimal Control of Stochastic Nonlinear Economic Systems, J. of Economic Dynamics and Control, n. 18, 1994, pp. 125 - 148
[bel72] Belady LA and Lehman MM, An Introduction to Growth Dynamics, Proc. Conf. on Statistical Comp. Perf. Evaluation, Brown Univ. 1971, Academic Press, 1972, W Freiberger (ed.), pp. 503 - 511
[bel78]* id., Characteristics of Large Systems, Proc. Conf. Research Directions in Software Technology, (P. Wegner ed.), Sponsored by Tri-Services Committee of the DOD, Brown U. Providence, RI, Oct. 1978, MIT Press, 1979, pp. 106 - 142
[box70] Box GP and Jenkins GM, Time Series Analysis, Forecasting and Control, Holden-Day, San Francisco, 1970, 553 pps.
[cox66] Cox DR and Lewis PAW, The Statistical Analysis of Series of Events, Methuen, London, 1966
[dav84] Davis OL and Goldsmith PL, Statistical Methods in Research and Production, 4th. ed., Longman, London, 1984, 478 pps.
[fea94,5] Preprints of the three FEAST Workshops, Lehman MM (ed.), Dept. of Comp., ICSTM, 1994/5
[fen96] Fenton NE and Pfleeger SL, Software Metrics - A Rigorous and Practical Approach, 2nd ed., PWS Publ. Co., London, 1997, 638 pps.
[for61] Forrester JW, Industrial Dynamics, Productivity Press, Cambridge, MA, 1961
[for70] Forrester JW, Understanding the Counter Intuitive Behaviour of Social Systems in Systems Behaviour, ed. by Open Systems Group, 3rd. Ed., pp. 270-287, Paul Chapman Publishing Co. and The Open University. London, 1972
[ger93] Gershenfeld NA and Weigend AS, The Future of Time Series: Learning and Understanding, in Time Series Prediction: Forecasting the Future and Understanding the Past, Gershenfeld NA and Weigend AS (eds.), SFI Studies in the Sciences of Complexity, Proc. Vol. XV, Addison-Wesley, 1993, pp. 1-70
[gil88] Gilb T, Principles of Software Engineering Management, Addison Wesley, 1988
[göd31] Gödel K, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I, Monatshefte für Mathematik und Physik 38, 1931, pp. 173-198. English translation, On Formally Undecidable Propositions, Gödel K, Basic Books, New York, 1962
[hum95] Humphrey WS, A Discipline for Software Engineering, SEI Series in Software Engineering, Addison-Wesley, Reading, MA, 1995, 789 pps.
[ieee94] Measurement Based Process Improvement, sp. iss. IEEE Softw., IEEE Comp. Soc. v. 11, n. 4, July 1994
[kem93] Kemerer CF, Reliability of Function Point Measurement: A Field Experiment, CACM v. 3, n. 2, Feb. 1993, pp. 85 - 97
[kit82] Kitchenham B, System Evolution Dynamics of VME/B, ICL Tech. J., May 1982, pp. 42 - 57
[kit96] id., Software Metrics: Measurement for Software Process Improvement, NCC Blackwell, 1996, 241 pps.
[law82] Lawrence MJ, An Examination of Evolution Dynamics, Proc. 6th. Int. Conf. On Softw. Eng., Tokyo, Japan, 13 - 16 Sept. 1982, IEEE Comp. Soc. Ord. N. 422, IEEE Cat n. 81CH1795-4, pp. 188-196.
[leh69]* Lehman MM, The Programming Process, IBM Res. Rep. RC 2722, IBM Res. Centre, Yorktown Heights, NY 10594, Sept. 1969
[leh74]* id., Programs, Cities, Students, Limits to Growth?, Inaugural Lecture, May 1974, Publ. in Imp. Col of Sc. Tech. Inaug. Lect. Ser., vol 9, 1970, 1974, pp. 211 - 229. Also in Programming Methodology, (Gries D, ed.), Springer Verlag, 1978, pp. 42 - 62
[leh77] Lehman MM and Patterson J, Preliminary CCSS System Analysis Using Techniques of Evolution Dynamics, Working Papers, First Software Life Cycle Management Workshop, Airlie VA, 1977, publ. by ISRAD/AIRMICS, Comp. Sys. Com., US Army, Fort Belvoir, VA, Dec. 1997, pp. 324 - 332
[leh78]* id., Laws of Program Evolution - Rules and Tools for Programming Management, Proc. Infotech State of the Art Conf., Why Software Projects Fail, - Apr. 1978, pp. 11/1 11/25
[leh80a]* id., On Understanding Laws, Evolution and Conservation in the Large Program Life Cycle, J. of Sys. and Software, v. 1, n. 3, 1980, pp. 213 - 221
[leh80b]* id., Programs, Life Cycles and Laws of Software Evolution, Proc. IEEE Spec. Iss. on Softw. Eng., v. 68, n. 9, Sept. 1980, pp. 1060 - 1076
[leh85]* Lehman MM and Belady LA, Program Evolution - Processes of Software Change, Academic Press, London, 1985, 538 pps.
[leh89] Lehman MM, Uncertainty in Computer Application and its Control Through the Engineering of Software, J. of Software Maintenance: Research and Practice, v. 1, n. 1, Sept. 1989, pp. 3 - 27
[leh94] id., Feedback in the Software Evolution Process, Keynote Addr., CSR Eleventh Annual Workshop on Softw. Evolution: Models and Metrics. Dublin, 7-9th Sept. 1994, in Information and Softw. Tech., sp. Iss. on Softw. Maintenance, v. 38, n. 11, 1996, Elsevier, 1996, pp. 681 - 686
[leh96a] id., Process Improvement - The Way Forward, Invited Keynote Address, Proc. Brazilian Softw. Eng. Conf., SBES'96, Universidade Federal de Sao Carlos, Brazil, 1996, pp. 23 - 35
[leh96b] Lehman MM, Perry DE and Turski WM, Why is it so Hard to Find Feedback Control in Software Processes?, Inv. Pres., Proc. 19th Australasian Comp. Sc. Conf., Melbourne, Austr., Jan. 31 - Feb. 2, 1996
[leh96c] Lehman MM and Stenning V, FEAST/1: Case for Support, ICSTM - DoC EPSRC Proposal, March 1996
[leh96d] Lehman MM, Laws of Software Evolution Revisited, Position Paper, EWSPT96, Oct. 1996, LNCS 1149, Springer Verlag, 1997, pp. 108 - 124
[leh97] id., Process Modelling - Where Next?, ICSE 9 Most Influential Paper Award, Proc. ICSE 19, Boston, MA, 20 - 22 May 1997, pp. 549 - 552
[mad96] Madachy RJ, System Dynamics Modelling of an Inspection Process, 18th Int. Conf. On Softw. Eng., Berlin, 25-29 March 1996, IEEE Comp. Soc. Ord. N. PR07246, pp. 376-386
[mat92] MATLAB High-performance Numeric Computation and Visualisation Software - Reference Guide, The MathWorks, Inc., Natick, MA, 1992
[mca95] McCabe FG and Clark KL, Programming in April: An Agent Process Interaction Language, in Intelligent Agents, Springer Verlag, 1995.
[mcg93] McGowan CL and Bohner SA, Model Based Process Assessments, Proc. 15th Int. Conf. on Softw. Eng., Baltimore, MD, 17 - 21 May 1993, IEEE Comp. Soc. ord. n. 3700-02, pp. 202-211
[mea72] Meadows DH et al, Limits to Growth, Signet, 1972
[tur96] Turski W, Reference Model for Smooth Growth of Software Systems, IEEE Trans. on Softw. Eng., vol. 22, n. 8, 1996
[vot95] Votta LG and Zajac ML, A Design Process Improvement Case Study Using Process Waiver Data, Proc. ESEC '95, Sitges, Barcelona, Spain, 25 - 28 Sept. 1995
[wae94] Waeselynck H and Pfahl D, System Dynamics Applied to the Modelling of Software Projects, in Software - Concepts and Tools, v. 15, Springer-Verlag, Berlin, 1994, pp. 162 - 174
[wil51] Wilkes M V, Wheeler D J and Gill S, The Preparation of Programs for an Electronic Digital Computer, Addison Wesley Press Inc., 1951, 167 pps.

Appendix

This appendix indicates how the values of the mean absolute error of fit
(mae), as in section 4.5 figures 8 and 9, have been computed. It also
records the equations used to compute the least squares linear (LSL) and
inverse square (IS) models.

As explained in section 4.5, for each of the models LSL and IS, a set of
$\mathrm{mae}_j$ values $\{j = 2, \dots, 21\}$ was computed to determine
the effect on the error of fit of the number of points used in the
estimation. The average error over the entire data set for each such
number of points was then taken as a measure of the goodness of fit for
that model. Each set of $\mathrm{mae}_j$ was calculated from the
expression:

$\mathrm{mae}_j = \frac{1}{n} \sum_{k=1}^{n} |s_k - \hat{s}_{k,j}|$   (A.1)

where throughout the appendix:
$n$ (= 21 for the FW set) is the number of data points used in calculating
mae;
$j$ $\{j = 2, \dots, 21\}$ represents the number of data points being used
to estimate the LSL and IS models, respectively;
$s_k$ is the actual system size for the release with sequence number
rsn $k$ (table 2);
$\hat{s}_{k,j}$ represents the fitted system size for rsn $k$, with
sub-index $j$ indicating that the corresponding model (either LSL or IS)
has been computed using the first $j$ points of the data set only.

Similarly, $\hat{s}_k$ is used below to represent the fitted system size
based on LSL whose parameters have not, or IS whose parameter has not,
necessarily been adjusted to minimise the error of fit in some sense.
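Expression A.1 is a plain mean absolute error between the actual and fitted sizes over all releases. A minimal Python sketch follows; the function name `mean_absolute_error` and the three-point sample are hypothetical and are not taken from table 2.

```python
def mean_absolute_error(actual, fitted):
    """Expression A.1: (1/n) * sum over k of |s_k - fitted_k|.

    `actual` holds s_1..s_n; `fitted` holds the fitted sizes for the
    same releases, from whichever model (LSL or IS) is being assessed.
    """
    if not actual or len(actual) != len(fitted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(s - f) for s, f in zip(actual, fitted)) / len(actual)


# Hypothetical three-release example: errors are 2, 2 and 0 modules.
example_mae = mean_absolute_error([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
print(example_mae)
```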
Computation of $\hat{s}_{k,j}$

Least Squares Linear Model. In this case $\hat{s}_{k,j}$ is expressed as
follows:

$\hat{s}_{k,j} = a_j k + b_j \quad \{k = 1, \dots, n\}$   (A.2)

The parameters $a_j$ and $b_j$ are computed for each value of $j$ using a
least squares linear regression, as provided by most statistical packages
and spreadsheets, to minimise $\sum_{k=1}^{j} (s_k - \hat{s}_k)^2$.

Inverse Square Model. For this model, each value of $\hat{s}_{k,j}$ is
computed recursively from $s_1$ [tur96]:

$\hat{s}_{1,j} = s_1$   (A.3a)
$\hat{s}_{k,j} = \hat{s}_{k-1,j} + E_j / (\hat{s}_{k-1,j})^2 \quad \{k = 2, \dots, n\}$   (A.3b)

where

$E_j = \frac{1}{j-1} \sum_{i=2}^{j} E_i \quad \{j = 2, \dots, n\}$   (A.4)

$E_i$ in expression A.4 will have been computed either from expression 2
or 3 (section 4.4). $E_j$ may also be computed using the least squares
criterion and is then the value of $E$ which minimises the error of fit,
$d_j$, over the first $j$ points, expressed as:

$\min_E (d_j) = \min_E \left( \sum_{k=1}^{j} (s_k - \hat{s}_k)^2 \right)$   (A.5)

where $\min_E(\cdot)$ indicates the minimum value of $(\cdot)$ over the
entire range of the parameter $E$. Expression 1 (section 4.4) shows how
$\hat{s}_k$ may be computed.

For the FW data choosing one or other of these approaches has only minimal
effect on the results. The choice has no significant impact on the
interpretation of these results or on the conclusions reached.

Figures 7, 8 and 9 (IS plot) in section 4.5 are based on expression A.5.
This has been implemented under MATLAB [mat92] and is available on the Web
at http://www-dse.doc.ic.ac.uk/~mml/feast1/.
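The procedure of this appendix (fit each model to the first j points, then measure the error over all n points) can be sketched in Python as below. This is an illustrative sketch, not the authors' MATLAB code. Several things are assumptions: all function names are hypothetical; the least squares criterion A.5 is minimised with a golden-section search, which presumes the squared error has a single minimum in E on the bracket [0, e_max]; the bracket `e_max` must be supplied by the caller; and the data series is synthetic, generated from recursion A.3 itself, because the FW sizes of table 2 are not reproduced here.

```python
def is_trend(s1, e, n):
    """Inverse square recursion (A.3a)/(A.3b) for a given parameter value e."""
    fitted = [float(s1)]
    for _ in range(n - 1):
        fitted.append(fitted[-1] + e / fitted[-1] ** 2)
    return fitted


def fit_lsl(sizes, j):
    """Least squares line through the first j points (A.2); returns (a_j, b_j)."""
    ks = range(1, j + 1)
    mean_k = sum(ks) / j
    mean_s = sum(sizes[:j]) / j
    sxx = sum((k - mean_k) ** 2 for k in ks)
    sxy = sum((k - mean_k) * (s - mean_s) for k, s in zip(ks, sizes[:j]))
    a = sxy / sxx
    return a, mean_s - a * mean_k


def fit_is(sizes, j, e_max, iterations=80):
    """Estimate E_j by criterion A.5 over the first j points.

    Assumption: golden-section search on [0, e_max], valid only if the
    squared error is unimodal in E on that bracket.
    """
    def squared_error(e):
        fitted = is_trend(sizes[0], e, j)
        return sum((s - f) ** 2 for s, f in zip(sizes[:j], fitted))

    ratio = (5 ** 0.5 - 1) / 2
    lo, hi = 0.0, float(e_max)
    for _ in range(iterations):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if squared_error(c) < squared_error(d):
            hi = d      # minimum lies in [lo, d]
        else:
            lo = c      # minimum lies in [c, hi]
    return (lo + hi) / 2


def mae(actual, fitted):
    """Expression A.1."""
    return sum(abs(s - f) for s, f in zip(actual, fitted)) / len(actual)


def mae_by_points(sizes, e_max):
    """Return {j: (mae of LSL, mae of IS)} for j = 2..n, errors over all n points."""
    n = len(sizes)
    result = {}
    for j in range(2, n + 1):
        a, b = fit_lsl(sizes, j)
        lsl_fit = [a * k + b for k in range(1, n + 1)]
        is_fit = is_trend(sizes[0], fit_is(sizes, j, e_max), n)
        result[j] = (mae(sizes, lsl_fit), mae(sizes, is_fit))
    return result


# Synthetic, noise-free series generated by the IS recursion (NOT FW data).
synthetic = is_trend(100.0, 1.0e6, 10)
errors = mae_by_points(synthetic, e_max=1.0e7)
for j, (lsl_err, is_err) in errors.items():
    print(f"j={j:2d}  LSL mae={lsl_err:8.2f}  IS mae={is_err:.6f}")
```

On real data the ripple keeps the IS error well above zero; the synthetic series is used here only so that the sketch can be checked against a known parameter value.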

mml568[papers]
18/8/97
