
Software Development as an Engineering Problem

1982, Wirtschaftsinformatik / Angewandte Informatik - WI


Michael A. Jackson, London

Key-words: Design, development methods, implementation, software engineering, standardisation.

Abstract: It is hoped that software development can become a branch of engineering, but there are important differences. Software is intangible, complex, and capable of being transformed by a computer. Much effort has been devoted to overcoming the difficulties due to intangibility and complexity, but too little has been devoted to exploiting the third characteristic. A process-oriented view of software may lead to substantial improvements in this and other respects.

Stichworte: Entwurf, Entwicklungsmethoden, Implementation, Software Engineering, Standardisierung.

Zusammenfassung: Es wird allgemein erwartet, dass sich die Softwareentwicklung zu einer ingenieurmässigen Disziplin wandelt. Aber hier gibt es noch einige wichtige Probleme: Software ist immateriell und komplex, und sie kann mit Hilfe eines Computers transformiert werden. Es sind grosse Anstrengungen gemacht worden, die Schwierigkeiten, die sich daraus ergeben, dass Software immateriell und komplex ist, zu überwinden, jedoch viel zu wenig, um auch die dritte Eigenschaft auszuschöpfen. Eine prozessorientierte Betrachtung von Software kann in vielerlei Hinsicht zu wesentlichen Verbesserungen führen.

The phrase ‘software engineering’ was chosen as the title of the NATO conference which took place in Garmisch in October 1968 [1]. The phrase was ‘... deliberately chosen as being provocative, in implying the need for software manufacture to be based on the types of theoretical foundations and practical disciplines that are traditional in the established branches of engineering’. Fourteen years after that conference, it is still hard to resist that implication. We still need a wide range of practical disciplines, and practical disciplines need theoretical foundations.
We still look wistfully at the established branches of engineering, hoping to model our own activities after the pattern that has served them so well. But it is not clear that there is any single pattern. There are many branches of engineering, differing greatly one from another. The automobile engineer, designing a new automobile model, is doing something very different from a civil engineer designing a dam; an electronic engineer has little in common with a chemical engineer, an aeronautical engineer with a mining engineer. We should not necessarily expect that a software engineer will be like all of them, when they are so unlike each other. There is no one discipline of physical engineering; instead, there are many disciplines, each broadly characterised by the products which its practitioners are competent to design. Should software engineering model itself on one of these disciplines? If so, which one? Or should software engineering be more eclectic, taking what it can from each of the established branches of engineering to form a new synthesis? Perhaps the idea of a single discipline of software engineering is itself mistaken. Perhaps we need to recognise distinct branches of software engineering, formalising the specialisations that are already apparent: certainly there are specialists in compiler construction, in operating system design, in various aspects of artificial intelligence, in computer graphics, and in several other fields. It may be that no single discipline of software engineering can apply to all of these fields without being so general that it is useless for practical application: theory can be general, but practice must be specific. 
Even so, at the present early stage in the development of software engineering we can reasonably direct our attention to those characteristics which are common to all or most software, drawing the appropriate contrasts or analogies with branches of physical engineering, and hoping to increase our understanding of present practice and future possibilities.

The most obvious characteristic of software is its intangible, immaterial nature, distinguishing it sharply from the products of physical engineering. A software product is no more a reel of tape or a floppy disk than a piece of music is the paper on which its score is written. From this intangible nature many consequences flow, some beneficial and some potentially harmful. Another, related, characteristic is that the software product is itself capable of being reproduced, analysed, and transformed by computer; it seems that we have not yet taken full advantage of this capability, especially the capability of transformation. A third characteristic is a high level of complexity: most software is very complex, and a central purpose of software development methods is to master this complexity, to allow us to build software that is complex but highly reliable.

These characteristics, and some of their consequences and implications, are discussed in the sections that follow. A concluding section draws together the threads of the discussion, and suggests some possible directions for future development.

1 The Intangible Product

The engineer whose product is a physical object works under severe constraints. He must take account of the physical limitations of the materials he uses, choosing appropriate material for each purpose from a relatively small set of available materials.
A load-bearing wall must be built of a material with a high compressive strength; the chains or cables of a suspension bridge must have high tensile strength; a window must be transparent; an automobile tyre must be resilient. Both the chosen material and the shape into which it is formed must be adequate to the purpose. The engineer must also consider how the materials will behave over the lifetime of the product. Some materials corrode; some cannot withstand high or low temperatures without long-term degradation; the junction between two dissimilar metals may suffer electrolytic action; a building material may be vulnerable to wind erosion.

The software engineer, whose product is intangible, is almost entirely free from such constraints. The ultimate constituents of the product are the statements of the programming language, and these constituents are, in effect, infinitely strong and entirely free from degradation over time. The addition operator works just as well on large numbers as on small numbers; it does not become worn out after a million additions; it works equally well whether the adjacent operator is a multiplication or an assignment.

This freedom from constraint is at once an advantage and a difficulty. The advantage is that quite a small set of statements can be used to program any computable function; the difficulty is that there are infinitely many ways of constructing any program, and too few criteria for choosing one way. More constraint is needed.

The need for more constraint has been recognised from the early days of computing. M. V. Wilkes, writing about his experiences [2] in machine-language programming on the EDSAC in 1950, gives a nice illustration: “... the integration was terminated when the integrand became negligible. This condition was detected in the auxiliary subroutine and the temptation to use a global jump back to the main program was resisted; instead an orderly return was organised via the integration subroutine.
At the time I felt somewhat shy about this feature of the program since I felt that I might be accused of undue purism, but now I can look back on it with pride.”

Essentially, the recognition was that a program must be constructed of larger constituent parts than the elementary programming language statements, and that these larger parts must themselves be fitted together in an orderly fashion. An immediate question arises: what should these larger parts be?

The implicit answer for Wilkes in 1950 was that the parts should be subroutines; Wilkes is credited with the invention of the subroutine. This answer was widely accepted for many years, and still has much validity. Fortran, Algol 60, COBOL and PL/I all offer a subroutine construct as the primary part type for program structuring; modular programming and some simpler versions of structured programming rely heavily on it also.

Subroutines, or procedures, can be seen as enlarging the repertoire of machine operations. A machine without a hardware multiplier can be equipped with a multiplication subroutine; a machine without matrix arithmetic can be equipped with a set of subroutines providing the necessary matrix operations; a machine without a ‘calendar date’ data type can be equipped with subroutines to compare, increment and decrement dates.

This is a powerful idea, but much less than a complete solution to the difficulty. With subroutines we can build a sequential process, but there is a need for concurrency also. The essence of concurrency is that during execution of a program two of its parts can both be in non-initial states, although neither is a subroutine of the other. A crude form of concurrency in this sense can be obtained by using ‘own variables’ in procedures: but this is an uncomfortable technique and leads to obscure and difficult programs.
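The ‘own variables’ technique can be illustrated with a minimal modern sketch. It is written in Python, which is not the paper's notation, and the report-writing part and all its names are hypothetical. A closure's `nonlocal` variables stand in for Algol 60 own variables: each part records its progress in explicit state so that it can be resumed by an ordinary call.

```python
def make_report_writer():
    """Build a procedure that remembers its progress in 'own variables'."""
    # Hypothetical part: the first item is a heading, later items are
    # numbered detail lines. Progress is encoded in explicit state.
    expecting_heading = True   # own variable: where we are in the sequence
    line_number = 0            # own variable: detail lines written so far

    def write(item):
        nonlocal expecting_heading, line_number
        if expecting_heading:
            expecting_heading = False
            return f"== {item} =="
        line_number += 1
        return f"{line_number:>3}  {item}"

    return write


# Two parts are both in non-initial states, and neither calls the other.
payroll = make_report_writer()
stock = make_report_writer()
output = [
    payroll("Payroll"),
    stock("Stock"),
    payroll("Smith"),
    stock("Bolts"),
    payroll("Jones"),
]
```

Even in this small case the sequence "heading, then details" is no longer visible in the program text; it has to be reconstructed from the tests on `expecting_heading`, which hints at why the technique scales badly.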
A far better solution is provided by coroutines, which allow a program to be constructed from parts which are themselves sequential processes, communicating by passing control from one to another [3]. The availability of the sequential process as a part type frees the software developer from the unwelcome need to view every problem and every program as a hierarchy. Procedures must be organised into hierarchies, but processes are naturally organised as networks: increasingly, as we tackle more ambitious development tasks, we find problems that cannot be readily fitted into the hierarchical mould.

Communication among sequential processes need not be by direct control flow, as in coroutines. Processes may communicate by shared variables [14], with suitable provision for synchronisation and mutual exclusion; they may communicate by sending and receiving streams of messages [5], either buffered or unbuffered; they may communicate by shared events which require the participation of two or more processes [6]. All of these forms of communication allow a ‘true concurrency’: any process may proceed at any time if it is not held up by communication with another process and if a processor is free.

A different line of development leads to the idea of program parts which are instances of abstract data types [7]. The part is a data object, packaged with the operations which can be performed on that object; the definition of an abstract data type hides information about the representation of the object and the implementation of its operations.

We have here been considering procedures, processes, and abstract data types as general constituent parts from which programs may be constructed; their value, from this point of view, is that they offer the possibility of development in terms of a smaller number of larger parts rather than a larger number of the very small parts which are the statements of the programming language.
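The sequential process as a part type can be sketched with Python generators. This is a modern illustration under stated assumptions, not the paper's notation: generators are asymmetric (semi-)coroutines, and the report-writing process and the `start` helper are hypothetical. The point of the sketch is that each process keeps its place in its own program text between activations, so no explicit state variable is needed to record "heading already written".

```python
def report_writer():
    """A sequential process: one heading, then numbered detail lines."""
    item = yield                     # suspend until the heading arrives
    line = f"== {item} =="
    line_number = 0
    while True:
        item = yield line            # hand back a line, wait for the next item
        line_number += 1
        line = f"{line_number:>3}  {item}"


def start(process):
    """Advance a new generator to its first suspension point."""
    next(process)
    return process


# Two instances are suspended part-way through their text at the same time,
# and neither is a subroutine of the other.
payroll = start(report_writer())
stock = start(report_writer())
output = [
    payroll.send("Payroll"),
    stock.send("Stock"),
    payroll.send("Smith"),
    stock.send("Bolts"),
    payroll.send("Jones"),
]
```

The structure "heading, then a repetition of details" can be read directly from the text of `report_writer`, which is the practical advantage of treating the part as a process rather than as a procedure with remembered state.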
There is an entirely different question which we will consider later: given a set of part types, and a specific problem, which parts of which types are needed to solve the problem, and how should they be connected together?

2 Designing and Building

In the established branches of engineering there is a clear distinction between designing a product and building it. The design work is an intellectual activity, using intellectual tools and techniques on intellectual material. The building work is a physical activity, creating physical products from physical material. This distinction between the intellectual and the physical has important consequences.

One consequence is that the activities are naturally separated, and performed by different people. The engineer who designs a bridge does not build it with his own hands; he passes his design to a separate construction organisation. The automobile engineer who designs a car does not work in the factory that produces it. Because of this separation of people and responsibilities, and because of the need to deal correctly with the characteristics of the physical materials used, the design must be fully detailed. The automobile engineer does not leave the factory manager the freedom to decide the shape of the engine's combustion chamber or the size of the wheels; the civil engineer does not allow the builder to choose the quality of steel or the formulation of the concrete. The interface between design and building is, within the limits of human fallibility, exact.

Another consequence is that costs can be allocated between the two, and that the costs of physical construction are almost always much greater than the costs of design. Building a bridge costs much more than designing it. Where a product is made in large numbers, as are automobiles, aeroplanes, computers and washing machines, the cost of the complete production run is what matters here, not the marginal cost of one additional unit.
Because the physical production cost is so high, the design cost is a relatively small part of the total cost. It is therefore perfectly reasonable to carry out the whole design work more than once before embarking on production. Many automobile designs reach the stage of completed prototypes and are then abandoned without a single production model being built.

A third consequence is that the engineer's work can be examined both in its intellectual and in its physical manifestations. A colleague can look at the design documents critically, checking the design before it is committed to production. Customers and competitors can, and do, look at the finished physical product, discerning its internal structure and design as well as its behaviour in use. Most physical engineering products make their design public property in this sense, and the engineer could not hide the design even if he wished to. The automobile engineer cannot conceal from his customers and competitors the fact that he has decided to place the engine transversely, or that he has used independent rear suspension, or that the engine has six cylinders.

The situation is very different in software engineering. The internal structure of a software product can be largely concealed, especially if the product is delivered in directly executable machine language form. Many of the designer's choices are therefore largely hidden from the world, and, in particular, from his customers and competitors.

The production costs of software are relatively small. Where a software package is produced and sold to many customers, the marginal cost of producing each copy is entirely insignificant. Within the development process itself, we may consider there to be a distinction between designing and building; but however we draw that distinction we will find that the cost of building does not dominate the cost of designing as it does for the established branches of physical engineering.
Nor is it clear that the distinction between designing and building exists at all for software. Both the design and the product itself are intellectual and intangible, and no convincing separation can be made. There is no point at which the development team puts away the drawing board and begins to use the lathe, or to pour concrete.

Some efforts have been made to separate design from programming, sometimes by defining a design language which is distinct from the programming language. For example, the designer might produce a hierarchical diagram showing that the program is to consist of certain procedures connected in a certain way; or a network showing processes and their connections [8]. But such a design is grossly incomplete; it is like a design for a bridge which states only that the bridge is to be a suspension bridge with two piers 50 metres high and a span of 250 metres. The separation that has been made is not a separation of design from programming; it is rather a separation of preliminary design from detailed design.

Other design languages have been proposed which are essentially forms of ‘pseudocode’: the design is itself a program, but a program written in a language for which no compiler is available [9]. If the design is complete, it is appropriate to obtain or create a compiler for the design language, thus automating the building activity. If it is incomplete, then again the separation is merely between preliminary and detailed design. In software, the design is the product.

3 Specifying and Implementing

Abandoning the attempt to distinguish design from building, we will use the term ‘implementing’ to replace both of them. It seems fruitful then to draw a different distinction, between specifying and implementing. A specification states the criteria by which the product will be judged.
For example, an audio amplifier may be specified by stating the required gain, harmonic distortion, noise level, output power, and so on. A bridge may be specified by stating traffic patterns, height above the channel which it crosses, the roadways to be connected, and so on. A procedure to compute the sin function may be specified by stating the range and format of the argument and the precision required in the result. An inventory control system may be specified by stating the required reduction in inventory holding costs and the required service level for a given pattern of customer demands. It is then the task of the engineer to find and implement a satisfactory solution to the stated problem.

But these are all implicit specifications, and are not typical of the specifications found in software engineering, which are most often explicit rather than implicit. C. B. Jones defines [10] the difference: “The explicit [specification] is analogous to a program. ... The essence of an implicit specification is to state the relationship required between arguments and results without having to write an explicit rule for computing the latter from the former.” We could not, for example, give an implicit specification of a payroll program or system; we must specify exactly the rules for computing the gross pay and the tax and other deductions under all the circumstances that can arise. We cannot give an implicit specification for a syntax checker: we must specify all the acceptable syntactic combinations, usually by giving the grammar of the language to be checked.

This need for explicit specifications has caused a difficulty in separating out the specification task from the rest of the development activity, much like the difficulty of separating design from programming.
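The difference between the two kinds of specification can be made concrete with a small sketch. The square-root example, the function names and the tolerances are assumptions chosen for illustration; they do not come from the paper or from Jones [10]. The first function states only the relationship required between argument and result; the second is one explicit rule, among many possible, for computing a result that satisfies it.

```python
def meets_sqrt_spec(x, r, tolerance=1e-9):
    """Implicit specification: the relation required between x and r."""
    return x >= 0 and r >= 0 and abs(r * r - x) <= tolerance * max(1.0, x)


def newton_sqrt(x, tolerance=1e-12):
    """Explicit rule: one particular way of computing a result from x."""
    if x < 0:
        raise ValueError("argument must be non-negative")
    if x == 0:
        return 0.0
    r = max(x, 1.0)                       # starting estimate at or above the root
    while abs(r * r - x) > tolerance * max(1.0, x):
        r = 0.5 * (r + x / r)             # Newton's iteration for r*r = x
    return r


# The implicit specification judges a result without saying how to find it.
checks = [meets_sqrt_spec(x, newton_sqrt(x)) for x in (0.0, 2.0, 9.0)]
```

A payroll rule, by contrast, has no such independent relation to check against: the only way to say what the gross pay should be is to write out the computation itself, which is why its specification is explicit.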
This has been especially evident in data processing, where specifications have often been written which are essentially natural language programs, leaving the customer dissatisfied with a specification he cannot understand, and the designer or programmer dissatisfied with the narrowing of his task to that of mere translation. It has seemed that, just as the design is the product, so the specification is the design.

4 Processes and Procedures

But the specification is not the design, or it should not be. We can certainly distinguish between the exact specification of a set of connected sequential processes and the arrangement of those processes so that they can run efficiently on the machine. This will be a natural way of viewing the development of software systems whose subject matter is sequentially ordered in time. Specification captures the time ordering of the real-world events; implementation rearranges the ordering, within the freedom given by the specification, to fit the machine.

Consider, for example, a payroll system, whose subject matter is the behaviour of the employees to be paid: their working hours, perhaps their production achievements, their holidays and periods of sickness, their promotions, their eventual retirement. It is natural to specify such a system by stating the time-ordering of the events affecting one employee, and the entitlement to pay based on those ordered events. The resulting specification is a sequential process, whose execution models or simulates the behaviour of the employee over the whole period of employment. For an organisation with 10,000 employees, there would be 10,000 instances of this process to be executed. No existing operating system is capable of running these 10,000 processes directly with acceptable efficiency; it becomes the task of the implementer to rearrange these processes into a form allowing efficient execution.
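A minimal sketch may help to fix the idea of specifying one employee as a sequential process and then rearranging it. Everything concrete here is an assumption made for illustration: the three event names, the hourly-rate pay rule, and the dictionary used as the saved state. The first function is the specification as a long-running process (a Python generator); the second is a hand-written procedure form in which the state of the process is kept in a record between events. The paper's argument concerns a mechanised conversion, which this sketch does not implement.

```python
def employee_process(hourly_rate):
    """Specification: the working life of one employee as a process."""
    hours_due = 0
    event, amount = yield None           # wait for the first event
    while event != "retire":
        if event == "work":
            hours_due += amount
            event, amount = yield None
        else:                            # "payday"
            pay, hours_due = hours_due * hourly_rate, 0
            event, amount = yield pay
    yield hours_due * hourly_rate        # final settlement on retirement


def update_employee(segment, event, amount, hourly_rate):
    """Procedure form: one call per event, state held in `segment`."""
    if segment["retired"]:
        raise ValueError("no events are expected after retirement")
    if event == "work":
        segment["hours_due"] += amount
        return None
    pay = segment["hours_due"] * hourly_rate
    segment["hours_due"] = 0
    if event == "retire":
        segment["retired"] = True
    return pay


events = [("work", 8), ("work", 7), ("payday", 0), ("work", 5), ("retire", 0)]

process = employee_process(hourly_rate=10)
next(process)                            # run to the first suspension point
from_process = [process.send(e) for e in events]

segment = {"hours_due": 0, "retired": False}
from_procedure = [update_employee(segment, name, amount, 10)
                  for name, amount in events]
```

Both forms produce the same payments for the same events. The difference is that the procedure form needs no suspended process between events, so its state record can be stored and retrieved for each of the 10,000 employees as required.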
Typically, this form will be a set of 10,000 ‘employee database segments’ together with an updating procedure which is executed on a database segment whenever an event occurs for the associated employee in the real world. We can regard this rearrangement essentially as a scheduling, at implementation time, of the set of processes. The most important aspects of the rearrangement will be:

• conversion of the sequential processes into procedures, so that the behaviour of a process when a relevant event occurs can be treated as an execution of an invoked procedure;
• choosing a representation of the process activation records in terms of a suitable database system;
• choosing a scheduling of the processes so that response to each event is fast enough and the whole system runs efficiently.

Conversion of the processes into procedures can be mechanised; choice of the database representation will depend on the particular database system to be used; the chosen scheduling of the processes must be expressed in an explicit scheduling algorithm to be devised by the implementer.

5 Development Methods

In software engineering we pay much attention to the subject of development methods, certainly much more than is paid in the established branches of engineering. Indeed, it sometimes seems as if there is little in software engineering other than development methods. Where other engineers seem to talk about the products of their activities, and the characteristics of those products, software engineers talk about their activities directly, and about the characteristics of those activities. In part, this is due to the intangible nature of the software product and to the lack of a physical manifestation that can be readily examined and critically evaluated. In part, it is due to the comparative youthfulness of the field, and to the lack of self-confidence in its practitioners: social scientists too spend a lot of time discussing their methods.
In part, it is due to the combination in software of great complexity with a stringent requirement for correctness.

Ideally, we would like to have a fully algorithmic method of software development, in which each step is predetermined and each decision can be reliably deduced from what has gone before. This can be achieved for certain problems that are very well understood. The designer of a small transformer, given a statement of the power and of the input and output voltages, can deduce the number and form of the necessary iron laminations for the core and the gauge and number of turns in the primary and secondary windings. The designer of a syntax analysis phase of a compiler can construct the parsing program directly from the specification of the grammar. The designer of a conventional batch program to update a serial master file from a serial transaction file can deduce the algorithm for matching the file records from a specification of the files themselves. But these are very simple problems. Their solution can be, and has been, automated, and it then ceases to form a significant part of the development activity. Our interest centres on those development decisions which are not apparently amenable to automation, both decisions in the specification and decisions in the implementation activities.

We may regard a development method as a structuring of the development decisions. Here we mean ‘decision’ in a wide sense, covering the explicit consideration and recording of any relevant fact or choice.
It would, in this sense, be a decision in the specification stage of a program to generate prime numbers that the product of two primes cannot itself be a prime; it would be a decision in developing a data processing system for insurance that each policy must be renewed annually; it would equally be a decision that there should be a subsystem for claims and another for premiums, or that a particular subroutine should have certain parameters and should call certain common subroutines, or that the policy master records should be held in an index sequential file.

Different decisions will have different characteristics. Some can be easily taken and carry little risk of error, especially where they record facts which are known a priori. Some will be error-prone, but have very limited impact on later decisions. Some, such as a decision to decompose a system into two subsystems, will have a very wide impact.

Obviously, there are some general methodological principles which should govern the arrangement of development decisions into a development method. Decisions which record facts known a priori should be taken early, before decisions which record choices within the developer's discretion. High risk decisions should be postponed as long as possible, because it will be easier to make them correctly in a later than in an earlier stage of development. Decisions which have a wide impact should not, ideally, be highly error-prone. Error-prone decisions should be followed as soon as possible by consideration of anything which may invalidate them. Decisions about implementation should be taken after decisions about specification.

From this point of view, it seems clear that ‘top-down’ and ‘stepwise refinement’ methods are to be studiously avoided. Using such a method, the developer begins by deciding the top-level structure of the system, decomposing it into its largest constituent parts.
Undoubtedly, this is the worst possible decision to place at the beginning of development. It is a decision about the system itself, which, ex hypothesi, is not yet well-known: it is therefore highly error-prone. It has the widest impact of all those decisions within the developer's discretion, because it sets the context for all later structural decisions: if the top-level structure is wrong, it will not be easy to salvage much of the work done on lower levels. And, finally, if this first decision is wrong, it may not be invalidated until late in the development.

We may suspect that developers who claim to be using top-down or stepwise refinement methods are, in fact, doing something quite different. The real work of development is done, informally and invisibly, in the developer's head, where something approximating to an outline of the complete system is visualised. This outline is then documented in a top-down fashion, and enough details filled in to complete the work. A brilliant, or even highly competent, developer may be able to work quite effectively in this manner, but its limitations are obvious [11].

6 Maintenance and Complexity

Many kinds of software system, perhaps most kinds, must be readily adaptable to changes in their specifications. We call this adaptation ‘maintenance’, a usage which is unique to software engineering. In other branches of engineering, the word ‘maintenance’ usually means the activity of guarding against or repairing the physical degradation which afflicts the product. Bridges must be painted to avoid corrosion; resistors must be examined and replaced if their value has strayed outside the specified tolerance; the oil in the engine sump must be drained and replaced; the potholes in the road must be filled in.
Analogous activity is needed in software systems of some kinds: a database system may need periodic examination to detect and repair errors in chaining between records; an index sequential file may need reloading when too many insertions have reduced the access speed or filled the overflow area; a partitioned data set may need to be reorganised to make dead space available for new members. But we do not usually mean such things by ‘maintenance’: we mean the activity of changing the system to satisfy changed specifications, much as the design of an aeroplane may be changed to give a ‘stretched’ version, or a bridge may be widened to carry a greater load of traffic.

It is well known that maintenance in this sense accounts for a large part of the expenditure on data processing systems. This need not be a cause for dismay or concern: we might congratulate ourselves because our products are so adaptable that our customers' needs can be satisfied at the lower cost of maintenance rather than the higher cost of complete redevelopment. But few software engineers, and even fewer customers, would join in the congratulations. The cost of maintenance is generally considered to be too high, in the sense that comparatively small changes in specification often require very difficult and expensive changes to the system. We might also consider the cost of the maintenance that is not carried out because it is too difficult or expensive for the customer to accept it at the offered price. We rarely get the opportunity to measure the cost of non-maintenance.

The source of excessive maintenance cost is disparity between the structure of the specification and the structure of the system. The change to the specification may be small, simple, and local; if the consequent change to the system is large, complex, and diffuse, that must mean that the structure of the system is different from the structure of the specification.
Often, the system will also be excessively complex in itself, making any change difficult and even dangerous.

Our chief tool for mastering and avoiding complexity is the ability to view a system as a relatively small number of relatively large parts, connected in a clear and simple way. If the cost of maintenance is to be low, then the parts and connections in the system must correspond closely to the parts and connections in the specification. At the same time, the parts and connections in the specification must correspond closely to the parts and connections in the customer's view of the problem domain.

A fundamental difficulty in achieving this goal is the inevitable difference between the structure of a good specification and the structure of an efficient implementation. To return to the earlier example of the payroll system, the specification should be structured so that one of its distinct parts is a statement of what happens to one employee during the whole period of his employment; but an efficient implementation may need to contain a distinct part which is a weekly batch program dealing with the events that have affected all current employees during the past week. We cannot avoid the problem of mediating between these two different structures; the mechanised conversion from process to procedure form is a significant component in a solution to this problem. To the greatest possible extent, we should aim to take advantage of this tractability of the software medium, of the ability to transform one piece of software into another of related but different structure [12].

7 Standard Products and Parts

Most engineering products are highly standardised, and their designers work within narrowly defined bounds. Each product falls into a well-known type, and has a readily recognisable structure; the engineer is not expected to produce a revolutionary design, but rather a carefully crafted set of choices within the accepted parameters.
The introduction of a significantly different structure for a product is a revolution, and is regarded as potentially dangerous. After countless suspension and cantilever bridges had been designed and built, it was a daring engineer who conceived the box girder bridge. After sixty years of reciprocating internal combustion engines, it was a daring innovation to produce a car powered by a rotary engine.

There are many reasons for this high degree of standardisation. One important reason is the visibility of the physical product, which we have referred to previously. The automobile manufacturer who first abandoned the separate chassis and body and adopted the integral design could hardly hope to keep the advance a secret from his competitors.

Another reason is the widespread use which most physical engineering products receive. Huge numbers of people use automobiles of a particular design, or cross a particular bridge on their way to work, or ride in elevators of a particular design. And they use other automobiles and bridges and elevators, too. So there is a strong tendency towards uniformity in customers' expectations of a particular class of products. This tendency is much less marked in software, where many products are created for customers who have no experience of other similar products to guide their evaluation. Where usage of a class of product is widespread, as of Fortran compilers, there is a stronger tendency toward uniformity of expectation and hence of product.

Yet another reason for standardisation of products is the use of standard parts. The manufacturer of a physical product is usually forced to buy many or even all of the parts he uses; these parts are bought from component companies who supply identical parts to the other manufacturers. In software, the developer always has the option, even if it is not necessarily the best option, of building everything in-house. At the 1968 NATO conference, M. D.
McIlroy, advocating the creation of a software components industry, complained [13]: “When we undertake to write a compiler, we begin by saying ‘what table mechanism shall we build?’ Not ‘what mechanism shall we use?’, but ‘what mechanism shall we build?’ I claim we have done enough of this to start taking such things off the shelf.”

A dramatic illustration of the way standardisation comes about is provided by the microcomputer industry. Where discussion about standard machine architecture had limped along for ten years in the mainframe and minicomputer industry, the microcomputer industry has standardised itself immediately, simply because very cheap standard CPU chips are available which the manufacturer cannot produce in-house. This has scarcely happened in the software industry. There are a few examples of standard components or packages, such as libraries of mathematical routines, that were already visible in 1968, but remarkably little else.

We may perhaps attribute the lack of standard components to both economic and technical factors. If a putative standard component is small, the cost of building it in-house may be no greater than the cost of identifying and obtaining the required item from the supplier. If it is larger, its specification is likely to be larger, and there is correspondingly less likelihood of finding what is needed in a supplier's catalogue. The technical factors are associated with the interface specification. Traditionally, the general-purpose software components contributed to an installation's program library by hopeful authors have been cast in the form of procedures. The procedure interface is very satisfactory for mathematical functions, but not for much else. The interface specified for the library routine always looks to the potential user to be arbitrary, difficult to understand, and impossible to reconcile with his existing design; so he writes his own version to his own interface specifications.
We may note that one environment where general-purpose software does seem to be widely used is the UNIX environment. Not surprisingly, a typical UNIX component is a sequential process communicating by message streams (pipes) with other processes; the message stream interface, for a wide class of application, is both convenient and well-suited to the definition of standard software components.

8 Some Conclusions and Suggestions

The use of the term software engineering expresses an aspiration, not a fact. The established branches of engineering are old; we are young. Their work is organised into well-defined specialisations; ours is still largely ill-bounded and undifferentiated. Their products are standardised wherever possible; ours are most often built ad hoc. They examine and evaluate their products; we spend more time contemplating our navels. They have components industries; we do not. Their disastrous mistakes make headlines in the newspapers; ours are too commonplace to merit remark.

Many of these differences, and many of our difficulties, flow from the intangible nature of the software product. It is hard for us to draw boundaries between different development stages and activities; it is hard to constrain the engineer's choices enough to make his work manageable without preventing him from building an efficient product. But this intangible nature of the product is our greatest advantage, and we have derived too little benefit from it. We can manipulate, rearrange, reconfigure, and transform a software product at a very low cost, by using the computer itself; we have not done so as we should.

The crux of the matter is the relationship among three structures: the structure of the problem in the problem domain; the structure of the written specification; and the structure of the implementation.
For a large class of problems, including many in data processing, process control, message switching, and embedded applications, the problem domain is sequentially ordered in time: the natural specification structure to capture the essence of the problem in its context is that of a set of communicating sequential processes. Where the number of these processes, their elapsed execution times in the real world, and the densities of their activities are well-fitted to the available machine, operating system, and programming language, the development of the system can proceed reasonably smoothly. The ideal development here is one in which the specification processes are programmed directly in a process-orientated programming language, and the resulting program can be executed directly on the machine.

But often, especially in data processing applications, the configuration and dimensions of the specification process set are ill-fitted, even to the point of incompatibility, to the programming language and to the machine and its operating system. Present operating systems are conceived for the execution of procedures (whose execution is, conceptually, instantaneous), or of small sets of processes with short execution times and dense demands for machine cycles. A specification with 10,000 processes, each taking 50 years to execute, and demanding only 100 seconds of machine time over the 50 years, simply does not fit the machine.

Traditionally, the incompatibility has been resolved by choosing between Scylla and Charybdis. Either the specification is cast in a problem-orientated form which the customer can hope to understand, whereupon the structure of the design will bear no visible relation to the structure of the specification; or the specification is cast in design-orientated form, whereupon the structure of the specification will bear no visible relation to the structure of the problem in its domain.
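The mismatch between such a process set and the machine can be made concrete with a little arithmetic. The following Python sketch uses only the three figures quoted above; the function name, the average year length, and the simplifying assumption that all 10,000 processes span the same 50 years are ours, introduced for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumption: average year length


def process_set_load(processes: int, lifetime_years: float,
                     cpu_seconds_each: float) -> dict:
    """Summarise the machine demand of a set of long-lived processes.

    Assumes (for illustration only) that all processes run over the
    same elapsed period on a single machine.
    """
    total_cpu_seconds = processes * cpu_seconds_each
    lifetime_seconds = lifetime_years * SECONDS_PER_YEAR
    return {
        "total_cpu_seconds": total_cpu_seconds,
        "total_cpu_days": total_cpu_seconds / SECONDS_PER_DAY,
        # Fraction of its own lifetime that one process is actually running.
        "per_process_duty": cpu_seconds_each / lifetime_seconds,
        # Fraction of the elapsed period that the machine is busy at all.
        "machine_utilisation": total_cpu_seconds / lifetime_seconds,
    }


if __name__ == "__main__":
    load = process_set_load(processes=10_000, lifetime_years=50,
                            cpu_seconds_each=100)
    for key, value in load.items():
        print(f"{key}: {value:.3g}")
```

Under these assumptions the whole process set needs 1,000,000 seconds of machine time, about 11.6 days, spread over 50 years, and each process is active for well under one ten-millionth of its lifetime. The total work is small; it is the shape of the process set that does not fit a system designed for few, short, dense processes.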
We should cast the specification in the problem-orientated form (of course!), and we should derive the design structure (that is, the implementation structure) by systematic transformation. The explicit, process-structured specification is itself a fully detailed executable text. But it needs to be rearranged and transformed before it can be executed efficiently. The computer should be the essential tool in this activity. The transformations should be carried out mechanically, so that the correctness of the specification is preserved; but they should be chosen by the engineer, interacting with the transformation system. Some transformations of the appropriate kinds have been studied and described; some have already been mechanised [14, 15]. But there is a long way to go before use of such transformations becomes widespread and well-supported.

One benefit that might accrue from this approach to software development is a separation of languages. Our present languages are disgracefully complicated, as they must inevitably be, serving simultaneously the incompatible purposes of specification and implementation. In place of a programming language, we might have a pair of related languages: a specification language, ruthlessly purged of implementation features; and a language for directing the transformation of specifications into efficient implementations. Each language of the pair could be much simpler than today's programming languages.

Along with this separation of languages would come a separation of development activity along more rational and intelligible lines. It would be easier to determine a boundary between the specification engineer and the implementation engineer, and fruitful specialisation might develop both within an organisation and between one organisation and another. Perhaps specialisation is both the precondition and the hallmark of a mature discipline of engineering.
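The relationship between a process-structured specification and a batch implementation, as in the payroll example discussed earlier, can be illustrated with a minimal Python sketch. It is not the notation or the transformation system of the works cited; Python's generator mechanism merely stands in for the mechanised conversion from process to procedure form. The event format, the pay rule, and all names (`employee_process`, `WeeklyBatch`) are hypothetical. In a real implementation the suspended state of each process would be saved as a record between runs rather than held in memory.

```python
from typing import Dict, Generator, List, Tuple

Event = Tuple[str, float]  # (kind, amount); hypothetical event format


def employee_process() -> Generator[float, List[Event], float]:
    """Specification form: one employee's whole period of employment.

    The process suspends at `yield` between weeks; it is resumed with
    that week's events and hands back the week's pay.
    """
    rate = 0.0
    pay = 0.0
    while True:
        events = yield pay  # suspended here until the next weekly run
        hours = 0.0
        leaving = False
        for kind, amount in events:
            if kind == "set_rate":
                rate = amount
            elif kind == "hours":
                hours += amount
            elif kind == "leave":
                leaving = True
        pay = rate * hours
        if leaving:
            return pay  # final pay; the process terminates


class WeeklyBatch:
    """Implementation form: one weekly run over all current employees."""

    def __init__(self) -> None:
        self._processes: Dict[str, Generator[float, List[Event], float]] = {}

    def hire(self, name: str) -> None:
        process = employee_process()
        next(process)  # advance to the first suspension point
        self._processes[name] = process

    def run_week(self, events: Dict[str, List[Event]]) -> Dict[str, float]:
        payslips: Dict[str, float] = {}
        for name, process in list(self._processes.items()):
            try:
                payslips[name] = process.send(events.get(name, []))
            except StopIteration as finished:
                payslips[name] = finished.value  # pay for the final week
                del self._processes[name]
        return payslips
```

The text of `employee_process` reads as a statement of what happens to one employee over time, while `run_week` has the weekly, all-employees structure of the batch program; the engineer writes the former and the resumption mechanism supplies the connection to the latter.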
References

[1] Software Engineering: Report on a conference sponsored by the NATO Science Committee; ed P. Naur and B. Randell; 1969.
[2] The Preparation of Programs for an Electronic Digital Computer; M. V. Wilkes, D. J. Wheeler and S. Gill; Addison-Wesley, 1951. Quoted in A History of Computing in the Twentieth Century; ed N. Metropolis, J. Howlett and G-C. Rota; Academic Press, 1980.
[3] Hierarchical Program Structures; O-J. Dahl and C. A. R. Hoare; in Structured Programming; O-J. Dahl, E. W. Dijkstra and C. A. R. Hoare; Academic Press, 1972. See also Design of a Separable Transition-diagram Compiler; M. E. Conway; Comm ACM, July 1963.
[4] Cooperating Sequential Processes; E. W. Dijkstra; in Programming Languages; ed F. Genuys; Academic Press, 1968.
[5] Message Passing Between Sequential Processes: the Reply Primitive and the Administrator Concept; W. M. Gentleman; Software Practice and Experience, May 1981.
[6] Communicating Sequential Processes; C. A. R. Hoare; Comm ACM, August 1978.
[7] The Design of Data Type Specifications; J. V. Guttag, E. Horowitz and D. R. Musser; in Current Trends in Programming Methodology, IV; ed R. T. Yeh; Prentice-Hall, 1978.
[8] Structured Design; E. Yourdon and L. L. Constantine; Prentice-Hall, 1979.
[9] Structured Programming; R. C. Linger, H. D. Mills and B. L. Witt; Addison-Wesley, 1979.
[10] Software Development: A Rigorous Approach; C. B. Jones; Prentice-Hall, 1980.
[11] Programming as a Cognitive Activity; T. R. Green; in Human Interaction with Computers; ed H. T. Smith and T. R. Green; Academic Press, 1980.
[12] Information Systems: Modelling, Sequencing and Transformations; M. A. Jackson; Proc 3rd International Conference on Software Engineering, 1978.
[13] op cit [1].
[14] Some Transformations for Developing Recursive Programs; R. M. Burstall and J. Darlington; Proc International Conference on Reliable Software, 1975.
[15] A System for Developing Programs by Transformation; M. S. Feather; PhD Thesis, University of Edinburgh, 1978.

Angewandte Informatik 2/8 pp96-103 0013-5704/82/2 0096-08S 02.00/0 © 1982 Friedr. Vieweg & Sohn Verlagsgesellschaft mb